Whilst we wait for the arrival of the semantic web, it seems to me that search engines are not performing nearly as well as they might. When a web crawler examines a website, it can only form a very rudimentary view of what the important content actually is.
Today’s invention is a user-generated salience measure.
Each time a webpage is loaded into a browser, the visitor would be asked, with some small probability, whether they would be willing to answer questions about the page (in return for entry into, e.g., a prize draw).
If the response were positive, the visitor would be invited to click, in order, on the five most important parts of the page (or the five most annoying ones).
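As a rough browser-side sketch of how the survey might work (the one-in-a-hundred sampling rate, the prompt wording, and the /salience-survey endpoint below are all invented for illustration):

```typescript
// Sketch: prompt a sampled fraction of visitors and record the five
// page elements they click on, in order.
const SAMPLE_RATE = 0.01; // assumption: ask roughly 1 in 100 visitors

function maybeRunSalienceSurvey(): void {
  if (Math.random() >= SAMPLE_RATE) return;
  if (!window.confirm("Enter our prize draw? Click the five most important parts of this page.")) return;

  const picks: string[] = [];
  const onClick = (event: MouseEvent): void => {
    picks.push(cssPath(event.target as HTMLElement));
    if (picks.length === 5) {
      document.removeEventListener("click", onClick, true);
      // Send the ordered picks to a hypothetical collection endpoint.
      void fetch("/salience-survey", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ url: location.pathname, picks }),
      });
    }
  };
  document.addEventListener("click", onClick, true);
}

// Describe an element by a rough CSS path, e.g. "div:nth-child(2) > p:nth-child(3)".
function cssPath(el: HTMLElement): string {
  const parts: string[] = [];
  let node: HTMLElement | null = el;
  while (node && node !== document.body) {
    const siblings = Array.from(node.parentElement?.children ?? []);
    parts.unshift(`${node.tagName.toLowerCase()}:nth-child(${siblings.indexOf(node) + 1})`);
    node = node.parentElement;
  }
  return parts.join(" > ");
}

maybeRunSalienceSurvey();
```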
These data would be accumulated over time, suitably encoded within the page itself. That would make it possible to reconfigure both the nature and the structure of the content automatically.
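For instance, the accumulated scores might be embedded as a small machine-readable block in the page's markup; the application/x-salience type and the field names below are assumptions rather than any existing standard:

```typescript
// One aggregated record per page element users have ranked.
interface SalienceRecord {
  selector: string; // which part of the page (a CSS path)
  weight: number;   // 0..1, higher = clicked earlier and more often
  samples: number;  // how many visitors contributed
}

// Serialise the accumulated scores into a block crawlers can parse.
function embedSalience(records: SalienceRecord[]): string {
  return `<script type="application/x-salience">${JSON.stringify(records)}</script>`;
}

console.log(embedSalience([
  { selector: "h1:nth-child(1)", weight: 0.92, samples: 140 },
  { selector: "div:nth-child(4)", weight: 0.08, samples: 140 },
]));
```

Browsers ignore script elements whose type they don't recognise, so embedding the data this way wouldn't disturb ordinary visitors.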
More importantly, the stored information would be read by crawlers visiting the page and used to help index its content more effectively (by weighting the words in the index according to the significance accorded them by users).
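On the crawler's side, a sketch of reading the embedded block back and boosting term weights accordingly (again assuming the made-up format above):

```typescript
interface SalienceRecord { selector: string; weight: number; samples: number }

// Pull the embedded salience block, if any, out of the fetched HTML.
function extractSalience(html: string): SalienceRecord[] {
  const match = html.match(/<script type="application\/x-salience">(.*?)<\/script>/s);
  return match ? (JSON.parse(match[1]) as SalienceRecord[]) : [];
}

// Toy weighting rule: a term's index score is its raw frequency scaled
// up by the salience of the element it appears in. A word occurring
// 3 times inside an element weighted 0.92 scores 3 * 1.92 = 5.76,
// versus 3 * 1.0 for the same word in unranked text.
function termScore(frequency: number, elementWeight: number): number {
  return frequency * (1 + elementWeight);
}
```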