Google is strip-mining web search
Sunday, April 23rd, 2006
Google has lately been trying to find new ways to make money off its technology. The AdWords API will soon be metered. Some speculate that the Maps API will follow.
Well, Google’s gotta do something, because right now it’s pissing in its own pool with AdWords and AdSense. Someday soon, the results for many high-value keyword searches will consist mostly of heavily search-engine-optimized crap carrying AdSense payloads, and the whole value of Google search will be destroyed.
Google’s PageRank is its strength and its Achilles heel. The original idea of weighting search results toward things that people have linked to with matching keywords was a smart hack, and did a good job of leveraging the informational democracy of the Web. But it also left Google wide open for other people’s smart hacks, like Google-bombing. By using PageRank algorithms to determine relevance, Larry Page and Sergey Brin dodged the bullet of having to actually do the heavy lifting of categorization and entity extraction from content to figure out what it’s about. But they also made it possible to game the system.
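To make the link-weighting idea concrete, here is a minimal sketch of the simplified, textbook form of PageRank (power iteration with a damping factor). It is not Google's production ranking, and it ignores the keyword-matching side entirely. The tiny link graph, the page names, and the "farm" pages are all made up for illustration. The point is only that a page's score comes from who links to it, so a page with no honest inbound links can raise its score by manufacturing some.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score (scores sum to ~1).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: nobody links to "spam".
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "spam": ["a"]}
baseline = pagerank(web)

# Same web plus five throwaway pages whose only job is to link to "spam".
farmed_web = dict(web)
for i in range(5):
    farmed_web[f"farm{i}"] = ["spam"]
farmed = pagerank(farmed_web)

print("spam score without link farm:", round(baseline["spam"], 4))
print("spam score with link farm:   ", round(farmed["spam"], 4))
```

In this toy graph the farm pages have no reputation of their own, yet five of them are enough to lift the spam page's score above its starting point. The algorithm cannot tell a manufactured link from an earned one, which is the opening that Google-bombing exploits.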
So, for example, if I wanted to create a perfect Google sandtrap of a site, I would build a technology that watched the Google AdWords auctions, detected the highest-value keywords, and then automatically created AdSense (and other context-sensitive advertising) enabled pages optimized for those keywords. At the same time, I’d syndicate links to that site with the keyword to other domains so that they could be picked up by Google, raising the new site’s PageRank. Just a little content to make the site legitimate, and a little care about the use of cross-linking, and all of a sudden I’ve created revenue for myself by lowering the value of Google’s searches.
Already, the amount of irrelevant content and spam that general search engines return is being seen as an opportunity for “vertical” search companies, which are seeking to provide high-quality results for very specific topic sets. Call it vertical search, categorized search, or community search: these are all about getting better results by filtering the domains that results can come from.
Google is in a tough spot long-term with click-based advertising. They have to be careful about radically altering the ecosystem they’ve created with their search engine, but they also have to figure out how to filter out the inbound links from automated “spamblogs” and other robo-sites that exist merely to game the PageRank system. Otherwise, their revenue is going to start declining, and will eventually crash because of overgrazing by search-engine marketers.