Archive for April, 2006

Google is strip-mining web search

Sunday, April 23rd, 2006

Google has lately been trying to find new ways to make money off its technology. The AdWords API will soon be metered. Some speculate that the Maps API will follow.

Well, Google’s gotta do something, because right now it’s pissing in its own pool with AdWords and AdSense. Someday soon, the results for many high-value keyword searches will consist mostly of highly search-engine optimized crap carrying AdSense payloads, and the whole value of Google search will be destroyed.

Google’s PageRank is its strength and its Achilles heel. The original idea of weighting search results toward things that people have linked to with matching keywords was a smart hack, and did a good job of leveraging the informational democracy of the Web. But it also left Google wide open for other people’s smart hacks, like Google-bombing. By using PageRank algorithms to determine relevance, Larry Page and Sergey Brin dodged the bullet of having to actually do the heavy lifting of categorization and entity extraction from content to figure out what it’s about. But they also made it possible to game the system.

So, for example, if I wanted to create a perfect Google sandtrap of a site, I would build a technology that watched the Google AdWords auctions, detected the highest-value keywords, and then automatically created pages optimized for each keyword and enabled for AdSense (and other context-sensitive advertising). At the same time, I’d syndicate keyword-bearing links to that site to other domains so that they could be picked up by Google, raising the new site’s PageRank. Just a little content to make the site legitimate, and a little care about the use of cross-linking, and all of a sudden I’ve created revenue for myself by lowering the value of Google’s searches.
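To make the mechanics concrete, here is a minimal sketch in Python of a simplified, textbook-style PageRank (power iteration with a damping factor). It is not Google's production algorithm, and the page names, the link graph, and the five "farm" pages are all made up for illustration. It only shows how a handful of syndicated inbound links can lift a page that nobody else links to.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank via power iteration.

    `links` maps each page to the list of pages it links to.
    This is a teaching sketch, not Google's actual ranking code.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: three ordinary pages plus a new "sandtrap" page
# that nobody links to yet.
honest_web = {
    "news": ["reference", "blog"],
    "blog": ["news", "reference"],
    "reference": ["news"],
    "sandtrap": ["news"],
}

# The same web after five syndicated pages appear, each linking only to the sandtrap.
gamed_web = dict(honest_web)
for i in range(5):
    gamed_web[f"farm{i}"] = ["sandtrap"]

before = pagerank(honest_web)["sandtrap"]
after = pagerank(gamed_web)["sandtrap"]
print(f"sandtrap rank before: {before:.4f}, after: {after:.4f}")
```

Real ranking uses far more signals than this, but the sketch shows why the inbound links from robo-sites are the thing that has to be filtered.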

Already, the amount of irrelevant content and spam that general search engines return is seen as an opportunity for “vertical” search companies, which are seeking to offer search tools that deliver high-quality results for very specific topic sets. Call it vertical search, categorized search, community search–these are all about getting better results for searches based on some sort of inherent filtering of result domains.

Google is in a tough spot long-term with click-based advertising. They have to be careful about radically altering the ecosystem they’ve created with their search engine, but they also have to figure out how to filter out the inbound links from automated “spamblogs” and other robo-sites that exist merely to game the PageRank system. Otherwise, their revenue is going to start declining, and will eventually crash because of overgrazing by search-engine marketers.

XML-RPC turns 8

Saturday, April 15th, 2006

As Ed Cone points out in his very first post on his new official Ziff blog, and as Dave Winer proudly announces on his own blog, “Today is the 8th birthday of XML-RPC.” So, Web Services are now an eight-year-old technology. Sort of.

XML-RPC was the beginning of it all. The question is, has it really come that far? With the rapidly emerging alphabet soup of the WS-I and OASIS Web Services working groups, and all of the proposed specs and working implementations, we’ve certainly got a more complex set of Web services interfaces to call upon. In some cases, that may not be a good thing.

Sure, we have something approaching interoperability for many of the Web services implementations out there now. But most Web services are just as insecure right now (despite the WS-Security spec) as XML-RPC is, and the only fix at the moment is a hardware solution…which could do the same for XML-RPC. The main advantages the WS-Basic standard set has are the UDDI directory service for finding (and managing the lifecycle of) Web services, and the Web Services Description Language for accelerating the development cycle required to connect to Web services.

XML-RPC was kicked out the door by Dave Winer while SOAP hung in a political miasma. It is intended for simple interactions between remote applications. But the structure of XML-RPC calls can become unwieldy at times because of its simplicity. SOAP solves some of those problems, but it introduces other problems. XML-RPC was designed to map well to object-oriented programming models, whereas SOAP… well, it doesn’t.
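For a feel of what “simple but sometimes unwieldy” means, here is a small sketch that uses Python's standard `xmlrpc.client` module to build the XML for a single call without sending it anywhere. The method name `quotes.getQuote` and its arguments are hypothetical, invented only for this example.

```python
import xmlrpc.client

# Hypothetical call: one struct argument containing a string and an array.
params = ({"symbol": "GOOG", "fields": ["price", "volume"]},)

# dumps() marshals the parameters into an XML-RPC <methodCall> document.
request_xml = xmlrpc.client.dumps(params, methodname="quotes.getQuote")
print(request_xml)

# loads() is the inverse: it recovers the parameters and the method name.
decoded_params, method = xmlrpc.client.loads(request_xml)
```

Printing `request_xml` shows every value wrapped in its own `<value>` element, with `<struct>`, `<member>`, and `<array>` nesting for anything richer than a scalar. That is easy to read for a call this small, and it is also where the bulk comes from as the data grows.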

Let My Process Go: Avoiding death by enterprise architecture

Saturday, April 15th, 2006

The other day, I wrote about the fine line between being agile and being a doormat. There is, of course, a flip-side to that equation: there’s a fine line between having a sustainable process and having no pulse whatsoever.

Way back in the early double-aughts, many pundits joyously declared that “Internet time” had killed the old-school waterfall methodology (also known as the “water torture methodology”) of software development, and that the needs of the post-Millennium-Bug enterprise would drive everyone toward a world of “release early and often”. The bad old days of having multi-million dollar data warehouse projects collapse when it was discovered that the requirements had changed six months into the dev cycle were declared over, and all these great new modern development tools were just going to make code fly out the door and deploy itself.

Well, reports of the death of the waterfall methodology have been greatly exaggerated.

SD Times reports that agile software development methodologies are only now finally getting traction. According to a Forrester survey cited in the story, 14 percent of enterprises (and I’d like to see how they defined “enterprises” here) in the US and Europe have adopted agile methodologies, while another 19 percent are getting around to it.

I guess developers in that 19 percent have to clear their current dev queues first before they have time to change methodologies. And as for everyone else…I’m guessing they aren’t in any rush to give up the power that their well-documented process gives them to control the rate of change at their companies–which can probably be described as “glacial.”

That’s because a lot of organizations have re-invented the waterfall methodology as an “enterprise architecture process”. With the goal of creating a manageable, unified IT architecture–and controlling costs–they’ve created an additional structure that application projects have to pass through to be vetted before they can even reach the prototype stage. The results are the same as the old waterfall methodology: requirements gathering and other supporting documents consume most of the up-front time, and regardless of how interactive the process is with the users on the back-end, agility gets crushed under the weight of the process.

The results are predictable–they’re the same as what happened when client-server computing was born. Instead of getting included in the business process, the enterprise architecture process becomes something that business managers try to circumvent through skunkworks projects of their own, or solutions cobbled together from existing assets, or outright outsourcing of the whole thing. And that results in crapware, which defeats the whole point of having an enterprise architecture in the first place.

Maybe that’s why Borland and others see such a huge potential market in life cycle management and change management tools–because there are so many companies struggling to drag themselves out from under their application life cycles. Somehow, people hope that they can automate themselves out of crapware hell by tracking things better. But that is, as my sixth grader would say, “stinkin’ thinkin’.” Measuring crap just tells you how much crap you have.

The key is uncoupling tactical business requirements from strategic architecture. We’ve been through this routine before, with n-tier client-server; there’s nothing new about this architectural approach other than what protocol it’s riding over.

Service-oriented architectures partially fix the problem. By turning new application projects into “mashups” of services, you can focus on architecture where it matters–on the back end–and decentralize the process for building the presentation logic. That leaves you free to be innovative and agile on the side where it matters the most.

Sure, there’s a place for LCM tools: on the services side. With so much variation in the implementation of web services standards, you need LCM tools to keep track of what WS-whatever versioning you’ve got available. You’ve also got to track usage of services and be sure that changes you make won’t break any of those decentralized front-ends (at least not without ample warning).
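As a rough illustration of that last point, here is a sketch in Python of the kind of check such a tool might run before a service change ships: given which operations each front-end uses, list the front-ends that a proposed version would break. The service operations, consumer names, and the function itself are hypothetical, and a real LCM tool would track far more than operation names.

```python
def consumers_broken_by(current_ops, proposed_ops, usage):
    """Map each consumer to the operations it uses that the proposed version drops.

    current_ops / proposed_ops: sets of operation names the service exposes.
    usage: dict of consumer name -> list of operation names it calls.
    """
    removed = set(current_ops) - set(proposed_ops)
    broken = {}
    for consumer, used in usage.items():
        affected = removed & set(used)
        if affected:
            broken[consumer] = sorted(affected)
    return broken


# Hypothetical customer service and two decentralized front-ends that call it.
current_ops = {"getCustomer", "listOrders", "getInvoice"}
proposed_ops = {"getCustomer", "listOrders", "getInvoiceV2"}
usage = {
    "sales-dashboard": ["getCustomer", "listOrders"],
    "billing-mashup": ["getCustomer", "getInvoice"],
}

broken = consumers_broken_by(current_ops, proposed_ops, usage)
for consumer, ops in broken.items():
    print(f"{consumer} would break: still uses {', '.join(ops)}")
```

The check is only as good as the usage data behind it, which is the argument for tracking service usage in the first place.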

Whoops

Thursday, April 13th, 2006

I had a little bit of a database glitch caused by…er…user error. Unfortunately, I’m on the road, so I don’t have my backup handy.