Thursday, October 21, 2004

Keith's unofficial history of finding good information on the internet:

Some friends are in a lively discussion about a knowledge system we're building. How do you help people find the "best" information? One idea is a group of experts who review posted knowledge and sort out the best. Another is to trust the "wisdom of the crowds."

I wrote the following in that discussion. My friend Rob promised to blog about it if I'd post to my blog. Rob, I'm watching!

Concerning the idea of an "expert editorial group", remember the example of internet search engines and directories. We can debate the similarity, but I see this as a good parallel to our dream of an explosion of info in our knowledge system and the need to get people connected to the best stuff.

Keith's unofficial history of finding good information on the internet:

1. One of the first search directories was Yahoo. There were others, but Yahoo is representative of the directory approach. Yahoo's "search" really was a search across their human-approved directory listings.

Yahoo was started by a couple of guys who would look at websites and list them in categories they found best. This is equivalent to an expert editorial committee, finding the "best" by some definition, assigning it to categories, etc.

Soon Yahoo was the most successful thing out there and the two guys began to enlist others. That is, the expert editor committee had to grow. They changed the process so that you listed yourself in a category, but an expert would look at your site and approve the listing before it became public. Eventually, success overtook their ability to keep up and they instituted a priority fee-for-listing service that would get you listed within 48 hours if you paid. Otherwise, they would eventually look at your site, but it might take 2-4 months.

Yahoo's success overwhelmed their ability to manage using an expert editor approach.

2. The next round of search engines were artificial intelligence geniuses. AltaVista was an early one. There were many others with varying AI analysis routines. The idea was to replace the expert editor committee with expert AI systems. They were the best thing going. You could submit your website URL, and in the early days your site could be found in the search engine within a few days. Later it began to take a few weeks, with a pay-for-priority service instituted if you wanted earlier consideration.

This spawned an amazing tug-of-war. AI algorithms are pretty smart, but humans are smarter. Humans would reverse-engineer the AI algorithms of the top search engines (AltaVista, Excite, Northern Light, etc.). They would then configure their web pages to get the top listings, not because they were the best pages, but because their creators understood how to fool the AI-based search engines. Then the search engine guys would change their AI algorithms, and the dance would begin again. Break the algorithm, improve the algorithm, break the improved algorithm, improve the broken improved algorithm, etc.

The masses of users began to lose faith in AI searches because they want the best information listed at the top of the search results, not the pages whose creators best fooled the algorithm.

3. The current round of search engines, particularly Google, uses a more populist approach -- let the people vote on the best pages, with one link equalling one vote. Google developed the PageRank algorithm that, in simplified terms, counts the number of other webpages linking to a certain webpage (the full algorithm also gives more weight to links from pages that rank highly themselves). The pages with the greatest number of hyperlinks TO them are the "best" and are listed at the top of the search results.
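Here is a minimal sketch of the "one link equals one vote" idea in Python. It is only my simplified illustration, not Google's actual PageRank, which also weights each vote by the rank of the page casting it. The function name rank_by_inbound_links and the example site names are made up for this sketch.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them.

    links: iterable of (source_page, target_page) pairs.
    Returns a list of (page, vote_count), highest count first.
    This is a plain link count, not the real PageRank algorithm.
    """
    votes = Counter()
    seen = set()
    for source, target in links:
        # A page cannot vote for itself, and each source votes once per target.
        if source == target or (source, target) in seen:
            continue
        seen.add((source, target))
        votes[target] += 1
    # Sort by vote count (descending), then by name so ties are stable.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link data: (page containing the link, page being linked to).
links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("a.example", "b.example"),
    ("c.example", "c.example"),  # self-link, ignored
    ("a.example", "c.example"),  # duplicate vote, ignored
]

ranking = rank_by_inbound_links(links)
print(ranking)
```

Notice that nothing a page says about itself enters the count. Only links from other pages do, which is why this kind of ranking was harder to spoof than the earlier on-page analysis.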

This is a great example of trusting the "wisdom of crowds", or what some now call "swarm theory."

The Google PageRank algorithm was more difficult to spoof. There was nothing you could do to your own page or site that would improve the ranking. You had to get others to link to your page or site. But then human intelligence won over even this populist algorithm: a year or two ago, people learned to force their sites to rank high in Google even though they didn't have the most links from the general population. Several months ago, Google changed their algorithm to reduce the false high placements. The new dance has begun, this time between Google engineers and human search engine optimizers.

So, it seems to me that expert editor committees work well initially, because human intelligence is better than the best AI at present, but over time we need to implement peer-review, swarm-theory, or wisdom-of-the-crowd approaches, because a committee cannot keep up with growth. Those approaches have proven superior in the search engine wars.

There are two separate issues: editorial comments and best ideas. Editorial comments are what Amazon does, improving knowledge by allowing others to add to it. Best ideas are what Google's algorithm handles, allowing people to identify the best solutions so that we are offering the best ideas "above the fold", at the top of a search or browse listing.

Human intelligence is superior, but capacity is limited. So let's blend a solution that maximizes human intelligence where possible, to both enhance knowledge through comments and find knowledge through the "wisdom of the crowds".


At 5:36 PM, Blogger rob said...

Done and done.

