Google faces hundreds of new rivals

Google and its search engine competitors may be facing hundreds - or thousands - of new rivals.

That's because the search engine project that putative Google competitor Wikia is working on will enter the open source domain.

This will drastically reduce the cost for just about anyone to make a search engine, said Gil Penchina, CEO of Wikia. Instead of paying millions of dollars to index the web and to create the software for a search page, a filter for empty or spam pages, and an algorithm to calculate and rank findings, new search companies will find these items free online - thanks to the open source and free software communities.

"In search, it still costs about US$5m to $10m to build a site," said Penchina. "We want to make it possible for anyone to build a search site for $500. We don't view Google as the competition, we view cost as the competition."

The project, which was started by Wikipedia co-founder Jimmy Wales, consists of four components: indexing the web, developing a search engine application, creating a ranking algorithm, and using people to help filter sites and rank results.

One of the most expensive components of a search engine is the effort needed to index the web. Companies have to buy servers and software to crawl the web, examining every page in order to build a comprehensive list of what is there.
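
For readers curious about what crawling and indexing involve, here is a minimal sketch in Python. It is not Wikia's software: the function names, the `PAGES` dictionary and the three-page "web" are all hypothetical, and an in-memory dictionary stands in for real network fetches. It only illustrates the basic idea of following links from page to page and recording which words appear where.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects outgoing links and visible words from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every <a href="..."> link.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Split visible text into lowercase words.
        self.words.extend(word.lower() for word in data.split())


def crawl_and_index(pages, start):
    """Breadth-first crawl over `pages` (url -> HTML); returns word -> set of urls."""
    index = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # dead link: nothing to index
        parser = LinkAndTextParser()
        parser.feed(html)
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical three-page "web" standing in for real HTTP fetches.
PAGES = {
    "/home": '<a href="/about">about</a> open search',
    "/about": '<a href="/home">home</a> <a href="/missing">gone</a> free search',
    "/orphan": "never linked",
}

index = crawl_and_index(PAGES, "/home")
```

Looking up `index["search"]` returns the pages that mention the word. The page `/orphan` is never indexed because nothing links to it, which hints at why a comprehensive crawl of the real web takes so much hardware and so many repeat visits.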

"Your average search start-up will spend over $1 million buying servers and collecting data. That's bad for a couple of reasons. One is that everyone spends millions of dollars doing what is essentially the same work, which is like writing an encyclopedia all over again. Well, what if all of that data was available over the GNU Free Documentation License, which is the free content licence? So our goal is to make a crawl of the web publicly available," said Penchina.

The cost of indexing the web is one of the main hurdles to starting a search engine, and for-profit companies have raised the bar year after year by indexing the web more and more often. It used to be catalogued once a week, or once a day. Now it's once an hour, or even more often. The high cost of running these crawls has become a competitive weapon.
