On the fifth floor of the University of Washington's Paul Allen Center, associate professor Oren Etzioni is demonstrating a new search engine that could change the way we scour the Internet. Dubbed KnowItAll, it's what Etzioni calls an information carnivore—it feeds off commercial search engines, such as Google and Yahoo!, and spits out answers to specific questions.
KnowItAll learns about a subject by building a database of facts. Etzioni and a dozen other researchers download tens of thousands of Web pages and issue a query, such as "Seattle." KnowItAll then extracts all facts pertaining to Seattle, collecting about 12 per second, and stores them in a database. When a user asks KnowItAll a question, such as, "What is the population of Seattle?" the search engine pulls the answer from its reservoir of information. "The next goal is to collect 10 million facts by the end of the year, and in two years we'll be able to put KnowItAll on the Web in a limited form," says Etzioni.
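For readers who want to picture the extract-then-answer loop, here is a minimal Python sketch. It is not KnowItAll's actual method, which the researchers have not detailed here. The sentence pattern, the function names (`extract_facts`, `build_fact_store`, `answer`), and the two sample pages are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern (an assumption, not KnowItAll's real extractor):
# matches sentences shaped like "The <attribute> of <Subject> is <value>."
FACT_PATTERN = re.compile(
    r"[Tt]he (?P<attribute>[a-z ]+?) of (?P<subject>[A-Z][\w ]*?) is (?P<value>[^.]+)\."
)


def extract_facts(text):
    """Return (subject, attribute, value) triples found in a page of text."""
    return [
        (m.group("subject"), m.group("attribute"), m.group("value").strip())
        for m in FACT_PATTERN.finditer(text)
    ]


def build_fact_store(pages):
    """Collect facts from every page into a subject -> attribute -> value map."""
    store = defaultdict(dict)
    for page in pages:
        for subject, attribute, value in extract_facts(page):
            # Keep the first value seen for each (subject, attribute) pair.
            store[subject.lower()].setdefault(attribute.lower(), value)
    return store


def answer(store, subject, attribute):
    """Look up a stored fact; return None when nothing was collected."""
    return store.get(subject.lower(), {}).get(attribute.lower())


# Made-up sample pages standing in for downloaded Web pages.
pages = [
    "The nickname of Seattle is the Emerald City. Coffee is popular there.",
    "The home state of Seattle is Washington.",
]
store = build_fact_store(pages)
print(answer(store, "Seattle", "nickname"))
```

The point of the sketch is the division of labor: the slow work of reading pages happens ahead of time, so a question becomes a simple lookup rather than a fresh search.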
Though rarely seen as a center of search-engine innovation, Seattle has a long history of advancing the technology. As Microsoft prepares to enter the search business with its own software, and with UW graduates working at industry leader Google, the university's legacy takes on particular importance.
In 1994, companies were staking their claim in the search market faster than you could say venture capitalism, and two of the first search engines, MetaCrawler and WebCrawler, were hatched in UW's computer science and engineering department. WebCrawler, designed by Brian Pinkerton when he was a graduate student, was released to the public in 1994. A year later, America Online gobbled it up. Since the university did not fund the project, Pinkerton pocketed "a nice chunk of cash" and was given a job at AOL. The search engine changed hands again in 1997 when AOL sold it to Excite. After Excite went belly-up in 2001, InfoSpace purchased WebCrawler.
In 1995, Etzioni and then–graduate student Erik Selberg created MetaCrawler, which queries 12 search engines and combs the results to produce a more relevant list. MetaCrawler was originally licensed to a UW startup called NetBot. Selberg, Etzioni, and the university were each rewarded with stock in NetBot, which was swallowed by Excite in 1997 for $35 million in stock. InfoSpace purchased MetaCrawler two years later and continues to pay UW royalties. As of March 1, the search engine had generated $106,000 for the university, said James Severson, vice provost of UW's technology transfer office. In 2001, UW had 414 licenses generating $25 million in income, according to Ed Lazowska, chair of the computer science department. Both MetaCrawler and WebCrawler still operate today.
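The metasearch idea, asking several engines the same question and folding their answers into one list, can be sketched in a few lines of Python. This is an illustration only: the scoring rule used here (summing the reciprocal of each result's rank) is a common textbook choice, not necessarily what MetaCrawler does, and the function name `merge_results` and the example URLs are invented.

```python
from collections import defaultdict


def merge_results(ranked_lists):
    """Merge several ranked result lists into one de-duplicated ranking.

    Each URL earns 1/rank from every engine that returned it, so pages
    ranked highly by several engines rise to the top. (Reciprocal-rank
    scoring is an assumption chosen for this sketch.)
    """
    scores = defaultdict(float)
    for results in ranked_lists:
        seen = set()
        for rank, url in enumerate(results, start=1):
            if url in seen:
                continue  # count each URL once per engine
            seen.add(url)
            scores[url] += 1.0 / rank
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Made-up result lists standing in for three underlying search engines.
engine_a = ["a.example/seattle", "b.example/rain", "c.example/coffee"]
engine_b = ["b.example/rain", "a.example/seattle", "d.example/ferries"]
engine_c = ["b.example/rain", "c.example/coffee"]

merged = merge_results([engine_a, engine_b, engine_c])
print(merged)
```

A page that two or three engines agree on outranks one that a single engine happens to favor, which is the intuition behind calling the merged list more relevant.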
After Google entered the search-engine market in 1998, it quickly became the leading site. But this year marked a new, more competitive phase in the search market as both Yahoo! and MSN broke ties with Google, whose technology they had licensed for use on their own sites, and developed software of their own. In addition, several startup services have cropped up in response to Google's problems with webmasters who try to circumvent its page-ranking method of counting links to a given site.
To fight off competition, Google is fortifying with numerous researchers and engineers, including 60 Ph.D.s—10 of whom cut their teeth at UW. The Mountain View, Calif., company also employs a dozen lesser-degreed Huskies. "Google has been the most aggressive recruiter of students in the past three years," said Lazowska.
Selberg and Pinkerton agree that this round of search-engine competition will be different from the last, for one big reason: money. "The cost to enter the search-engine space is enormous," says Pinkerton, who owns a search consulting business in the Bay Area. "When I did it, it was as a grad student with a top PC and time to program. You can't do that now." Says Selberg, a program manager with the MSN search team: "The hardware and infrastructure is a significant investment. It's not something easy to do with a bootstrap operation anymore."
On April 29, Google filed for a public stock offering and announced it would raise more than $2 billion—a nice-sized war chest to fend off competitors. But startups and brand-name companies are willing to spend money, too, to capture a portion of paid-placement advertising revenue. The advertising revenue generated by search queries will reach nearly $7 billion in three years, says Piper Jaffray analyst Safa Rashtchy. Everyone wants a piece of the pie, including Microsoft.
Microsoft is working on integrating several search features into its next operating system, code-named Longhorn. Microsoft researcher Eric Brill told Technology Review magazine that people should be able to enter a question and be supplied with an answer, rather than trying to craft search phrases with words such as "and," "or," and "not." Brill and others have been testing a program called AskMSR for the past year. AskMSR can answer simple fact-based questions and is just one of the technologies the Redmond company is testing.
AskMSR also has roots in search research at UW: it builds on principles previously employed by an internal university question-and-answer system called Mulder, says Etzioni. As for UW's new technology, KnowItAll, it will likely follow the same route into the commercial market as WebCrawler, MetaCrawler, and Mulder, Etzioni says. And although KnowItAll is in early development, at least one search contender, Yahoo!, has shown interest.