Search: Are You Feeling Lucky?
I’m at the part where Google is about to snag Yahoo’s search business away from Inktomi in the early 2000s. That got me thinking: when was search technology first developed? Who had the first site? AltaVista, AOL, Yahoo, AskJeeves, LookSmart?
Well, my friends, today is your lucky day. That’s because I’ve shamelessly scraped the history of search from none other than SearchEngineHistory.com. No one should have this much fun. I know, you’re not worthy!
By December of 1993, three full-fledged, bot-fed search engines had surfaced on the web: JumpStation, the World Wide Web Worm, and the Repository-Based Software Engineering (RBSE) spider.
Excite came from the project Architext, which was started in February 1993 by six Stanford undergrad students. They had the idea of using statistical analysis of word relationships to make searching more efficient. They were soon funded, and in mid 1993 they released copies of their search software for use on web sites.
Excite was bought by a broadband provider named @Home in January, 1999 for $6.5 billion, and was named Excite@Home. In October, 2001 Excite@Home filed for bankruptcy. InfoSpace bought Excite from bankruptcy court for $10 million.
When Tim Berners-Lee set up the web he created the Virtual Library, which became a loose confederation of topical experts maintaining relevant topical link lists.
The EINet Galaxy web directory was born in January of 1994. It was organized similarly to how web directories are today. The biggest reason the EINet Galaxy became a success was that it also contained Gopher and Telnet search features in addition to its web search feature. The web size in early 1994 did not really require a web directory; however, other directories soon did follow.
In April 1994 David Filo and Jerry Yang created the Yahoo! Directory as a collection of their favorite web pages. As their number of links grew they had to reorganize and become a searchable directory. What set the directories above The Wanderer is that they provided a human compiled description with each URL. As the Yahoo! Directory grew, Yahoo! began charging commercial sites for inclusion, and over time the inclusion rates for listing a commercial site increased. The current cost is $299 per year. Many informational sites are still added to the Yahoo! Directory for free.
Open Directory Project
In 1998 Rich Skrenta and a small group of friends created the Open Directory Project, a directory which anybody can download and use in whole or part. The ODP (also known as DMOZ) is the largest internet directory, almost entirely run by a group of volunteer editors. The Open Directory Project grew out of the frustration webmasters faced waiting to be included in the Yahoo! Directory. Netscape bought the Open Directory Project in November, 1998. Later that same month AOL announced the intention of buying Netscape in a $4.5 billion all stock deal.
Google offers a librarian newsletter to help librarians and other web editors help make information more accessible and categorize the web. The second Google librarian newsletter came from Karen G. Schneider, who is the director of Librarians’ Internet Index. LII is a high quality directory aimed at librarians. Her article explains what she and her staff look for when looking for quality credible resources to add to the LII. Most other directories, especially those which have a paid inclusion option, hold lower standards than selected limited catalogs created by librarians.
The Internet Public Library is another well kept directory of websites.
Due to the time intensive nature of running a directory, and the general lack of scalability of the business model, the quality and size of directories sharply drops off after you get past the first half dozen or so general directories. There are also numerous smaller industry, vertically, or locally oriented directories. Business.com, for example, is a directory of business websites.
Looksmart was founded in 1995. They competed with the Yahoo! Directory, with the two frequently raising their inclusion rates back and forth. In 2002 Looksmart transitioned into a pay per click provider, which charged listed sites a flat fee per click. That caused the demise of any good faith or loyalty they had built up, although it allowed them to profit by syndicating those paid listings to some major portals like MSN. The problem was that Looksmart became too dependent on MSN, and in 2003, when Microsoft announced they were dumping Looksmart, that basically killed their business model.
In March of 2002, Looksmart bought a search engine by the name of WiseNut, but it never gained traction. Looksmart also owns a catalog of content articles organized in vertical sites, but due to limited relevancy Looksmart has lost most (if not all) of their momentum. In 1998 Looksmart tried to expand their directory by buying the non commercial Zeal directory for $20 million, but on March 28, 2006 Looksmart shut down the Zeal directory, hoping to drive traffic using Furl, a social bookmarking program.
Search Engines vs Directories:
All major search engines have some limited editorial review process, but the bulk of relevancy at major search engines is driven by automated search algorithms which harness the power of the link graph on the web. In fact, some algorithms, such as TrustRank, bias the web graph toward trusted seed sites without requiring a search engine to take on much of an editorial review staff. Thus, some of the more elegant search engines allow those who link to other sites to in essence vote with their links as the editorial reviewers.
Unlike highly automated search engines, directories are manually compiled taxonomies of websites. Directories are far more cost and time intensive to maintain due to their lack of scalability and the necessary human input to create each listing and periodically check the quality of the listed websites.
General directories are largely giving way to expert vertical directories, temporal news sites (like blogs), and social bookmarking sites (like del.icio.us). In addition, each of those three publishing formats I just mentioned also aids in improving the relevancy of major search engines, which further cuts at the need for (and profitability of) general directories.
Brian Pinkerton of the University of Washington released WebCrawler on April 20, 1994. It was the first crawler which indexed entire pages. Soon it became so popular that during daytime hours it could not be used. AOL eventually purchased WebCrawler and ran it on their network. Then in 1997, Excite bought out WebCrawler, and AOL began using Excite to power its NetFind. WebCrawler opened the door for many other services to follow suit. Within a year of its debut came Lycos, Infoseek, and OpenText.
Lycos was the next major search development, having been designed at Carnegie Mellon University around July of 1994. Michael Mauldin was responsible for this search engine and remains the chief scientist at Lycos Inc.
On July 20, 1994, Lycos went public with a catalog of 54,000 documents. In addition to providing ranked relevance retrieval, Lycos provided prefix matching and word proximity bonuses. But Lycos’ main difference was the sheer size of its catalog: by August 1994, Lycos had identified 394,000 documents; by January 1995, the catalog had reached 1.5 million documents; and by November 1996, Lycos had indexed over 60 million documents, more than any other Web search engine. In October 1994, Lycos ranked first on Netscape’s list of search engines by finding the most hits on the word ‘surf.’
Infoseek also started out in 1994, claiming to have been founded in January. They really did not bring a whole lot of innovation to the table, but they offered a few add-ons, and in December 1995 they convinced Netscape to use them as their default search, which gave them major exposure. One popular feature of Infoseek was allowing webmasters to submit a page to the search index in real time, which was a search spammer’s paradise.
AltaVista’s online debut came during this same month. AltaVista brought many important features to the web scene. They had nearly unlimited bandwidth (for that time), they were the first to allow natural language queries and advanced searching techniques, and they allowed users to add or delete their own URL within 24 hours. They even allowed inbound link checking. AltaVista also provided numerous search tips and advanced search features.
Due to mismanagement, a fear of result manipulation, and portal related clutter, AltaVista was largely driven into irrelevancy around the time Inktomi and Google started becoming popular. On February 18, 2003, Overture signed a letter of intent to buy AltaVista for $80 million in stock and $60 million cash. After Yahoo! bought out Overture they rolled some of the AltaVista technology into Yahoo! Search, and occasionally use AltaVista as a testing platform.
The Inktomi Corporation came about on May 20, 1996 with its search engine Hotbot. Two Cal Berkeley cohorts created Inktomi from the improved technology gained from their research. Hotwire listed this site and it became hugely popular quickly.
In October of 2001 Danny Sullivan wrote an article titled Inktomi Spam Database Left Open To Public, which highlights how Inktomi accidentally allowed the public to access their database of spam sites, which listed over 1 million URLs at that time.
Although Inktomi pioneered the paid inclusion model, it was nowhere near as efficient as the pay per click auction model developed by Overture. Licensing their search results also was not profitable enough to pay for their scaling costs. They failed to develop a profitable business model, and sold out to Yahoo! for approximately $235 million, or $1.65 a share, in December of 2002.
Ask.com (Formerly Ask Jeeves):
In April of 1997 Ask Jeeves was launched as a natural language search engine. Ask Jeeves used human editors to try to match search queries. Ask was powered by DirectHit for a while, which aimed to rank results based on their popularity, but that technology proved too easy to spam as the core algorithm component. In 2000 the Teoma search engine was released, which uses clustering to organize sites by Subject Specific Popularity, which is another way of saying they tried to find local web communities. In 2001 Ask Jeeves bought Teoma to replace the DirectHit search technology.
Jon Kleinberg’s Authoritative sources in a hyperlinked environment [PDF] was a source of inspiration that led to the eventual creation of Teoma. Mike Grehan’s Topic Distillation [PDF] also explains how subject specific popularity works.
On March 4, 2004, Ask Jeeves agreed to acquire Interactive Search Holdings for 9.3 million shares of common stock and options and pay $150 million in cash. On March 21, 2005 Barry Diller’s IAC agreed to acquire Ask Jeeves for 1.85 billion dollars. IAC owns many popular websites like Match.com, Ticketmaster.com, and Citysearch.com, and is promoting Ask across their other properties. In 2006 Ask Jeeves was renamed to Ask, and they killed the separate Teoma brand.
AllTheWeb was a search technology platform launched in May of 1999 to showcase Fast’s search technologies. They had a sleek user interface with rich advanced search features, but on February 23, 2003, AllTheWeb was bought by Overture for $70 million. After Yahoo! bought out Overture they rolled some of the AllTheWeb technology into Yahoo! Search, and occasionally use AllTheWeb as a testing platform.
Meta Search Engines
Most meta search engines draw their search results from multiple other search engines, then combine and rerank those results. This was a useful feature back when search engines were less savvy at crawling the web and each engine had a significantly unique index. As search has improved the need for meta search engines has been reduced.
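The combine-and-rerank step can be sketched in a few lines. This is only an illustration: it uses reciprocal rank fusion, a generic merging technique that the source does not attribute to any particular meta search engine, and the function name `merge_results`, the constant `k`, and the example URLs are all made up for the sketch.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Combine ranked URL lists from several engines into one ranking.

    Assumption: reciprocal rank fusion. Each URL earns 1 / (k + rank)
    from every engine that returned it, so pages that several engines
    rank highly float to the top. The value of k is illustrative.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from two engines with partly different indexes.
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["b.example", "d.example", "a.example"]
merged = merge_results([engine_a, engine_b])
print(merged)
```

Pages found by both engines outrank pages found by only one, which is the benefit meta search offered when each engine's index was significantly different.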
Hotbot was owned by Wired, had funky colors, fast results, and a cool name that sounded geeky, but died off not long after Lycos bought it and ignored it. Upon rebirth it was born as a meta search engine. Unlike most meta search engines, Hotbot only pulls results from one search engine at a time, but it allows searchers to select amongst a few of the more popular search engines on the web. Currently Dogpile, owned by Infospace, is probably the most popular meta search engine on the market, but like all other meta search engines, it has limited market share.
One of the larger problems with meta search in general is that most meta search engines tend to mix pay per click ads in their organic search results, and for some commercial queries 70% or more of the search results may be paid results. I also created Myriad Search, which is a free open source meta search engine without ads.
Pay Per Click
Pay per click ads allow search engines to sell targeted traffic to advertisers on a cost per click basis. Typically pay per click ads are keyword targeted, but in some cases, some engines may also add in local targeting, behavioral targeting, or allow merchants to bid on traffic streams based on demographics as well.
Pay per click ads are typically sold in an auction where the highest bidder ranks #1 for that keyword. Some engines, like Google and Microsoft, also factor ad clickthrough rate into the click cost. Doing so ensures their ads get clicked on more frequently, and that their advertisements are more relevant. A merchant who writes compelling ad copy and gets a high CTR will be allowed to pay less per click to receive traffic.
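Here is a rough sketch of how factoring clickthrough rate into an auction changes the outcome. It assumes ads are ordered by bid multiplied by CTR, and that each advertiser pays just enough to hold its position. That pricing rule, the `min_price` floor, and all the names and numbers are assumptions for illustration, not the documented formula of any engine.

```python
def rank_ads(ads):
    """Order ads by bid * CTR, i.e. expected revenue per impression."""
    return sorted(ads, key=lambda ad: ad["bid"] * ad["ctr"], reverse=True)


def click_costs(ranked, min_price=0.05):
    """Assumed pricing rule: pay the least that still beats the next ad.

    The last ad has nobody below it, so it pays the (hypothetical)
    minimum price.
    """
    costs = {}
    for position, ad in enumerate(ranked):
        if position + 1 < len(ranked):
            below = ranked[position + 1]
            cost = below["bid"] * below["ctr"] / ad["ctr"]
        else:
            cost = min_price
        costs[ad["name"]] = round(cost, 2)
    return costs


# Hypothetical advertisers: B bids less than A but writes better ad copy.
ads = [
    {"name": "A", "bid": 1.00, "ctr": 0.02},
    {"name": "B", "bid": 0.60, "ctr": 0.05},
    {"name": "C", "bid": 0.50, "ctr": 0.02},
]
ranked = rank_ads(ads)
costs = click_costs(ranked)
print([ad["name"] for ad in ranked], costs)
```

In this toy auction B outranks A despite the lower bid, and pays less per click than A does, which is the effect described above for merchants with compelling ad copy.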
In 1996 an 18-year-old college dropout named Scott Banister came up with the idea of charging search advertisers by the click with ads tied to the search keyword. He promoted it to the likes of Yahoo!, but their (lack of) vision was corrupted by easy money, so they couldn’t see the potential of search. The person who finally ran with Mr. Banister’s idea was IdeaLab’s Bill Gross.
Overture (Formerly GoTo)
Overture, the pioneer in paid search, was originally launched by Bill Gross under the name GoTo in 1998. His idea was to arbitrage traffic streams and sell them with a level of accountability. John Battelle’s The Search has an entertaining section about Bill Gross and the formation of Overture. John also published that section on his blog.
“The more I [thought about it], the more I realized that the true value of the Internet was in its accountability,” Gross tells me. “Performance guarantees had to be the model for paying for media.”
Gross knew offering virtually risk-free clicks in an overheated and ravenous market ensured GoTo would take off. And while it would be easy to claim that GoTo worked because of the Internet bubble’s ouroboros-like hunger for traffic, the company managed to outlast the bust for one simple reason: it worked.
While Overture was wildly successful, it had two major shortcomings which prevented it from taking Google’s market position:
- Destination Branding: Google allowed itself to grow into a search destination. Bill Gross decided not to grow Overture into one because he feared that would cost him distribution partnerships. When AOL selected Google as an ad partner, in spite of Google also growing out their own brand, that pretty much was the nail in the coffin for Overture being the premier search ad platform.
- Ad Network Efficiency: Google AdWords factors ad clickthrough rate into their ad costs, which ensures higher relevancy and more ad network efficiency. As of September 2006 the Overture platform (then known as Yahoo! Search Marketing) still had not fixed that problem.
Those two faults meant that Overture was heavily reliant on its two largest distribution partners – Yahoo! and Microsoft. Overture bought out AltaVista and AllTheWeb to try to win some leverage, but ultimately they sold out to Yahoo! on July 14, 2003 for $1.63 billion.
Google AdWords launched in 2000. The initial version was a failure because it priced ads on a flat CPM model. Some keywords were overpriced and unaffordable, while others were sold inefficiently at too cheap of a price. In February of 2002, Google relaunched AdWords selling the ads in an auction similar to Overture’s, but also adding ad clickthrough rate in as a factor in the ad rankings.
Affiliates and other web entrepreneurs quickly took to AdWords because the precise targeting and great reach made it easy to make great profits from the comfort of your own home, while sitting in your underwear 🙂
Over time, as AdWords became more popular and more mainstream marketers adopted it, Google began closing some holes in their AdWords product. For example, to fight off noise and keep their ads as relevant as possible, they disallowed double serving of ads to one website. Later they started looking at landing page quality and establishing quality based minimum pricing, which squeezed the margins of many small arbitrage and affiliate players.
On March 20, 2007, Google announced they were beta testing creating a distributed pay per action affiliate ad network. On April 13, 2007 Google announced the purchase of DoubleClick for $3.1 billion.
On March 4, 2003 Google announced their content targeted ad network. In April 2003, Google bought Applied Semantics, which had CIRCA technology that allowed them to drastically improve the targeting of those ads. Google adopted the name AdSense for the new ad program.
AdSense allows web publishers large and small to automate the placement of relevant ads on their content. Google initially started off by allowing textual ads in numerous formats, but eventually added image ads and video ads. Advertisers could choose which keywords they wanted to target and which ad formats they wanted to market.
To help grow the network and make the market more efficient Google added a link which allows advertisers to sign up for an AdWords account from content websites, and Google allowed advertisers to buy ads targeted to specific websites, pages, or demographic categories. Ads targeted on websites are sold on a cost per thousand impression (CPM) basis in an ad auction against other keyword targeted and site targeted ads.
Google also allows some publishers to place AdSense ads in their feeds, and some select publishers can place ads in emails.
To prevent the erosion of value of search ads Google allows advertisers to opt out of placing their ads on content sites, and Google also introduced what they called smart pricing. Smart pricing automatically adjusts the click cost of an ad based on what Google perceives a click from that page to be worth. A click from a digital camera review page would typically be worth more than a click from a page with pictures on it.
Google was secretive about its revenue share since the inception of AdSense, but due to a lawsuit in Italy Google feared they would be stuck disclosing their revenue share, so they decided to do so publicly for good public relations on May 24, 2010. Google keeps 32% while giving publishers 68% of contextual ad revenues. On search ads Google keeps 49% and gives publishers 51%. Some premium publishers are able to negotiate higher rates & custom integration options as well.
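As a quick worked example of those percentages, the split on a given amount of ad revenue can be computed like this. The publisher shares are the ones quoted above, while the function name `split_revenue` and the $100 figure are purely illustrative.

```python
# Publisher shares as disclosed in May 2010 (from the text above).
PUBLISHER_SHARE = {"content": 0.68, "search": 0.51}


def split_revenue(ad_revenue, ad_type):
    """Return (publisher_cut, google_cut) for the given ad revenue."""
    publisher = round(ad_revenue * PUBLISHER_SHARE[ad_type], 2)
    google = round(ad_revenue - publisher, 2)
    return publisher, google


# Hypothetical: $100 of advertiser spend on each ad type.
content_split = split_revenue(100.0, "content")
search_split = split_revenue(100.0, "search")
print(content_split, search_split)
```

Premium publishers with negotiated rates would not fit this table, so treat it as the default case only.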
Yahoo! Search Marketing
Yahoo! Search Marketing is the rebranded name for Overture after Yahoo! bought them out. As of September 2006 their platform is generally the exact same as the old Overture platform, with the same flaws – ad CTR not factored into click cost, it’s hard to run local ads, and it is just generally clunky.
In 2000 Microsoft launched a keyword driven ad program called Keywords, but shut it down after 2 months because they feared it would cannibalize their banner ad revenues.
Microsoft AdCenter was launched on May 3, 2006. While Microsoft has limited marketshare, they intend to increase their marketshare by baking search into Internet Explorer 7. On the features front, Microsoft added demographic targeting and dayparting features to the pay per click mix. Microsoft’s ad algorithm includes both cost per click and ad clickthrough rate.
Microsoft also created the XBox game console, and on May 4, 2006 announced they bought a video game ad targeting firm named Massive Inc. Eventually video game ads will be sold from within Microsoft AdCenter.
Search Engine Optimization
What is SEO?
Search engine optimization is the art and science of publishing information in a format which will make search engines believe that your content satisfies the needs of their users for relevant search queries. SEO, like search, is a field much older than I am. In fact, it was not originally even named search engine optimization, and to this day most people are still uncertain where that phrase came from.
Early search engine optimization consisted mostly of using descriptive file names, page titles, and meta descriptions. As search advanced, on-the-page factors grew more important, and then people started trying to aim for specific keyword densities.
One of the big things that gave Google an advantage over their competitors was the introduction of PageRank, which graded the value of a page based on the number and quality of links pointing at it. Up until the end of 2003 search was exceptionally easy to manipulate. If you wanted to rank for something all you had to do was buy a few powerful links and place the words you wanted to rank for in the link anchor text.
Search Gets More Sophisticated
On November 15, 2003 Google began to heavily introduce many more semantic elements into its search product. Researchers and SEOs alike have noticed wild changes in search relevancy during that update and many times since then, but many searchers remain clueless to the changes.
Search engines would prefer to bias search results toward informational resources to make the commercial ads on the search results appear more appealing. You can see an example of how search can be biased toward commercial or informational resources by playing with Yahoo! Mindset.
Curbing Link Spam
On January 18, 2005, Google, MSN, and Yahoo! announced the release of a NoFollow tag which allows blog owners to block comment spam from passing link popularity. People continued to spam blogs and other resources, largely because search engines may still count some nofollow links, and largely because many of the pages they spammed still rank.
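In practice, nofollow is applied as a `rel="nofollow"` attribute on the link itself. A minimal sketch of how blog software might render a commenter's link follows; the function name `render_comment_link` and the example URL are hypothetical, and real blog platforms handle this in their own templates.

```python
from html import escape


def render_comment_link(url, text):
    """Build an anchor tag for user-submitted links, marked nofollow.

    Escaping the URL and the link text keeps a commenter from
    injecting markup of their own.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(text)
    )


link = render_comment_link("http://example.com/?a=1&b=2", "my site")
print(link)
```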
Since 2003 Google has come out with many advanced filters and crawling patterns to help make quality editorial links count more and depreciate the value of many overtly obvious paid links or other forms of link manipulation.
Historical, Editorial, & Usage Data
Older websites may be given more trust in relevancy algorithms than newer websites (just existing for a period of time is a signal of quality). All major search engines use human editors to help review content quality and help improve their relevancy algorithms. Search engines may factor in user acceptance and other usage data to help determine if a site needs to be reviewed for editorial quality and to help determine if linkage data is legitimate.
Google has also heavily pushed giving away useful software, tools, and services which allow them to personalize search results based on the searcher’s historical preferences.
Self Reinforcing Market Positions
In many verticals search is self reinforcing, as in a winner take most battle. Jakob Nielsen’s The Power of Defaults notes that the top search result is clicked on as often as 42% of the time. Not only is the distribution and traffic stream highly disproportionate, but many people tend to link to the results that were easy to find, which makes the system even more self reinforcing, as noted in Mike Grehan’s Filthy Linking Rich.
A key thing to remember if you are trying to catch up with another website is that you have to do better than what was already done, and significantly enough better that it is comment worthy or citation worthy. You have to make people want to switch their world view to seeing you as an authority on your topic. Search engines will follow what people think.
Hypocrisy in Search
Google engineer Matt Cutts frequently comments that any paid link should have the nofollow attribute applied to it, although Google hypocritically does not place the nofollow attribute on links they buy. They also have placed their ads on the leading Warez site and continued to serve ads on sites that they banned for spamming. Yahoo! Shopping has also been known to be a big link buyer.
Much of the current search research is based upon the view that any form of marketing / promotion / SEO is spam. If that were true, it wouldn’t make sense that Google is teaching SEO courses, which they do.
Google’s corporate history page has a pretty strong background on Google, starting from when Larry met Sergey at Stanford right up to present day. In 1995 Larry Page met Sergey Brin at Stanford.
By January of 1996, Larry and Sergey had begun collaboration on a search engine called BackRub, named for its unique ability to analyze the “back links” pointing to a given website. Larry, who had always enjoyed tinkering with machinery and had gained some notoriety for building a working printer out of Lego™ bricks, took on the task of creating a new kind of server environment that used low-end PCs instead of big expensive machines. Afflicted by the perennial shortage of cash common to graduate students everywhere, the pair took to haunting the department’s loading docks in hopes of tracking down newly arrived computers that they could borrow for their network.
A year later, their unique approach to link analysis was earning BackRub a growing reputation among those who had seen it. Buzz about the new search technology began to build as word spread around campus.
BackRub ranked pages using citation notation, a concept which is popular in academic circles. If someone cites a source they usually think it is important. On the web, links act as citations. In the PageRank algorithm links count as votes, but some votes count more than others. Your ability to rank and the strength of your ability to vote for others depends upon your authority: how many people link to you and how trustworthy those links are.
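The idea that links are votes, and that votes from well-linked pages count for more, can be sketched with a simple iterative calculation. This is a toy version under stated assumptions: the damping factor of 0.85, the fixed iteration count, the handling of pages with no outbound links, and the four-page link graph are illustrative choices, not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its voting power among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its vote evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web: a, b, and d all cite c, and c cites only a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Page c wins because three pages vote for it. Page a has only one inbound link, but that link comes from the highly ranked c, so it ends up far ahead of b and d, which nobody links to.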
In 1998, Google was launched. Sergey tried to shop their PageRank technology, but nobody was interested in buying or licensing their search technology at that time.
Winning the Search War
Later that year Andy Bechtolsheim gave them $100,000 seed funding, and Google received $25 million from Sequoia Capital and Kleiner Perkins Caufield & Byers the following year. In 1999 AOL selected Google as a search partner, and Yahoo! followed suit a year later. In 2000 Google also launched their popular Google Toolbar. Google has gained search market share year over year ever since.
In 2000 Google launched their AdWords program, selling ads on a CPM basis. In 2002 they retooled the service, selling ads in an auction which would factor in bid price and ad clickthrough rate. On May 1, 2002, AOL announced they would use Google to deliver their search related ads, which was a strong turning point in Google’s battle against Overture.
In 2003 Google also launched their AdSense program, which allowed them to expand their ad network by selling targeted ads on other websites.
Google used a two class stock structure, decided not to give earnings guidance, and offered shares of their stock in a Dutch auction. They received virtually limitless negative press for the perceived hubris they expressed in their “AN OWNER’S MANUAL” FOR GOOGLE’S SHAREHOLDERS. After some controversy surrounding an interview in Playboy, Google dropped their IPO offer range to $85 to $95 per share from $108 to $135. Google went public at $85 a share on August 19, 2004 and its first trade was at 11:56 am ET at $100.01.
In addition to running the world’s most popular search service, Google also runs a large number of vertical search services, including:
- Google News: Google News launched in beta in September 2002. On September 6, 2006, Google announced an expanded Google News Archive Search that goes back over 200 years.
- Google Book Search: On October 6, 2004, Google launched Google Book Search.
- Google Scholar: On November 18, 2004, Google launched Google Scholar, an academic search program.
- Google Blog Search: On September 14, 2005, Google announced Google Blog Search.
- Google Base: On November 15, 2005, Google announced the launch of Google Base, a database of uploaded information describing online or offline content, products, or services.
- Google Video: On January 6, 2006, Google announced Google Video.
- Google Universal Search: On May 16, 2007 Google began mixing many of their vertical results into their organic search results.
Just Search, We Promise!
Google’s corporate mission statement is:
Google’s mission is to organize the world’s information and make it universally accessible and useful.
However that statement includes many things outside of the traditional mindset of search, and Google maintains that ads are a type of information. This other information includes:
- Email: Google launched Gmail on March 31, 2004, offering email search and gigabytes of storage space.
- Maps: On October 27, 2004, Google bought Keyhole. On February 8, 2005, Google launched Google Maps.
- Analytics: On March 29, 2005, Google bought Urchin, a website traffic analytics company. Google renamed the service Google Analytics.
- Radio ads: Google bought dMarc Broadcasting on January 17, 2006.
- Ads in other formats: Google tested magazine ads and newspaper ads.
- Office productivity software: on March 9, 2006, Google bought Writely, an online collaborative document creating and editing software product.
- Calendar: on April 14, 2006, Google launched Google Calendar, which allows you to share calendars with multiple editors and include calendars in web pages.
- Checkout: On June 29, 2006, Google launched Google Checkout, a way to store your personal transaction related information online.
Paying for Distribution
In addition to having strong technology and a strong brand Google also pays for a significant portion of their search market share.
On December 20, 2005 Google invested $1 billion in AOL to continue their partnership and buy a 5% stake in AOL. In February 2006 Google agreed to pay Dell up to $1 billion for 3 years of toolbar distribution. On August 7, 2006, Google signed a 3 year deal to provide search on MySpace for $900 million. On October 9, 2006 Google bought YouTube, a leading video site, for $1.65 billion in stock.
Google also pays Mozilla and Opera hundreds of millions of dollars to be the default search provider in their browsers, bundles their Google Toolbar with software from Adobe and Sun Microsystems, and pays AdSense ad publishers $1 for Firefox + Google Toolbar installs, or up to $2 for Google Pack installs.
Google also builds brand exposure by placing Ads by Google on their AdSense ads and providing Google Checkout to commercial websites.
Google Pack is a package of useful software including a Google Toolbar and software from many other companies. At the same time Google helps ensure its toolbar is considered good and its competitors don’t use sleazy distribution techniques by sponsoring StopBadware.org.
Google’s distribution, vertical search products, and other portal elements give it a key advantage in best understanding our needs and wants by giving them the largest Database of Intentions.
They have moved away from a pure algorithmic approach to a hybrid editorial approach. In April of 2007, Google started mixing recent news results in their organic search results. After Google bought YouTube they started mixing videos directly in Google search results.
Since the Florida update in 2003 Google has looked much deeper into linguistics and link filtering. Google’s search results are generally the hardest search results for the average webmaster to manipulate.
Matt Cutts, Google’s lead engineer in charge of search quality, regularly blogs about SEO and search. Google also has an official blog and has blogs specific to many of their vertical search products.
Google also helps webmasters understand how Google is indexing their site via Google Webmaster Central. Google continues to add features and data to their webmaster console for registered webmasters while obfuscating publicly available data.
For an informal look at what working at Google looked like from the inside from 1999 to 2005 you might want to try Xooglers, a blog by former Google brand manager Doug Edwards.
Information Retrieval as a Game of Mind Control
In October of 2007 Google attempted to manipulate the public perception of people buying and selling links by announcing that they were going to penalize known link sellers, and then manually editing the toolbar PageRank scores of some well known blogs and other large sites. These PageRank edits did not change search engine rankings or traffic flows, as the PageRank update was entirely aesthetic.
Getting Into Search
Yahoo! was founded in 1994 by David Filo and Jerry Yang as a directory of websites. For many years they outsourced their search service to other providers, considering it secondary to their directory and other content features, but by the end of 2002 they realized the importance and value of search and started aggressively acquiring search companies.
Overture purchased AllTheWeb and AltaVista in 2003. Yahoo! purchased Inktomi in December, 2002, and then consumed Overture in July, 2003, and combined the technologies from the various search companies they bought to make a new search engine. Yahoo! dumped Google in favor of their own in house technology on February 17, 2004.
In addition to building out their core algorithmic search product, Yahoo! has largely favored the concept of social search.
On March 20, 2005 Yahoo! purchased Flickr, a popular photo sharing site. On December 9, 2005, Yahoo! purchased Del.icio.us, a social bookmarking site. Yahoo! has also made a strong push to promote Yahoo! Answers, a popular free community driven question answering service.
On July 2, 2007, Yahoo! launched their behaviorally targeted SmartAds product.
On July 29, 2009, Yahoo! decided to give up on search and signed a 10 year deal to syndicate Bing ads and algorithmic results on their website.
In 1998 MSN Search was launched, but Microsoft did not get serious about search until after Google proved the business model. Until Microsoft saw the light they primarily relied on partners like Overture, LookSmart, and Inktomi to power their search service.
They launched their technology preview of their search engine around July 1st of 2004. They formally switched from Yahoo! organic search results to their own in house technology on January 31st, 2005. MSN announced they dumped Yahoo!’s search ad program on May 4th, 2006.
On June 1, 2009, Microsoft launched Bing, a new search service which changed the search landscape by placing inline search suggestions for related searches directly in the result set. For instance, when you search for credit cards they will suggest related phrases like:
- credit card types
- apply for credit cards
- credit cards for bad credit
- advice on credit cards
Microsoft released a Bing SEO guide for Webmasters [PDF] which claimed that the additional keyword suggestions helped pull down search demand to lower listed results when compared against the old results 6 through 10 when using a single linear search result set. Conversely, the Google format tends to concentrate attention on the top few search listings. After extensive eye tracking Gord Hotchkiss named this pattern Google’s Golden Triangle.
One would be foolish to think that there is not a better way to index the web, and a new creative idea is probably just under our noses. The fact that Microsoft is making a large investment into developing a new search technology should be some cause for concern for other major search engines.
Through this course of history many smaller search engines have come and gone, as the search industry has struggled to find a balance between profitability and relevancy. Some of the newer search engine concepts are web site clustering, semantics, and having industry specific smaller search engines / portals, but search may get attacked from entirely different angles.
On October 5, 2004 Bill Gross (the founder of Overture and pioneer of paid search) relaunched Snap as a search engine with a completely transparent business model (showing search volumes, revenues, and advertisers). Snap has many advanced sorting features, but it may be a bit more than what most searchers are looking for. People tend to like search for its perceived simplicity, even if the behind the scenes process is quite complex.
Outside of technology there are four other frontiers search is being attacked / commoditized from
- Browser & Software Distribution: Search companies are paying computer manufacturers or software companies an aggregated value of hundreds of millions or billions of dollars each year to bundle their search toolbar with their products.
- Social Search: Large social networks have significant reach and a ton of page views. Yahoo! is rumored to be entertaining buying social network Facebook for nearly a billion dollars. Yahoo! has already bought social picture site Flickr and social bookmarking site Del.icio.us. In August of 2006 Google signed a 3 year $900 million contract to provide search and advertising on MySpace. In addition some companies, like Eurekster, are trying to create products which allow groups of webmasters to make topic or community specific search services.
- Content Providers: Some content providers are trying to publish content on their own domains and build off their brand. Some are refusing to be included in search indexes. Some are requiring a kickback to be indexed. Some are unsure of what they want and are choosing to sue search engines, either for further brand exposure, or to gain further negotiation leverage.
- Content Aggregators: Search is just one way of finding information. Via RSS feeds and various other technologies many sites are offering what some people consider persistent search, or a way to access any information about a specific topic as it becomes available. Google also bought YouTube for $1.65 billion in stock. YouTube consists largely of pirated content which Google can organize and publish ads against based on usage data and other forms of ad targeting.
Search & Legal Issues
In 2005 the DoJ obtained search data from AOL, MSN, and Yahoo!. Google denied the request, and was sued for search data in January of 2006. Google beat the lawsuit and was only required to hand over a small sample of data.
In August of 2006 AOL Research released over 3 months’ worth of personal search data from 650,000 AOL users. A NYT article identified one of the searchers by name. In 2007 the European Union aggressively probed search companies aiming to limit data retention and maintain searcher privacy rights.
Publishing & Copyright Lawsuits
As more people create content attention is becoming more scarce. Due to The Tragedy of the Commons many publishing businesses and business models will die. Many traditional publishing companies enjoyed the profits enabled by running what were essentially regionally based monopolies. Search, and other forms of online media, allow for better targeting and less wasteful / more efficient business models. Due to growing irrelevancy, a fear of change, and a fear of disintermediation, many traditional publishing companies have fought search.
In an interview by Danny Sullivan, Eric Schmidt stated he thought many of the lawsuits Google faces are business deals done in a court room.
In September of 2006 some Belgian newspaper companies won a copyright lawsuit against Google News which makes Belgian judges look like they do not understand how search or the internet work. Some publisher groups are trying to create an arbitrary information access protocol, Agence France Presse (AFP) sued Google to get them to drop their news coverage, and Google paid the AP a licensing fee.
Perfect 10, a pornography company, sued Google for including cached copies of stolen content in their image index, and for allowing publishers to collect income on stolen copyright content via Google AdSense.
Access to Hate Information
In May of 2000 a French judge required Yahoo! to stop providing access to auctions selling Nazi memorabilia.
Pay Per Click & Ad Targeting Lawsuits
In 1999 Playboy sued Excite and Netscape for selling banner impressions triggered by searches for Playboy.
Overture sued Google for patent infringement. Just prior to Google’s IPO they settled with Yahoo! (who by then had bought out Overture) by giving them 2.7 million shares of class A Google stock.
Geico took Google to court in the US for trademark violation because Google allowed Geico to be a keyword trigger to target competing ads. Geico lost this case on December 15, 2004. Around the same time Google lost a similar French trademark case filed by Louis Vuitton.
Lane’s Gifts sued Google for click fraud, but did not have a strong, well put together case. Google’s lawyers pushed them into a class wide out of court settlement of up to $90 million in AdWords credits. The March 2006 settlement aimed to absolve Google of any click fraud related liabilities back through 2002, when Google launched their pay per click model.
Search User Information
The US government requested that major search companies turn over a significant amount of search related data. Yahoo!, MSN, and AOL gave up search data. The Google blog announced that Google fought the subpoena:
In August, Google was served with a subpoena from the U. S. Department of Justice demanding disclosure of two full months’ worth of search queries that Google received from its users, as well as all the URLs in Google’s index.
A judge stated that Google did not have to turn over search usage data.
AOL not only shared information with the government, but AOL Research also accidentally made search records public.
Search as a Commoditizer
Each search company has its own business objectives and technologies to define relevancy. The three biggest issues search engines are fighting are
- Publishing Rights: All search engines are fighting to gain the rights to index quality content. Some of the highest quality content is so expensive to create and market that there is not a business model for openly sharing it on the web. Worse yet, as more and more people get into web publishing, the businesses that delay getting their content indexed will have lost authority and distribution the whole time they delayed. This, and the fear of disintermediation, are part of the reason there are so many lawsuits.
- Distribution: The more distribution you have the more profit you can use to leverage the ability to buy more content or make better content partnerships. Also more distribution means that you can potentially send more visitors (and thus profit) to a person who lets you index their content. More usage data may also help engines improve their relevancy algorithms.
- Ad Network Size & Efficiency: Efficient ad networks can afford to pay for more distribution, and thus help the search company gain more content and distribution.
In order to try to lock users in, search engines offer things like free email, news search, blogging platforms, content hosting, office software, calendars, and feature rich toolbars. In some cases the software or service is not only free, but it is expensive to provide. For example, Google does not profit from Google News, but they had to pay the AP content licensing fees, and hosting Google Video can’t be cheap.
In an attempt to collect more data, better target ads, and improve conversion rates Google offers
- a free analytics product
- free cross platform tracking
- free WiFi internet access in San Francisco and Mountain View
- a free wallet product which makes it quick and easy to buy products
The end goal of search is to commoditize the value of as many brands and markets as possible to keep adding value to the search box. They want to commoditize the value of creating content and increase the value of spreading ideas, the value of attention, and the importance of conversion.
As they make the network more and more efficient they can eat more and more of the profits, which was a large part of the reasoning behind Jakob Nielsen’s Search Engines as Leeches on the Web.
Selling Search as an Ecosystem
Because search aims to gain distribution by virtually any means possible, the search engines that can do the best job of branding and get people to believe most in their goals / ideals / ecosystem win. Search engines are fighting in many ways on this front, but not all of them are even on the web. For example, search engines are trying to attract the smartest minds by sharing research. Google goes so far as offering free pizza!
Google hires people to track webmaster feedback across the web. Matt Cutts frequently blogs about search and SEO because to him it is important for others to see search, SEO, and Google from his perspective. He offers free tips on Google Video in no small part because it was important for Google Video to beat out YouTube for Google to become the default video platform on the web. Once it was clear that Google lost the video battle to YouTube Google decided to buy them.
Beyond just selling their company beliefs and ideology to get people excited about their field, acquire new workers, and get others to act in a way that benefits their business model, search engines also provide APIs to make portions of their system open enough that they can leverage the free work of other smart, creative, and passionate people.
Selling search as an ecosystem goes so far that Google puts out endless betas, allowing users to become unpaid testers and advocates of their products. Even if the other search engines matched Google on relevancy they still are losing the search war due to Google’s willingness to take big risks, Google’s brand strength, and how much better Google sells search as an ecosystem.
Google wants to make content ad supported and freely accessible. On October 9, 2006, Google announced they were acquiring YouTube for $1.65 billion in stock. In March, 2007, Viacom sued Google / YouTube for $1 billion for copyright infringement. In 2007 Microsoft pushed against Google’s market position calling Google a copyright infringer (for scanning books) and doing research stating that many of Google’s blogspot hosted blogs are spam.
In 2006 and 2007 numerous social bookmarking and decentralized news sites became popular. Del.icio.us, a popular social bookmarking site, was bought out by Yahoo!. Digg.com features fresh news and other items of interest on their home page based on user votes.