From: David Sugar
Subject: [hangout] A much more complete analysis of the msn search engine...
Date: Mon, 17 Nov 2003 10:35:16 -0500
Reply-To: David Sugar
List: New Yorker GNU Linux Scene
Admin: To unsubscribe send unsubscribe name-at-domain.com in the body to hangout-request-at-www2.mrbrklyn.com
This article appeared in groklaw this morning:
It provides a much more detailed and complete analysis of what is going on
with the msn search engine, and perhaps how it might relate to Microsoft's
interest in google. Yes, it seems there is a very visible and deliberate
hand manipulating msn search results, although in a different way than was
"A Search Engine Mystery Solved"
Monday, November 17 2003 @ 08:17 AM EST
Slashdot has a followup on the story we first broke about peculiar results on
MSN compared with Google. I read through all the comments carefully, and
picking through the trolls and the shills, I found that some think the
phenomenon may be the result of MSN having first paid-for listings, followed
by all the rest, the real results. However, it turns out there are no real
results. You can't escape paid results anywhere on MSN. Moreover, the results
are not produced by computer algorithm alone; human editors are involved in filtering
the results you get. Here is what I found out and how.
First, I noted that one person on Slashdot reported running a search for
"George W. Bush" and got some odd results:
"Try George W. Bush. I just did. It'll say something on the order of 301 hits.
Scroll through the pages to the last hit. Suddenly, the number jumps through
the roof. I find it hard to believe that the first 300 hits are all sponsored
links. I think something else is going on here: MSN has not only sponsored
links, but some kind of edited directory scheme going here, and it doesn't
care to let you know that the first number it quoted is of those links which
are sponsored or added editorial, and the second number is a raw search
I did the same search and I agreed it couldn't be all paid results, unless the
White House pays MSN for putting an official bio of the First Lady high on
MSN's list. I don't think they'd use our tax dollars to do that. But sure
enough, when you scroll through the 305 results they first promised to show
you, if you continue by clicking "Next" again, it jumps to 1,153,228 results,
with no explanation on how 305 just became a million plus.
I next did a search on MSN for "search engines" (with and without the
quotation marks) and for "search", and invariably MSN comes up first. I never found
Google at all. I stopped looking for it after the 300th result for search
engines and 200 for search. Google was simply not findable in any reasonable
way on MSN. Maybe you can find it, but I tried twice and I couldn't.
They listed things like "Internet Public Library" and "Recipes search engines"
and "Nerdworld", "Looksmart", "LinkMe.com", "Napster", and "Korean Search
Engines Dot Com", and even a dead link to an old 1996 CNET article called
"Can You Trust Your Search Engine?" -- but no Google. What possible algorithm
could make that happen, without human intervention? There has to be something
wrong when you can't find Google in a search for "search engines". They want
to buy it, but they don't want you to find it?
I then went to Google, and I ran the same searches. You can find MSN on Google
just fine, on page 2. It doesn't list itself first, either. It is number
4, with Yahoo and Alta Vista ahead of it when you do a search for "search."
When you search for "search engines" you get helpful things like Search
Engine Watch, number one on the list. By now, I'm thinking maybe MSN just
isn't a good search engine if you are looking for actual information, as
opposed to what MSN will let you find. Was it true, though, that after the
first few hundred paid search results, you could reach the rest of the
Rather than assume, why not ask Microsoft, I thought? Surely they know how
they built their search engine, no?
If you go here and then click on "About MSN Search results" on the top of the
list, you get to their page describing the results you may get on an MSN
"Depending on what you search for, the following categories of results may
"Web Directory Sites
"Broaden Your Search
"If any of the above categories don't appear in your results, it simply means
that no Web sites in that category were relevant to your search."
Each item is clickable to more information explaining what it means. Microsoft
has it set up so that you can't link there directly, natch, but if you follow
the links, here is what you will find for these items:
First, Popular Topics:
"Popular Topics results help you refine your search by suggesting related
topics. Clicking one will start a new search and display a new results page.
The most relevant or popular topics will be displayed first.
"Popular Topics results appear at the top of the first search results page,
but won't appear for all searches."
Next, Featured Sites:
"Featured Sites are links that MSN Search editors believe are likely to be
particularly relevant and useful. These sites are chosen from ones published
by MSN affiliates, partners, sponsors, and advertisers, as well as other
sites proven to be especially popular among our users. Featured Sites that
best match your search words are drawn from:
"The top sites for news in entertainment, sports, business, and politics.
"The most popular musical artist sites for biographies and song samples.
"MSN Encarta for encyclopedia information.
"MSN content partners.
"MSN advertising partners. (Microsoft accepts payment for listings from
Next, Sponsored Sites:
"Sponsored Sites are paid links provided to MSN Search and other Web search
engines by a third party. The third party ranks the sites based upon bids
received from advertisers, as well as their relevance to search words and
"To highlight their special nature, MSN Search labels sponsored sites as
"Sponsored Sites that best match your search words appear:
"Only when you perform a basic search.
"On the first page of results.
"On subsequent result pages if additional Sponsored Sites are available.
"When your search words are terms that Web sites have bid on."
Next, Web Directory Results:
"Web Directory results contain Web sites within the MSN Web Directory that
best match your search words.
"Within Web Directory results, there may also be links where the Web site
owners have paid for the expedited review of their site or for clicks to
their site. These sites are ranked using the normal algorithm applied to all
links within each section, with no change in rank due to payment."
Next, Web Pages:
"Web Page results include all other Internet-wide Web sites that best match
your search words.
"Within Web Page results, there may be links where the Web site owners have
paid for either expedited review of their site or paid for clicks to their
site. These sites are ranked using the normal algorithm applied to all links
within each section, with no change in rank due to payment."
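
The category descriptions quoted above can be summarized as a small table. What follows is only a sketch in Python, built from the quoted text and not from anything MSN has published as code. The class name, field names, and function name are hypothetical, and the order simply follows the order of the descriptions. Popular Topics is marked as not listing sites because it only suggests related searches, and as not involving payment because its description does not mention any.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultCategory:
    """One category of MSN Search results, as described in the quoted help text."""
    name: str
    lists_sites: bool       # False when the category only suggests related searches
    may_include_paid: bool  # True when the quoted description mentions payment


# Flags are read off the quoted descriptions; they are an interpretation,
# not data taken from MSN itself.
CATEGORIES = [
    ResultCategory("Popular Topics", lists_sites=False, may_include_paid=False),
    ResultCategory("Featured Sites", lists_sites=True, may_include_paid=True),
    ResultCategory("Sponsored Sites", lists_sites=True, may_include_paid=True),
    ResultCategory("Web Directory Sites", lists_sites=True, may_include_paid=True),
    ResultCategory("Web Pages", lists_sites=True, may_include_paid=True),
]


def payment_free_site_categories(categories):
    """Return names of categories that list sites and never mention payment."""
    return [c.name for c in categories
            if c.lists_sites and not c.may_include_paid]


print(payment_free_site_categories(CATEGORIES))  # prints []
```

Under that reading, the list of site-listing categories that are free of payment comes back empty.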
So it seems there is no way to escape paid-for results on MSN, no matter how
hard you try. That isn't the most alarming part. The scariest item on the entire
list to me is the Featured Sites explanation, about the "links that MSN
Search editors believe are likely to be particularly relevant and useful".
There's the human intervention. Now you're talking scary. This is, after
Useful to whom? To Microsoft or to me? If I run a search for "search" I
probably do want to know about Google.
If I am looking for info on GNU/Linux, I probably don't want MS editors
deciding for me what is most useful. And if I am looking for facts about the
government or whatever, I especially don't want humans with an agenda, any
agenda, filtering for me. I have a brain that I trust to do that filtering.
It's one thing for a company to want to control a market; it's another when it
tries to control what you know.
So what, you may say? Just don't use it, if you don't want paid results.
Trust me, I don't and I won't. This was strictly in the line of duty. I will
never use it again, and I will explain to everyone I know what I found out
about MSN Search. But if they bought Google? Then what?
So, now you know what Microsoft thinks a search engine should be: just another
way to use customers to get a competitive advantage. They have no concept of
the public interest, I discern, from the design of their search engine. It's
all about Microsoft and their friends. That same blind spot is likely what
keeps them from understanding the value of the GPL and the freedoms it
And that, ladies and gentlemen, is why I use Google instead of MSN and will
throw up if Microsoft buys Google. Then I will stop using Google. After that,
some genius or other will just write another search algorithm and I'll use
that search engine instead. I hope you are working on it now, actually,
whoever you are. Release it under the GPL, will you, so Microsoft et al can't
buy it and ruin it? That's the thing about freedom. Humans just can't stop
wanting it. We're wired that way.
If Microsoft were not a monopoly, and if they didn't have MSN set as the
default search engine, maybe n