> Over the past two weeks, we conducted 200 queries across the three major Search engines — Google, Yahoo! and Bing... After conducting the same query across all three Search sites, we picked a winner based on: 1) relevancy of the organic search results; and 2) robustness of the search experience, which included factors such as image and video inclusion, Search Assist, and Site Breakout... According to the charts, Google returned the most relevant result 71 percent of the time, compared with Bing at 49 percent of the time and Yahoo 30 percent of the time.
Until I get more details on the study, which I wasn't able to find anywhere, I'm highly skeptical of the findings: the methodology appears seriously flawed.
Here are some reasons:
- Poor metrics. "Most relevant result" is not a good (or standard) metric, and "robustness of the search experience" is not clearly defined.
He should have used standard metrics that people understand, such as precision@1 or precision@3. Alternatively, he could have conducted a user study and measured how long it took participants to accomplish a standard task.
- "Relevance" is not defined. Is it binary, graded, or something else? See the Google rater guidelines (summary).
- Annotator agreement. How many people rated each query? Did they agree with one another? Did they look at the full result pages or just the snippets? These are important questions.
- Query selection. Sampling only popular queries is heavily biased. A more realistic, random sample should be drawn, for example from a company like Compete.
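To make the metrics point concrete, here is a minimal sketch of precision@k, one of the standard measures the study could have reported. It assumes binary relevance judgments (1 = relevant, 0 = not); the judgment values below are illustrative, not taken from the study.

```python
def precision_at_k(judgments, k):
    """Fraction of the top-k ranked results judged relevant.

    judgments: list of 0/1 relevance labels, in rank order.
    """
    top_k = judgments[:k]
    return sum(top_k) / k

# Hypothetical judgments for one query's top five results.
judged = [1, 0, 1, 1, 0]
print(precision_at_k(judged, 1))  # 1.0  (top result was relevant)
print(precision_at_k(judged, 3))  # 0.6666...  (2 of the top 3 relevant)
```

Averaging precision@1 over all 200 queries would give a number directly comparable to the study's "most relevant result 71 percent of the time" claim, but with a precise, reproducible definition.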
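And to make the annotator-agreement point concrete, a common way to check whether raters agree beyond chance is Cohen's kappa. This is a generic sketch for two raters with binary relevance labels; the rater data is made up for illustration.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters' label lists of equal length.

    Returns 1.0 for perfect agreement, 0.0 for chance-level agreement.
    (Undefined when expected chance agreement is 1, i.e. both raters
    always give the same single label.)
    """
    n = len(a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Expected chance agreement from each rater's label frequencies.
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical relevance labels from two raters over eight results.
rater1 = [1, 1, 0, 1, 0, 0, 1, 0]
rater2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(cohens_kappa(rater1, rater2))  # 0.5 (moderate agreement)
```

A study that reports only a winner per query, with no kappa (or similar) statistic, gives us no way to know whether the "relevance" judgments are reliable or just one person's opinion.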