Friday, August 7

Yahoo! is Becoming a Newspaper

The NY Times has an interview with Carol Bartz. One highlight is that she claims that Y! was never a search engine:
“We have never been a search company,” she said. “It is: ‘I am on Yahoo. I am going to do a search.’ ”
Danny Sullivan takes issue with her interpretation of history.

As for the future, it sounds like Y! will become an online news and media organization:
“My fortunes are tied to my pages,” Ms. Bartz said... Its Sports section, for example, has reporters producing top-notch original material ranging from scoopy news items and blogs to long-form analysis pieces.
Y! has had a long internal battle over whether it is a media company or a software company. By divesting Y! of its biggest software systems, Ms. Bartz seems to be making the direction clear. Y!'s future will be in media, not software technology. In short, it means getting rid of developers and server farms and hiring reporters in their place. The direction is ironic because it's a throwback to Y!'s past as a manually created directory, which it abandoned in favor of technology-driven search algorithms.

Considering the recent fortunes of online media companies and news organizations, it seems like the wrong direction. But maybe she can pick up all those unemployed journalists on the cheap and run a better organization. Better yet, maybe she can get users involved in creating the content for her with more tools like Y! Answers.

These are sad words to hear, and I believe this is the wrong choice for the future of the company. However, as a developer and researcher, I'm only slightly biased. Just a little bit.

Netflix Prize Sequel: A Sprint instead of a marathon

Neil Hunt, Chief Product Officer at Netflix, announced in the forum that they will be launching a sequel to their very successful recommender competition.
The advances spurred by the Netflix Prize have so impressed us that we’re planning Netflix Prize 2, a new big money contest with some new twists.

Here’s one: three years was a long time to compete in Prize 1, so the next contest will be a shorter time limited race, with grand prizes for the best results at 6 and 18 months.
Stay tuned for more news in September after they formally announce the winner of round one.

Wednesday, August 5

News of the day: Eclipse AppEngine Plugin, New Chrome Beta, Lucene Payloads, and MS-Y! SEC Filing

Google released a new Google Plugin for Eclipse 3.5. It provides support for building and deploying applications on App Engine in Eclipse.

Grant has a post introducing Lucene payloads. This is the primary way to get token/term-level metadata into a Lucene index. See also Michael Busch's slides from the SF Lucene/Solr meetup. If you are a Lucene power user, you should get into it.
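To make the idea of token-level metadata concrete, here is a toy sketch in Python. It is not Lucene's API: the function names (`build_index`, `score`) and the choice of a float weight as the payload are assumptions made purely for illustration. It only shows the general shape of the idea, where each occurrence of a term in the index carries its own small piece of metadata that can be used at scoring time.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> list of (doc_id, position, payload).

    Each document is a list of (token, payload) pairs. The payload is
    arbitrary per-occurrence metadata; here it is a hypothetical float weight.
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for position, (token, payload) in enumerate(tokens):
            index[token.lower()].append((doc_id, position, payload))
    return index


def score(index, term):
    """Sum the payload weights of a term's occurrences, per document."""
    totals = defaultdict(float)
    for doc_id, _position, payload in index.get(term.lower(), []):
        totals[doc_id] += payload
    return dict(totals)


# Hypothetical data: the first "Lucene" in doc 1 gets a higher weight,
# as if it had appeared in a title.
docs = {
    1: [("Lucene", 3.0), ("payloads", 1.0), ("intro", 1.0)],
    2: [("lucene", 1.0), ("search", 1.0), ("Lucene", 1.0)],
}
index = build_index(docs)
print(score(index, "lucene"))
```

In this sketch doc 1 outscores doc 2 for "lucene" even though the term occurs there only once, because that single occurrence carries a larger payload.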

SeLand has coverage of the SEC documents on the MS-Y! search deal that go into more detail, if you care. One tidbit from the documents is how the merge will take place:
Microsoft will hire not less than 400 Yahoo! employees (the “Transferred Employees”) and will offer the Transferred Employees market competitive compensation packages. In addition, Yahoo! and Microsoft will mutually agree on a retention plan to be paid for by Microsoft to assist in retaining the Transferred Employees and an additional 150 Yahoo! employees to be mutually agreed upon between Microsoft and Yahoo! to assist with providing the transition services.
Google released a new Chrome beta for Windows. I'm still waiting for the Linux version.

Tuesday, August 4

Quick Links of the day: Tuesday

  • Daniel has been consistently rolling out articles covering the SIGIR industry track. Here are a few highlights:
The slides from Nick Craswell's SIGIR Industry Day talk on Query Modeling at Bing are available.

Vanja Josifovski gave a talk on computational advertising, presenting "Ad Retrieval – A New Frontier of Information Retrieval". The comments on Daniel's post are worth reading! It's clear that ad retrieval research is still controversial.

Evan Sandhaus from the NY Times R&D Labs also presented, and a lot of what he covered is available in previous slides. He highlighted the NY Times Annotated Corpus, which you should take a look at: "The New York Times Annotated Corpus is a collection of over 1.8 million articles annotated with rich metadata published by The New York Times between January 1, 1987 and July 19, 2007."

Monday, August 3

Quick Links For the Day

I'm catching up from being away last week on vacation. Here are my quick links to read up on more:

Are Academic Conferences Broken? and Why Go to Conferences?
Time for Computer Science to Grow up?
A lot of discussion about Computer Science conferences and the future.

On a somewhat related note: Daniel Lemire blogs about why he doesn't blog his ongoing research.

An iPhone app development course at Stanford is available online. Google also announced App Inventor for teaching mobile app development. It's a drag-and-drop GUI for Android apps.

Future of search podcast, featuring Anand Rajaraman, the founder of Kosmix.

BellKor's Pragmatic Chaos, the winning team on the Netflix prize, has a portal page for coverage of the news.

Evri announced a JavaScript API.

Cloudera has an intensive app development tutorial for Hadoop.

The future of Hadoop: Don't panic, yet

The recent MS-Y! deal has a lot of people scared about what it means for Y!'s support of Hadoop. Y! search uses Hadoop to create its "WebMap" and is the largest Hadoop customer. See my previous coverage on the State of Hadoop talk at the Hadoop summit, where the search application was one of the primary featured applications. In fact, the cost of running and continually expanding the search clusters was likely a factor in Carol's decision to stop investing in search and to reallocate resources elsewhere.

Given the importance of Hadoop as an infrastructure tool for search, there is a lot of uncertainty about the future. For example, The Register wrote an article titled: Microsoft pact holds gun to stuffed elephant. To counter the uncertainty and fear, Eric has a post telling people not to panic and affirming that Yahoo! is still very committed to using Hadoop for infrastructure.

Despite this reassurance, Hadoop is losing a big customer driving requirements and changes that make it a better platform for building search applications, unless a miracle happens and some variant is adopted by Microsoft. The loss may not have short-term impact, but will change the long-term direction of the project as it focuses on being relevant to other teams and problems that are aligned with Y!'s new goals and strategies.

New York Times article on Microsoft-Yahoo Search Deal

In case you missed it, the NY Times ran an article yesterday on the Microsoft-Yahoo! search deal.

The decision Ms. Bartz had to make was to go big or give up; she decided to give up:
Ms. Bartz said she sold the search business because Yahoo could no longer continue to match the level of investment Google and Microsoft were making in searching...
There's also an interesting quote on how she sees the goals for the company:
"My first reaction when I got here was that I wouldn’t even do a search deal... until I looked at our expense structure and our actual options and looked at what our prime job was, which is to grow audience."
I read this as saying that search is taking a disproportionate share of resources to run and maintain without being a leader in the market. She would rather spend the money on creating top-notch properties: developing content/media for consumers.

The article briefly mentions the potential impact on jobs at Y!, saying that 400 or more jobs could be hit by layoffs.