Wednesday, April 19

Live Drive and GDrive, The Ferraris of cluster-based storage services on the information superhighway

Web Scale Grid Computing Platforms
Google has TeraGoogle. Microsoft has The Platform. Amazon has S3. Think grid computing à la IBM's vision of massive Linux clusters running on blade servers. Now think about the Google File System (GFS) and its ability to adapt and reconfigure itself in the face of hardware failures. It is all very reminiscent of IBM's vision of "Autonomic" and "On Demand" computing. IBM is focused on multi-million dollar business service contracts, while Google is focused on advertising-supported consumer services and Microsoft is making its next generation of applications more web-service-centric, with greater emphasis on advertising-supported business models. Perhaps IBM should take a hard look at that approach.

The aforementioned companies are all making massive capital expenditures on computing hardware and bandwidth to build extremely large computing grids from commodity hardware. IBM is doing it for business through its "On Demand" initiative; Google and Microsoft are (or will be) doing it for the consumer and, to some extent, business markets. These massive-scale distributed computing platforms will power the next generation of internet services. Google and Microsoft are each spending massively: billions of dollars on millions of servers! I am not exaggerating; it is in the financials. See the Google Analyst Day presentation.

We are seeing the Walmart-ization of distributed computing. Google, in particular, has created one of the lowest-cost (to build and maintain), highest-performing grids in the world. This is one of its real innovations. The growth of these large clusters of commodity hardware significantly lowers the cost per gigabyte of storage, to less than what consumers could pay buying disks themselves -- there are significant economies of scale at work. Software infrastructure like GFS and IBM's General Parallel File System (GPFS) is making TeraGoogle and IBM's grid computing a reality.

Free disks for everyone!
Sometime in the near future, at the end of 2006 or maybe sometime in 2007, I predict that both Google and Microsoft will release online storage services: GDrive and Live Drive, respectively. Amazon is already doing this for software developers through its S3 service, which it launched in March. IBM already has a product for large businesses.

However, MS and Google will bring network storage to the masses. Their goal: store all of their users' information on their platforms. It is not a new concept (think mainframes and thin clients), but one that is finally starting to gain traction on the internet. We are talking about hundreds of gigabytes per user, petabytes and petabytes of storage. A recent article on CNN Money, Microsoft's new brain, lays it out:
Microsoft is planning to use its server farms to offer anyone huge amounts of online storage of digital data. It even has a name for that future service: Live Drive. With Live Drive, all your information - movies, music, tax information, a high-definition videoconference you had with your grandmother, whatever - could be accessible from anywhere, on any device.

Google apparently has similar plans. An internal memo [analyst day slides] accidentally posted online in March spoke of company efforts to "store 100 percent of user data" and mentions an unannounced Net-storage system called GDrive.
Searching your data would be faster, more reliable, and more relevant if it were stored on Google's or Microsoft's high-speed grid. Storing your information online would solve many problems inherent in today's "Desktop Search." The metadata on your files (author, date of creation, date of modification, etc.) won't change or be corrupted, which is a major problem today. Personal search would no longer be limited by the hardware and software constraints of the desktop PC. The result is faster and more reliable access to your information. And just maybe it's encrypted, so there are no prying eyes elsewhere.

Because these services will have access to your information, they will be able to better target advertising to your interests, also referred to as personalization. Do you have pictures of Paris (the city, not Hilton)? Perhaps you want to plan your next vacation there -- and Google's advertising can help. Of course there are major privacy concerns, but these seem to be inherent in all targeted, contextual advertising. After all, it's contextual! In the past consumers have been willing to sacrifice privacy for free services, and I believe they will continue to do so. After all, the offer of unlimited storage space and free wi-fi access is pretty compelling!

The growth of web-scale services and different audiences

Only a very few companies have the resources necessary to build these platforms and deploy services like the aforementioned at scale. IBM does, but it seems content to build the software (WebSphere) that powers the "On Demand" economy, rent computing grids to large Fortune 500 companies, and provide integration and support via Global Services rather than create consumer-facing applications. It was the same back in the '90s when it came to search.

IBM had the technology and resources to build a Google with its WebFountain project, but the company lacked the will to create a consumer-oriented service. Instead, it turned WebFountain into a business analytics engine, which was never really profitable. Consumer services weren't profitable at the time because the business model (targeted ads) hadn't matured. Right now IBM is facing a problem with its On Demand platform similar to the one it faced in search: businesses aren't interested in renting computational power from IBM for large sums of money on long contracts. IBM doesn't seem to have the will to create consumer-facing services with advertising- and subscription-based business models. Microsoft has traditionally taken the same stance, but times are changing in Redmond with Ray Ozzie. Google's success has given them a wake-up call.

New Business Models (at least for MS)
In a recent memo, Ray Ozzie, the new Microsoft CTO and creator of Lotus Notes, lays out the key tenets of the "new" Microsoft. The first:
The power of the advertising-supported economic model.
From my (limited) experience, consumers don't like to pay for software (or services), but they seem to be willing to watch targeted, relevant advertising. Google has proven this with its success. Until recently, Microsoft has left this massive revenue stream virtually untapped. Now, instead of charging for some of its services, it will pay for more of them through advertising, with premium, advertising-free versions, of course. Perhaps someday they'll stop charging for Encarta Online! This is a new strategy for Microsoft and one that is still hotly debated internally and externally.

After all, according to the previously mentioned Fortune article, more than three times as much money is spent each year on advertising as on software, and a growing proportion of that money is being spent online. The key is accountability. Google's CPC ads are the most relevant and targeted advertising system seen to date, and its auction system sets prices based on supply and demand. For those of you who remember the late '90s, advertising has come a long way since 468x60 banner ads at $2 CPM. Google pioneered this and has been successful; now Microsoft is starting to wake up to the reality that well-targeted advertising makes money.
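
To make the CPM-versus-CPC comparison concrete, here is a rough back-of-the-envelope sketch in Python. It converts a cost-per-click price into an "effective CPM" (revenue per 1,000 impressions) so it can be set next to the $2 banner price. The cost per click and click-through rate below are made-up numbers purely for illustration, not figures from Google or the Fortune article, and the function name is my own.

```python
def effective_cpm(cost_per_click: float, click_through_rate: float) -> float:
    """Revenue per 1,000 impressions for an ad that is paid per click."""
    # Out of 1,000 impressions, click_through_rate * 1000 are clicked,
    # and each click earns cost_per_click.
    return cost_per_click * click_through_rate * 1000


# The late-90s banner price mentioned in the text.
banner_cpm = 2.00

# Hypothetical targeted text ad: $0.50 per click with a 1% click-through rate.
# These two values are assumptions chosen only to illustrate the arithmetic.
targeted_ecpm = effective_cpm(cost_per_click=0.50, click_through_rate=0.01)

print(f"Banner CPM:       ${banner_cpm:.2f}")
print(f"Targeted ad eCPM: ${targeted_ecpm:.2f}")
```

The point is only that better targeting shows up as a higher click-through rate, which raises the revenue per thousand impressions even when the advertiser pays nothing for impressions that are ignored.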

New Technology
There are two major technical forces driving this new era of technical innovation (sometimes referred to as Web 2.0): broadband and browser adoption. The biggest driver behind this shift towards web-based services is the growth of broadband. After all, what good is having unlimited storage on the network if you can't access it quickly? To this end, Google and Earthlink are partnering to make broadband truly ubiquitous (at least in large urban areas). In San Francisco, San Diego, Philadelphia, New York, and other major metropolitan areas, Google is providing the advertising and Earthlink is providing the wireless infrastructure. Meanwhile, Verizon is rolling out fiber to the home and spending billions on broadband infrastructure. Broadband isn't there yet, but it has reached a critical mass (at least in the US) of more than 100M subscribers (from the Google Analyst Day notes) and continues to grow.

Lastly, semi-modern browsers have finally become widely adopted. For all its flaws, IE6 is a semi-capable browser. IE7 promises to be much better. This adoption has led to the growth of DHTML and CSS, enabling dynamic AJAX-based applications. Both Microsoft and IBM are leading the way here with the next generation of web development tools. Microsoft recently released Atlas for Visual Studio / ASP.NET. IBM just open sourced its AJAX toolkit as part of the Eclipse ATF project. Google has the head Firefox developers; Microsoft has IE. Both teams will work hard to ensure a good "it just works" experience for consumers with their next-gen services.