Friday, April 23, 2010

Yahoo!'s Smart Investment: The Hadoop Community

More than 250 people attended a Hadoop developer event at Yahoo! this week, demonstrating again the level of interest the company has in open-source big data initiatives.



Yahoo! says it is the world's biggest Hadoop supporter. We say that's undoubtedly correct. Yahoo! supports community developer events throughout the world. In February it supported the first Hadoop event in India. In June, it will host the Hadoop Summit.





Yahoo! is not always recognized for its cloud computing efforts, but its deep commitment to Hadoop shows how the company sees big data as a way to solve major technology problems such as spam.



Hadoop, according to Wikipedia, "is a Java software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data."



The developer conference featured discussions from the Hadoop community, including a presentation about using it to fight spam and a discussion led by a lead engineer from Facebook.



Vishwanath Ramarao is director of anti-spam engineering for Yahoo! Mail. According to the Yahoo! developer blog, Vish described the intricate cat-and-mouse games played with spammers, and how Yahoo! uses Hadoop to abstract away the complexity of large scale data analysis and provide deep insight into spammer campaigns.
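To make the idea of campaign analysis a little more concrete, here is a minimal sketch of the map, shuffle and reduce steps that Hadoop-style jobs are built around, written in plain Python rather than against any Hadoop API. It is not Yahoo!'s method: the sample messages, the function names and the choice to group by subject line are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (sender_domain, subject) pairs.
MESSAGES = [
    ("example.net", "Cheap watches"),
    ("example.org", "Meeting notes"),
    ("example.net", "Cheap watches"),
    ("example.com", "Cheap watches"),
]


def map_phase(message):
    """Emit one (key, value) pair per message: normalized subject -> sender domain."""
    domain, subject = message
    yield subject.lower(), domain


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(subject, domains):
    """Summarize one group: total messages and number of distinct sender domains."""
    return subject, {"messages": len(domains), "domains": len(set(domains))}


def run_job(messages):
    """Run map, shuffle and reduce in a single process over the given messages."""
    pairs = (pair for message in messages for pair in map_phase(message))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    for subject, stats in sorted(run_job(MESSAGES).items()):
        print(subject, stats)
```

A subject line that shows up many times from several domains is the sort of pattern a real job might flag for a closer look. On a cluster, the map and reduce functions would run in parallel across many nodes, with the framework handling the grouping in between.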



Yahoo! Mail antispam - Bay area Hadoop user group

John Sichi, lead engineer for Facebook's data infrastructure team, provided an overview of Facebook's work using Hadoop to manage data that is growing 8x annually. In March 2008, traffic volume hit 200 GB per day. By the end of last year, traffic had climbed to 12 terabytes per day.



Hadoop, HBase and Hive - Bay area Hadoop User Group

Companies like Yahoo! and Facebook use Hadoop to organize data and process it from multiple sources. For instance, Facebook might use it to organize how it deploys its ad network.



Yahoo! may be on to the most powerful use for cloud computing or at least the most interesting. And it shows how the company is thinking about cloud computing and the ways it applies to its overall strategy.







http://bit.ly/9Jgbmf
