Joining Cloudera

I will be leaving Yahoo! at the end of this month to join Cloudera.

About five years ago I was working with Mike Cafarella on Apache Nutch, an open-source web-search engine. Initially we were able to crawl and index on four machines in parallel, but with a lot of manual steps. Inspired by two Google papers, we built a distributed filesystem and a MapReduce implementation that automated most of these steps. Operation became much simpler, and we were then able to easily run Nutch on twenty machines, with near-linear scaling.

But to scale to the many billions of pages on the web we’d need to be able to run it on thousands of machines. And the more we worked on it, the more I realized that making this happen would take far more developers and resources than we had.

Yahoo! proposed to fill this gap. Eric Baldeschwieler led a team of talented folks, like Owen O’Malley, Sameer Paranjpye, and Nigel Daley. Eric said he’d dedicate his team to scaling this system to be able to process the full web. So, three and a half years ago, I joined Yahoo! to help make this happen.

We exceeded my dreams. First we moved the distributed computing code out of Nutch into a new Apache project christened Hadoop. Then we set out to improve scalability, performance, and reliability, all the while adding many features. After one year Hadoop was used daily by many research groups within Yahoo!. After two years it generated Yahoo!’s web search index, achieving web scale. Now, after three years, Hadoop holds the big-data sort record and the project has become a de facto industry standard for big-data computing, used by scores of companies. The recent Hadoop Summit was attended by over 750 people from around the world.

Many folks at Yahoo! were instrumental in this story, including: Raymie Stata, Dhruba Borthakur, Arun C Murthy, Devaraj Das, Raghu Angadi, Hairong Kuang, Konstantin Shvachko, Runping Qi, Chris Douglas, Allen Wittenauer, Sharad Agarwal and Hemanth Yamijala, to name just a few. Yahoo! deserves enormous and ongoing thanks for the key role it plays in making Hadoop useful.

Now Hadoop is a thriving open-source project, with large and diverse developer and user communities. Going forward, Cloudera presents an opportunity to work with a wider range of Hadoop users. I hope to help synthesize these many voices into a project that best serves all.

Hadoop has grown to be a large, active project very quickly, but it is still a young project. At Cloudera I will be well positioned to help it mature. This move will not fundamentally change my day-to-day activities. I will continue to work on Hadoop, working closely with developers from Yahoo! and elsewhere to build great software.

61 Responses to “Joining Cloudera”

  1. Doug Cutting joins Cloudera » Cloudera Hadoop & Big Data Blog Says:

    […] — that Doug Cutting, co-founder of the Apache Hadoop project and creator of Nutch and Lucene, has agreed to join Cloudera beginning on September 1, 2009. Doug’s contributions to Hadoop over the […]

  2. Otis Gospodnetic Says:

    Congratulations Doug! :)

  3. 451 CAOS Links (caostheory) 's status on Monday, 10-Aug-09 21:06:51 UTC - Identi.ca Says:

    […] http://blog.lucene.com/2009/08/10/joining-cloudera/ […]

  4. Yoav Shapira Says:

    Congratulations, and good luck!

  5. Lots of Search News Today! | The Noisy Channel Says:

    […] bring us to the last news item: Doug Cutting is leaving Yahoo for Cloudera, where he’ll continue to work on Hadoop. According to his blog post about it, […]

  6. Why The Brain Behind Hadoop Left Yahoo | Family Learning Center Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  7. Ricardo Niederberger Cabral Says:

    Good luck and keep up the good work!

  8. Why The Brain Behind Hadoop Left Yahoo | yKvz Blog Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  9. Why The Brain Behind Hadoop Left Yahoo | koala eye Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  10. WebHosts 2009» Blog Archive » Why The Brain Behind Hadoop Left Yahoo Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  11. Why The Brain Behind Hadoop Left Yahoo - ComponentGear.com Feed - ComponentGear.com Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  12. Why The Brain Behind Hadoop Left Yahoo | Social Nibble Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  13. Why The Brain Behind Hadoop Left Yahoo | google android os blog Says:

    […] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]

  14. mariano Says:

    Congratulations Doug… excelent decision :)

  15. Doug Cutting Leaving Yahoo | Search Engine Optimization & Internet Marketing (SEO & SEM) Blog Says:

    […] his personal blog, Cutting says he’ll be doing much the same work with Cloudera that he was doing at Yahoo, and will continue […]

  16. Open Source mobile edition Says:

    […] his Hadoop project. (Picture from Facebook.) The timing was coincidental, he insisted. In a blog post he heaped fulsome praise on Yahoo and said that at Cloudera he will be "well-positioned to help it […]

  17. Sameer Says:

    Good Luck!!!!

  18. Friends on the Move: Hadoop, AOL & PayPal Says:

    […] Cutting, creator of open-source software framework Hadoop,has left Yahoo to join Cloudera, a Burlingame, Calif.-based startup that is commercializing Hadoop. The center of […]

  19. Daniel Tunkelang Says:

    Just adding my congrats to the pile!

  20. Greg Linden Says:

    Congrats, Doug!

  21. Edward J. Yoon Says:

    Congrats, Good luck.

  22. Cutting out for Cloudera just in time | WebFroster Says:

    […] timing was coincidental, he insisted. In a blog post he heaped fulsome praise on Yahoo and said that at Cloudera he will be “well-positioned to […]

  23. Avi Rappoport Says:

    Good luck with Cloudera, glad to know you’ll still be involved in Hadoop.

  24. MischMasch 2009-08-11 – portenkirchner.net Says:

    […] Doug Cutting is leaving Yahoo! and will be joining Cloudera. (Via Free Search) […]

  25. Where can I find a Ice Blue Nintendo DS Lite? | Nintendo Store Says:

    […] Joining Cloudera « Free Search […]

  26. Mininoticias de la semana (10/08 – 16/08) Says:

    […] – Doug Cutting leaves Yahoo! to go to Cloudera and continue developing […]

  27. Shared Items - August 12, 2009 « Jeetu’s Shared Memory Says:

    […] Joining Cloudera […]

  28. Emorragia continua « MarcoOnaCloud Says:

    […] I was forgetting: he has left […]

  29. Doug Cutting Leaving Yahoo! To Join Cloudera | MULTIMAP Says:

    […] search and infrastructure expert Doug Cutting is leaving the company to join Cloudera. He will be leaving Yahoo! at the end of August, 2009. Cutting created […]

  30. Doug Cutting joins Cloudera « すでにそこにある雲 Says:

    […] Joining Cloudera […]

  31. Doug Cutting Leaving Yahoo! To Join Cloudera | Traffficum.com - Traffic for your website Says:

    […] search and infrastructure expert Doug Cutting is leaving the company to join Cloudera. He will be leaving Yahoo! at the end of August, 2009. Cutting created […]

  32. mrdhana Says:

    hearty congratulations….

  33. Antonio Gulli Says:

    Doug, congratulation. Nutch was great since the very beginning and what you made after that was awesome. I cannot wait to see the rest.

  34. sithuhliang Says:

    hey Doug, you r a great guy. Now i use your frame work and learn it and make it for myanmar search engine

  35. Sulabh Says:

    Congratulations Doug, I was totally unaware of Lucene few days back untill I found this great article, simple though usefull, http://www.ezdia.com/Lucene_in_five_minutes/Content.do?id=674

  36. SEOConsulting Says:

    When I started to read this post, my first thought was that you were moving away from your creation.

    Thank you for clarifying that your move will allow you to help your “child” not only to walk, but to run.

  37. Jennifer Malapitan Says:

    Hi, Doug

    I am a research analyst based in Dubai, United Arab Emirates. I am currently researching about open source search engine in Arabic. Would greatly appreciate if you are able to provide me with your contact details through the e-mail address provided, since I am unable to get through Cloudera’s telephone number.

    Thanks and kindest regards.

  38. Hadoop and Cloudera: Open Source for the Cloud Says:

    […] recent additions to the Cloudera team is Doug Cutting, a search engine specialist from Yahoo and one of the founders of the Hadoop project. This is a big loss for Yahoo and a huge gain for […]

  39. parfume Says:

    Congratulations Doug – Good Luck

  40. Martha Says:

    How would you like to be remembered?

  41. Igre Says:

    congrats Doug! good luck at Cloudera.

  42. Ivan Says:

    Congratulations Doug!

    We wish you all the best!

  43. 好建议 Says:

    Good luck!

    It’s pitty that your leave Yahoo!

  44. peeyushchandel Says:

    i want to connect my mysql database to nutch so after crawling nutch will store all data in my mysql database please guide me how can i do this? i am very thankfull to you

  45. Shaina Siegrist Says:

    Hey There. I found your blog using msn. This is an extremely well written article. I’ll be sure to bookmark it and come back to read more of your useful information. Thanks for the post. I’ll definitely return.

  46. Hyde Park Hotels Says:

    It is the best time to make some plans for the future and it is time to be happy. I have read this post and if I could I want to suggest you some interesting things or suggestions. Perhaps you can write next articles referring to this article. I desire to read more things about it!

  47. Lesezeichen speichern Says:

    Lesezeichen speichern…

    […]Joining Cloudera « Free Search[…]…

  48. Great, Check This Out, Yahoo Web Search Says:

    you have to check this out Yahoo Web Search…

    […]Joining Cloudera « Free Search[…]…

  49. Asi serija Says:

    congratulations!! it was good to read this!

  50. Nepobedivo Srce Says:

    nice site, thanks for sharing the news!

  51. Miris Proleca Says:

    wow, very nice! thanks for sharing this information here!

  52. Folk Serija Says:

    good article! I found it while searching for something else..

  53. Sulejman Says:

    wooow, really nice! keep working nice things!

  54. Billig Parfume Says:

    Good luck at Cloudera.

  55. Ljubav i Osveta Says:

    well, that is something! thanks for sharing this here…

  56. Sever Jug Says:

    This is worth of reading! Thanks for share!

  57. Razeel Mohammed Says:

    Good work!!!…. Congratulations Doug!!!

  58. Ljubav i kazna Says:

    thanks thanks for sharing this here!

  59. cost of cosmetic surgery in dominican republic Says:

    Howdy! I just would like to give an enormous thumbs up for the good info you have got right here on this post.
    I might be coming again to your blog for extra soon.

  60. coaching Says:

    It is like you learn my head! You appear to fully grasp much about it, just like you had written a manual in it as well. I believe that you may apply a handful of per-cent to demand your message home a tad, but instead of that, it is outstanding web site. An amazing read. Let me definitely be back.

  61. yao.meng Says:

    hi,
    I am a programmer in the big data industry. I am writing a book on big data recently. But I am confused by the origin of Lucene, nutch and Hadoop. The specific incubation time of Lucene and nutch in Apache, the specific graduation time, and some stories about Lucene and nutch in the incubation period. I have found a lot of materials from Google, most of which have been taken in one sentence without specific description. I hope I can get your help. Thank you very much.
