I will be leaving Yahoo! at the end of this month to join Cloudera.
About five years ago I was working with Mike Cafarella on Apache Nutch, an open-source web-search engine. Initially we were able to crawl and index on four machines in parallel, but with a lot of manual steps. Inspired by two Google papers, we built a distributed filesystem and a MapReduce implementation that automated most of these steps. Operation became much simpler, and we were then able to easily run Nutch on twenty machines, with near-linear scaling.
But to scale to the many billions of pages on the web we’d need to be able to run it on thousands of machines. And the more we worked on it, the more I realized that making this happen would take far more developers and resources than we had.
Yahoo! proposed to fill this gap. Eric Baldeschwieler led a team of talented folks, like Owen O’Malley, Sameer Paranjpye, and Nigel Daley. Eric said he’d dedicate his team to scaling this system to be able to process the full web. So, three and a half years ago, I joined Yahoo! to help make this happen.
We exceeded my dreams. First we moved the distributed computing code out of Nutch into a new Apache project christened Hadoop. Then we set out to improve scalability, performance, and reliability, all the while adding many features. After one year Hadoop was used daily by many research groups within Yahoo!. After two years it generated Yahoo!’s web search index, achieving web scale. Now, after three years, Hadoop holds the big-data sort record and the project has become a de facto industry standard for big-data computing, used by scores of companies. The recent Hadoop Summit was attended by over 750 people from around the world.
Many folks at Yahoo! were instrumental in this story, including: Raymie Stata, Dhruba Borthakur, Arun C Murthy, Devaraj Das, Raghu Angadi, Hairong Kuang, Konstantin Shvachko, Runping Qi, Chris Douglas, Allen Wittenauer, Sharad Agarwal and Hemanth Yamijala, to name just a few. Yahoo! deserves enormous and ongoing thanks for the key role it plays in making Hadoop useful.
Now Hadoop is a thriving open-source project, with large and diverse developer and user communities. Going forward, Cloudera presents an opportunity to work with a wider range of Hadoop users. I hope to help synthesize these many voices into a project that best serves all.
Hadoop has grown into a large, active project very quickly, but it is still a young project. At Cloudera I will be well positioned to help it mature. This move will not fundamentally change my day-to-day activities. I will continue to work on Hadoop, working closely with developers from Yahoo! and elsewhere to build great software.
August 10, 2009 at 12:08 pm |
[…] — that Doug Cutting, co-founder of the Apache Hadoop project and creator of Nutch and Lucene, has agreed to join Cloudera beginning on September 1, 2009. Doug’s contributions to Hadoop over the […]
August 10, 2009 at 12:40 pm |
Congratulations Doug! :)
August 10, 2009 at 1:07 pm |
[…] http://blog.lucene.com/2009/08/10/joining-cloudera/ […]
August 10, 2009 at 2:29 pm |
Congratulations, and good luck!
August 10, 2009 at 5:50 pm |
[…] bring us to the last news item: Doug Cutting is leaving Yahoo for Cloudera, where he’ll continue to work on Hadoop. According to his blog post about it, […]
August 10, 2009 at 6:27 pm |
[…] his blog post explaining the move, Cutting specifically states that he joined Yahoo in order to get the resources […]
August 10, 2009 at 6:32 pm |
Good luck and keep up the good work!
August 10, 2009 at 9:55 pm |
Congratulations Doug… excellent decision :)
August 10, 2009 at 11:01 pm |
[…] his personal blog, Cutting says he’ll be doing much the same work with Cloudera that he was doing at Yahoo, and will continue […]
August 11, 2009 at 4:10 am |
[…] his Hadoop project. (Picture from Facebook.) The timing was coincidental, he insisted. In a blog post he heaped fulsome praise on Yahoo and said that at Cloudera he will be "well-positioned to help it […]
August 11, 2009 at 4:46 am |
Good Luck!!!!
August 11, 2009 at 5:49 am |
[…] Cutting, creator of open-source software framework Hadoop, has left Yahoo to join Cloudera, a Burlingame, Calif.-based startup that is commercializing Hadoop. The center of […]
August 11, 2009 at 12:19 pm |
Just adding my congrats to the pile!
August 11, 2009 at 7:19 pm |
Congrats, Doug!
August 11, 2009 at 8:44 pm |
Congrats, Good luck.
August 11, 2009 at 11:04 pm |
[…] timing was coincidental, he insisted. In a blog post he heaped fulsome praise on Yahoo and said that at Cloudera he will be “well-positioned to […]
August 12, 2009 at 11:53 am |
Good luck with Cloudera, glad to know you’ll still be involved in Hadoop.
August 12, 2009 at 2:22 pm |
[…] Doug Cutting is leaving Yahoo! and will be joining Cloudera. (Via Free Search) […]
August 12, 2009 at 6:48 pm |
[…] Joining Cloudera « Free Search […]
August 12, 2009 at 11:44 pm |
[…] – Doug Cutting leaves Yahoo! to go to Cloudera and continue developing […]
August 13, 2009 at 12:02 am |
[…] Joining Cloudera […]
August 13, 2009 at 11:03 pm |
[…] I was forgetting: he has left […]
August 14, 2009 at 9:14 pm |
[…] search and infrastructure expert Doug Cutting is leaving the company to join Cloudera. He will be leaving Yahoo! at the end of August, 2009. Cutting created […]
August 18, 2009 at 6:47 am |
hearty congratulations….
August 21, 2009 at 11:19 am |
Doug, congratulations. Nutch was great from the very beginning, and what you made after that was awesome. I cannot wait to see the rest.
September 14, 2009 at 10:46 pm |
Hey Doug, you are a great guy. I am now using your framework, learning it, and adapting it for a Myanmar search engine.
September 16, 2009 at 2:46 am |
Congratulations Doug. I was totally unaware of Lucene until a few days ago, when I found this great article, simple but useful: http://www.ezdia.com/Lucene_in_five_minutes/Content.do?id=674
September 20, 2009 at 3:33 pm |
When I started to read this post, my first thought was that you were moving away from your creation.
Thank you for clarifying that your move will allow you to help your “child” not only to walk, but to run.
October 6, 2009 at 8:31 am |
Hi, Doug
I am a research analyst based in Dubai, United Arab Emirates. I am currently researching open-source search engines for Arabic. I would greatly appreciate it if you could send me your contact details at the e-mail address provided, since I am unable to get through on Cloudera’s telephone number.
Thanks and kindest regards.
October 12, 2009 at 8:42 am |
[…] recent additions to the Cloudera team is Doug Cutting, a search engine specialist from Yahoo and one of the founders of the Hadoop project. This is a big loss for Yahoo and a huge gain for […]
October 15, 2009 at 8:56 pm |
Congratulations Doug – Good Luck
April 3, 2010 at 4:51 pm |
How would you like to be remembered?
October 13, 2010 at 4:05 am |
congrats Doug! good luck at Cloudera.
October 20, 2010 at 8:54 am |
Congratulations Doug!
We wish you all the best!
November 30, 2010 at 12:04 am |
Good luck!
It’s a pity that you are leaving Yahoo!
January 12, 2011 at 10:43 am |
I want to connect my MySQL database to Nutch so that, after crawling, Nutch stores all of its data in my MySQL database. Please guide me on how I can do this. I am very thankful to you.
February 28, 2011 at 1:26 am |
Hey There. I found your blog using msn. This is an extremely well written article. I’ll be sure to bookmark it and come back to read more of your useful information. Thanks for the post. I’ll definitely return.
May 19, 2011 at 8:24 pm |
It is the best time to make some plans for the future and it is time to be happy. I have read this post and if I could I want to suggest you some interesting things or suggestions. Perhaps you can write next articles referring to this article. I desire to read more things about it!
December 11, 2011 at 4:37 am |
congratulations!! it was good to read this!
January 7, 2012 at 1:02 pm |
nice site, thanks for sharing the news!
January 15, 2012 at 9:18 am |
wow, very nice! thanks for sharing this information here!
January 18, 2012 at 1:17 am |
good article! I found it while searching for something else..
February 9, 2012 at 5:13 am |
wooow, really nice! keep working nice things!
April 9, 2012 at 2:39 am |
Good luck at Cloudera.
April 11, 2012 at 11:56 am |
well, that is something! thanks for sharing this here…
October 8, 2012 at 4:28 am |
This is worth reading! Thanks for sharing!
February 25, 2013 at 10:45 pm |
Good work!!!…. Congratulations Doug!!!
April 4, 2013 at 5:00 am |
thanks for sharing this here!
June 4, 2013 at 1:44 am |
Howdy! I just would like to give an enormous thumbs up for the good info you have got right here on this post.
I might be coming again to your blog for extra soon.
November 27, 2013 at 8:19 pm |
It is like you learn my head! You appear to fully grasp much about it, just like you had written a manual in it as well. I believe that you may apply a handful of per-cent to demand your message home a tad, but instead of that, it is outstanding web site. An amazing read. Let me definitely be back.
September 20, 2020 at 6:27 pm |
hi,
I am a programmer in the big data industry, and I am currently writing a book on big data. But I am confused about the origins of Lucene, Nutch and Hadoop: specifically, when Lucene and Nutch entered incubation at Apache, when they graduated, and some stories about Lucene and Nutch during the incubation period. I have found a lot of material through Google, but most of it covers this in a single sentence without a specific description. I hope I can get your help. Thank you very much.