Nutch jobs available

Would you like to be paid to work full time on Nutch?

I know of a few companies that are gearing up to build full-scale, commercial Nutch-based web search engines. They’re not ready to make public announcements about this, but if you think you might like to work on this, please send your resumé to jobs @, and I’ll forward it on. If I know you from Nutch or Lucene contributions, then I’ll add my recommendation.


7 Responses to “Nutch jobs available”

  1. Nata1DotCom Says:

    Wow you guys are behind Lucene? We work a little bit with the .NET Lucene
    team and are creating interfaces to leverage our technologies.

    I have a very interesting challenge for your team. Nata1 Unified is trying
    to find worthy opponents for our search trials.

    Task – given a 4-week development window, 3 days of indexing time, and 3
    equal computers for each team, in different geographic areas, pass the
    trials, which would be given by a panel of judges – i.e. depth (did we index
    the valuable pages?), free text, and speed.

    I’m 90% sure I can get Microsoft money to help sponsor. I KNOW that we can
    raise a lot of awareness for both our platforms, and could create an
    incredible competition, not just ONE trial, but many trials over time, not
    just with Nutch, but with others, and hopefully others using Nata1 Unified.

    So should we let Microsoft and Google run the entire search business, or
    should we do something to instigate the greatest open source search
    initiative that ever existed?

    Let me know if you’re interested :-)… Don’t worry, I’ll get the money!


  2. Anonymous Says:

    Nutch uses Lucene. So why bother?
    You’d better test against
    But aren’t they using Lucene too?

  3. Anonymous Says:


    I was looking on the net for help on resolving problems with read/write concurrency on the Lucene index and lock files and came across this site. I’ve solved it now, and I’d just like to say how impressive and easy to use the Lucene engine is.

    A website I am helping to develop ( or uses Lucene as the search engine for the backbone news service.

    I notice you are involved in the web crawling part of search engine development, and I have also built a web crawler which uses multiple virtual machines (I’ve tried using one VM for the whole program, but it can’t cope). The crawler is currently specific to football (I think you yanks call it soccer(!)) news, which is retrieved using regular expressions, dropped into the database and indexed in Lucene. How much have the people at Lucene looked at web crawling?

    One of the problems I’m having is removing documents from the index – how do you get the original id for a document once you have added it to the index? Also, how do you run the engine entirely in memory like Google do?

    Keep up the good work!

  4. Doug Cutting Says:

    If you have Lucene questions, please send them to the Lucene Users mailing list,


  5. Anonymous Says:


    We are using Lucene as the core component of our search infrastructure for a commercial project and, at least we think so ;-), have made some significant improvements to it, which I can maybe get open-sourced as soon as the project is released (so that we have time to sort out any possibly “secret” code that was integrated, patents and so on).
    _If_ I can get this to work, I’d really be interested in working further on Lucene, specifically on our new code, but I’ll have to make a living, right :-) So I wonder, are any jobs in this area still vacant?



  6. Doug Cutting Says:

    Some Lucene jobs can be found (using Lucene) at:

  7. Liberta-Togo Says:

    Ok, I actually have several questions.

    1. Firstly, my friends and I are working on another search-related project. My question here is: how do you go about getting sponsorship, and whom should we approach? Which companies do this? We would be interested in money or hardware.

