Nutch for intranets

I updated the Nutch tutorial last week to document how one can use Nutch for intranet search. I’ve made a bunch of changes in the past few weeks to support intranets. It should now be a lot easier to get started with Nutch…


3 Responses to “Nutch for intranets”

  1. KFThierry Says:

    Hello Mr. Cutting,
    I am a French student learning to program. I am studying Nutch at the moment because I like your idea, and I would like to specialize in Java and Linux technologies. For me, the Nutch API is not documented enough, but the Nutch project is a very good idea. I would like to know whether it is possible to use the crawler for an intranet web site. I tried this in the urls file: and this in the conf/url-filter.txt file: +^ but got nothing. The server on the host is Java Server Web Page and works well. How can I crawl the intranet URLs, please?

  2. Anonymous Says:

    Hello Doug,

    The Nutch tutorial needs more clarity for newbies like
    us [regarding installation / development usage],
    since many developers still use Windows as their
    primary dev environment.

    Please try to provide a separate write-up of step-by-step
    installation and usage of the software for each
    O/S it is installed on.


    With regards
