Nutch for intranets

I updated the Nutch tutorial last week to document how one can use Nutch for intranet search. I’ve made a bunch of changes in the past few weeks to support intranets. It should now be a lot easier to get started with Nutch…

3 Responses to “Nutch for intranets”

  1. KFThierry Says:

    Hello Mr. Cutting,
    I am a French student learning to program. I am studying Nutch at the moment because I like your idea, and I would like to specialize in Java and Linux technologies. For me, the Nutch API is not documented enough, but the Nutch project is a very good idea. I would like to know whether it is possible to use the crawler for an intranet web site. I tried putting this in the urls file: http://192.168.0.1/ and this in the conf/url-filter.txt file: +^http://192.168.0.1/ but nothing happens. The server on the 192.168.0.1 host is Java Server Web Page and works well. How can I crawl intranet URLs, please?
    Thanks.

  2. Anonymous Says:

    Hello Doug,
    Apologies…

    The Nutch tutorial needs more clarity for newbies like
    us [regarding installation / development usage],
    since many developers still use Windows as their
    primary development environment.

    Please try to provide a separate write-up of step-by-step
    installation / usage of the software for each
    operating system it is installed on.

    Thx

    with regards
    Karthik.n.s
    09/06/2004
