I am presently giving a lecture on Lucene at the University of Pisa. At the end of the presentation I’ll search for this blog entry, hoping to find it on Technorati, which uses Lucene, thus demonstrating live incremental indexing.
Really appreciate the hard and thoughtful work you’ve put into Lucene. We’re evaluating it right now for a sizeable project (a searchable index of 870,000 HTML documents). We’ll probably have to draw upon the expertise of the Lucene gurus out there to make it work on our Xserve.
Do you have anyone admitting to installing Lucene on the latest Tiger Server release? Might turn out helpful… may make no difference at all… time will tell us.
I am reimplementing it in our project. The previous developer ‘extracted’ the source code and modified it (badly), and it doesn’t work as well. We have 60,000+ documents, and Lucene is very, very fast. One issue we are having: when indexing around 60,000 items, we occasionally have to rebuild the whole index (a ‘fullreIndexing’). If we stop in the middle, it corrupts the index and we must start all over from scratch. Has anyone else had that issue? Is there any workaround? Also, on caching: I use the open source project EHcache (http://ehcache.sourceforge.net/ehcache-constructs/) with Hibernate/Spring, and it’s very good. Ever thought about integrating this cache into Lucene? Having an expiry policy on an index would be great; reindexing wouldn’t be needed (unless the data gets dirty). Send me an email, anyone, if you want! email@example.com
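One possible workaround for the interrupted full reindex described above, as a minimal sketch rather than anything Lucene itself provides: build the new index in a scratch directory and swap it into place only after the build finishes, so an aborted run never touches the live index. The names `SafeReindex`, `IndexBuilder` and `rebuild` are hypothetical, and the actual indexing code is left to the caller (the sketch deliberately makes no Lucene calls). It assumes the scratch directory and the live index sit on the same filesystem and that searchers are reopened after the swap.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SafeReindex {

    /** Hypothetical callback: writes a complete index into the given directory. */
    public interface IndexBuilder {
        void buildInto(Path dir) throws IOException;
    }

    /**
     * Rebuilds the index in a scratch directory next to the live one and
     * swaps it in only if the build completed without an exception.
     */
    public static void rebuild(Path liveDir, IndexBuilder builder) throws IOException {
        Path live = liveDir.toAbsolutePath();
        Path parent = live.getParent();
        // Scratch directory in the same parent, so the later move is a cheap rename.
        Path scratch = Files.createTempDirectory(parent, "index-build-");
        try {
            builder.buildInto(scratch);
        } catch (IOException | RuntimeException e) {
            // Build failed or was aborted: discard the partial index, keep the live one.
            deleteRecursively(scratch);
            throw e;
        }
        Path old = parent.resolve(live.getFileName() + ".old");
        deleteRecursively(old);
        if (Files.exists(live)) {
            Files.move(live, old);   // set the current index aside
        }
        Files.move(scratch, live);   // promote the freshly built index
        deleteRecursively(old);
    }

    /** Deletes a file or directory tree; does nothing if the path is absent. */
    static void deleteRecursively(Path path) throws IOException {
        if (!Files.exists(path)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(path)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

The two moves are not a single atomic step, so there is a brief window in which no live directory exists; on some platforms a move can also fail while a searcher still holds index files open. A killed JVM can leave an `index-build-…` scratch directory behind, which is safe to delete.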
We are implementing Lucene at http://www.ojor.com for an initial 550,000 HTML pages. It is running great, with fast searching. Once it is ready, it will be used to index the full collection, currently about 4 million pages. Lucene is a great product. By any chance, do you know of a product similar to “Nutch” that is written in C#? We are trying to implement “webdbwriter” and “webdbreader” in C#.