Met lots of folks at WWW 2004, including:
- Torsten Suel, who has done some great work on, among other things, search optimization that Nutch/Lucene should adopt;
- Giuseppe Attardi, who showed me some impressive benchmarks of a fetcher that uses asynchronous I/O;
- Marc Najork, who wrote Mercator, a very extensible crawler that Nutch can learn from;
- … and lots of other folks whose names I cannot recall.
Someone suggested that Nutch should look at Lustre to meet our need for a robust, distributed filesystem. Does anyone have any experience with Lustre?
Thanks to Rohit Khare for inviting me!