Mental Models For Search Are Getting Firmer (Jakob Nielsen’s Alertbox)

May 12, 2005

Jakob Nielsen, in an article titled Mental Models For Search Are Getting Firmer, provides more fuel for my claim that web search is a commodity. He warns against trying to change the search user interface. This argues that search engines should not try to distinguish themselves with fancy front ends. That leaves the backend, where innovation seems to have slowed as well…

Open Source Desktop Search

February 14, 2005

It seems Ask Jeeves may release their desktop search application as open source. After meeting with Mozilla.org folks, Ask says:

Ask Jeeves Blog: Mozilla’s On Fire: “We discussed Ask Jeeves desktop search and the notion of open-sourcing it. We’re open at two levels. Contributing just the core desktop indexing technology or possibly the entire desktop search application. They discussed how/what they would evaluate before accepting a major piece of code/product contribution: code size, internationalization, etc. Whether or not we partner with Mozilla on this effort, Chris and team thought it was a good idea for us to pursue overall.”

Lots of folks think good open-source desktop search can already be easily implemented with tools like Lucene. But desktop search has exacting requirements.

  • Download size should be small, which rules out Java and C#, since you can’t afford to require a large runtime environment.
  • Lots of document formats must be supported. Yet many document format conversion tools are quite large, too large for inclusion. So, a good desktop search application might need to implement its own format converters, no small burden.
  • The performance requirements of desktop search are not too demanding, since the number of documents is unlikely to exceed a few million, but indexing must be unobtrusive. It needs to run in the background when the user is idle. Ideally it shouldn’t greatly disrupt the virtual memory working set, or else the system will be sluggish when the user returns. This probably requires platform-dependent code.
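To make the "unobtrusive indexing" requirement concrete, here is a minimal sketch in Python of an indexer that works in small batches and backs off whenever the user is active. Everything named here is an assumption for illustration: `index_when_idle`, `index_one`, and `is_user_idle` are hypothetical, and the idle probe in particular stands in for exactly the platform-dependent code mentioned above (input-idle timers, scheduler priorities, and so on). It does not address the working-set concern, which would need OS-level support.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_when_idle(
    paths: Iterable[str],
    index_one: Callable[[str], None],
    is_user_idle: Callable[[], bool],
    batch_size: int = 10,
    backoff_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Index `paths` in small batches, pausing whenever the user is active.

    `is_user_idle` is a hypothetical, platform-specific probe supplied by
    the caller. Returns the number of documents indexed.
    """
    indexed = 0
    for batch in batches(paths, batch_size):
        # Before each batch, wait until the probe reports the user is away.
        while not is_user_idle():
            sleep(backoff_seconds)
        for path in batch:
            index_one(path)
            indexed += 1
    return indexed
```

Keeping batches small bounds how long the indexer keeps running after the user comes back; the idle check only happens between batches, so a smaller `batch_size` means a quicker retreat at the cost of more frequent polling.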

In the end, the core search and indexing code (like Lucene provides) is only a small part of the application, and Java, while cross-platform, requires a runtime that’s too big for convenient download, and doesn’t give easy access to platform-specific scheduling features.

The Beagle folks have defied these odds, albeit for a not-yet-mainstream platform.

There’s still hope for mass-market Lucene-based desktop search: GCJ is cross-platform, makes it easy to invoke platform-specifics, and may soon have a tiny runtime. A C++ port exists and a C port of Lucene is underway. Machines and networks keep getting faster; scheduling and download-size issues will diminish. In the meantime, perhaps Ask Jeeves will fill this gap.

Lucene lecture

November 23, 2004

I am presently giving a lecture on Lucene at the University of Pisa. At the end of the presentation I’ll search for this blog entry, hoping to find it on Technorati, which uses Lucene, thus demonstrating live incremental indexing.
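For readers unfamiliar with the term, incremental indexing means a newly added document becomes searchable without rebuilding the whole index. The toy inverted index below, in Python, is only a sketch of that idea: `TinyIndex` and its methods are made up for illustration and are not Lucene's API, and a real engine would also handle analysis, ranking, persistence, and deletions.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TinyIndex:
    """A toy in-memory inverted index (illustrative only, not Lucene)."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)
        self._docs: List[str] = []

    def add(self, text: str) -> int:
        """Add one document; it is searchable as soon as this returns."""
        doc_id = len(self._docs)
        self._docs.append(text)
        # Naive analysis: lowercase and split on whitespace.
        for term in text.lower().split():
            self._postings[term].add(doc_id)
        return doc_id

    def search(self, query: str) -> List[int]:
        """Return sorted ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self._postings.get(term, set())
        return sorted(hits)


index = TinyIndex()
index.add("open source desktop search")
before = index.search("lucene lecture")   # nothing matches yet
index.add("Lucene lecture at Pisa")       # the new entry arrives
after = index.search("lucene lecture")    # found without a rebuild
```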
