Open Source Desktop Search

It seems Ask Jeeves may release their desktop search application as open source. On meeting with Mozilla.org folks, Ask says:

Ask Jeeves Blog: Mozilla’s On Fire: “We discussed Ask Jeeves desktop search and the notion of open-sourcing it. We’re open at two levels. Contributing just the core desktop indexing technology or possibly the entire desktop search application. They discussed how/what they would evaluate before accepting a major piece of code/product contribution: code size, internationalization, etc. Whether or not we partner with Mozilla on this effort, Chris and team thought it was a good idea for us to pursue overall.”

Lots of folks think good open-source desktop search can already be easily implemented with tools like Lucene. But desktop search has exacting requirements.

  • Download size should be small, which rules out Java and C#, since you can’t afford to require a large runtime environment.
  • Lots of document formats must be supported. Yet many document format conversion tools are quite large, too large for inclusion. So, a good desktop search application might need to implement its own format converters, no small burden.
  • The performance requirements of desktop search are not too demanding, since the number of documents is unlikely to exceed a few million, but indexing must be unobtrusive. It needs to run in the background when the user is idle. Ideally it shouldn’t greatly disrupt the virtual memory working set, or else the system will be sluggish when the user returns. This probably requires platform-dependent code.
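To make the "unobtrusive indexing" requirement concrete, here is a minimal sketch in Python of an indexer that only picks up work while the user is idle. Everything named here is hypothetical: `idle_batches`, its parameters, and especially `is_user_idle`, which stands in for the platform-dependent idle check mentioned above and is simply passed in as a function. It does nothing about the working-set problem; it only shows the back-off loop.

```python
import time
from typing import Callable, Iterable, Iterator, List


def idle_batches(
    paths: Iterable[str],
    is_user_idle: Callable[[], bool],  # assumption: supplied by platform-specific code
    batch_size: int = 16,
    poll_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[List[str]]:
    """Yield small batches of paths to index, pausing while the user is active."""
    batch: List[str] = []
    for path in paths:
        # Back off before taking each file, so a returning user is not
        # competing with the indexer for disk and memory.
        while not is_user_idle():
            sleep(poll_seconds)
        batch.append(path)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    # Hand over whatever is left once the input runs out.
    if batch:
        yield batch


if __name__ == "__main__":
    # Pretend the user is always idle, so nothing ever sleeps.
    for group in idle_batches(["a.txt", "b.txt", "c.txt"], lambda: True, batch_size=2):
        print(group)
```

Small batches keep the amount of work between idle checks short, which is the point: the indexer should notice quickly when the user comes back.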

In the end, the core search and indexing code (like Lucene provides) is only a small part of the application, and Java, while cross-platform, requires a runtime that’s too big for convenient download, and doesn’t give easy access to platform-specific scheduling features.
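To give a feel for how small the core can be, here is a toy inverted index in Python. This is not Lucene's API or its data structures; `TinyIndex` and `tokenize` are made-up names for illustration, and there is no ranking, persistence, or format conversion here, which is exactly the work the rest of the application has to do.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class TinyIndex:
    """A toy in-memory inverted index: term -> set of document ids."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        # Record the document under every term it contains.
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query: str) -> Set[str]:
        # AND semantics: a document must contain every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


if __name__ == "__main__":
    index = TinyIndex()
    index.add("notes.txt", "Desktop search with Lucene")
    index.add("todo.txt", "Search the desktop for old mail")
    print(sorted(index.search("desktop search")))
```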

The Beagle folks have defied these odds, albeit for a not-yet-mainstream platform.

There’s still hope for mass-market Lucene-based desktop search: GCJ is cross-platform, makes it easy to invoke platform-specifics, and may soon have a tiny runtime. A C++ port exists and a C port of Lucene is underway. Machines and networks keep getting faster; scheduling and download-size issues will diminish. In the meantime, perhaps Ask Jeeves will fill this gap.

21 Responses to “Open Source Desktop Search”

  1. Kevin Says:

    My comment was that GCJ could do this. I’ve been trying to get some time to work on a native app written in GCJ but just haven’t had the time.

    I have a trivial implementation of a desktop search based on Lucene but haven’t had time to release it! I’m such a bad OSS developer!

    http://www.peerfear.org/rss/permalink/2004/10/28/LotsOfInterestInLuceneDesktop/

  2. Search Engine Information Says:

    I prefer the small download size in the long run because I believe the browser requirements will disappear pretty soon.

    Mike

  3. fatcrab Says:

    Where can I download the source of desktop Lucene? My email: ddong0524@yahoo.com.cn

  4. miguel Says:

    One of the considerations has always been to keep Beagle portable to let it move to other platforms.

    The problem is that there is little demand for such an engine today on Windows or MacOS X, considering that the space is fairly well served today and is likely going to improve. Anyways, Firefox is a perfect example that I might be wrong.

    Now, regarding large runtime downloads: Mono can be cut in pieces. This is routinely done by folks distributing Mono-based applications on MacOS X: they only ship the libraries that they need, which usually amounts to four to six megabytes uncompressed.

    There are two other bits of good news: we have been working on a “linker” for .NET libraries which will help people in shipping only the bits they actually need: today the granularity is at the library level, in the future we will make this happen at the function level.

    The last good news is that Mono provides a mechanism to bundle the runtime, the libraries and the application into a single binary if they want to.

    Anyways, I am a big fan of all your work.

  5. Techknight Says:

    We do have a few open source desktop search applications which I find are on their way to becoming stable and providing robust search features. Though they are nascent, we should soon be seeing some action here. I have found two of them and mentioned my experience with them here.

  6. Anonymous Says:

    I have to admit that I’m not up to date on the desktop search scene. For example, I don’t know why we call it “desktop search” when it can be run on laptops and doesn’t connect to a computer’s wallpaper. And I don’t know what all the OSS options are. But anyway…

    I recently downloaded Windows Desktop Search, which comes as an add-in for the MSN toolbar, and my complaints are two-fold: 1) It doesn’t let me limit my search to particular directories. 2) It doesn’t read enough meta-data in XML/HTML file formats. It is, however, a vast improvement over the Altavista desktop search I downloaded about six years ago; that thing wouldn’t find the documents I was looking for and at the same time gave me oodles of false positives.

    I don’t know what Google’s search is like because I haven’t tried it. But I hope the open source technologies can provide good competition for the proprietary offerings.

    BTW, just my two cents: a small download size would be advantageous but shouldn’t be a requirement in desktop search. Like the other guy said, download time really doesn’t matter much.

  7. Anonymous Says:

    Hello,
    Can I get any information regarding the security features provided by Lucene desktop search?
    Apart from that, can I implement any other security feature in it, like EFS, to make it more powerful?
    Waiting for an early response.
    Yours Truly,
    Vivek Singh.

  8. jwiz Says:

    I’ve found these two on SourceForge.net:
    http://sourceforge.net/projects/docfetcher/
    http://sourceforge.net/projects/docsearcher/

    The first one is quite easy to use, but only runs on Windows at the moment (though the project site states it is platform-independent) and it doesn’t support Microsoft Word documents (not yet?), while the second one does, but its GUI really needs some cleanup.

  9. niktu Says:

    Desktop Search, these i’ve found:

    Red-Piranha Search and Knowledge – Community Edition – Java J2EE Tomcat Lucene Xml Rdf
    http://red-piranha.sourceforge.net/

    SourceForge.net: Lucene desktop index
    http://sourceforge.net/projects/lucenedesktop/

    SourceForge.net: DocSearcher
    http://sourceforge.net/projects/docsearcher/

    SourceForge.net: DocFetcher
    http://sourceforge.net/projects/docfetcher/

    Main Page – Beagle
    http://beagle-project.org/Main_Page

    Personally I’m using http://sourceforge.net/projects/lucenedesktop/
    mostly because I found it early and like its spartan interface
    (umm, I assume that regexp file inclusion/exclusion masks can appeal only to programmers … that’s a plus for me, not necessarily for the average joe :)

