monkeSearch is an open-source, offline-first desktop search tool that lets you find local files using natural language. Instead of relying on simple keyword matching, it uses embedding models to capture the semantic meaning of your query, so you can search for concepts, not just words. It is also temporally aware: it can parse time-related queries such as "documents from last week" and apply them as filters. The entire process, from indexing to searching, runs locally on your machine. The project uses a different backend optimized for each OS to deliver the best performance and user experience across platforms.
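To make the temporal filtering idea concrete, here is a minimal sketch of how a phrase like "last week" could be separated from the rest of a query and turned into a modification-time cutoff. This is not monkeSearch's actual parser: the function name `parse_time_filter`, the regular expression, and the reading of "last week" as a rolling seven-day window are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# Hypothetical unit table (approximate lengths); not taken from monkeSearch.
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

# Matches phrases such as "last week", "past 3 days", "from the last 2 months".
_TIME_PATTERN = re.compile(
    r"\b(?:from\s+)?(?:the\s+)?(?:last|past)\s+(?:(\d+)\s+)?(day|week|month|year)s?\b",
    re.IGNORECASE,
)


def parse_time_filter(query, now=None):
    """Split a query into (remaining_text, earliest_modified_time or None)."""
    now = now or datetime.now()
    match = _TIME_PATTERN.search(query)
    if match is None:
        return query.strip(), None

    count = int(match.group(1) or 1)          # "last week" means one week
    days = count * _UNIT_DAYS[match.group(2).lower()]

    # Remove the time phrase so only the semantic part is embedded.
    remaining = query[:match.start()] + query[match.end():]
    remaining = " ".join(remaining.split())
    return remaining, now - timedelta(days=days)
```

The remaining text would go to the embedding model, while the cutoff would be applied as a metadata filter on each file's modification time.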
os.walk file system crawl and ChromaDB for vector storage and retrieval.
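As a rough sketch of the crawl half of this backend, the snippet below walks a directory tree with `os.walk` and yields one metadata record per file. The function name `crawl`, the record layout, and the choice to skip hidden directories are assumptions for illustration, not monkeSearch's actual code.

```python
import os


def crawl(root, extensions=None):
    """Yield one metadata record per file found under `root`."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if extensions is not None and ext not in extensions:
                continue
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
            except OSError:
                continue  # File vanished or is unreadable; skip it.
            yield {
                "id": path,            # Unique key for the vector store
                "document": name,      # Text that would be embedded
                "metadata": {
                    "path": path,
                    "extension": ext,
                    "size_bytes": stat.st_size,
                    "modified_ts": stat.st_mtime,  # Usable for time filters
                },
            }
```

The records are shaped so that their `id`, `document`, and `metadata` fields could be batched into the `ids`, `documents`, and `metadatas` arguments of a ChromaDB collection's `add` method, which is one plausible way to connect the crawl to the vector store.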
Query Time: macOS demonstrates significantly faster average search times due to its LEANN backend.
Index Size: On-disk size of the embedding database for different numbers of files.
Indexing Speed: How many files per second each system can process during the initial build.
Search Speed: With recompute disabled, searches complete in milliseconds; with it enabled, queries take seconds.
Space Savings: Enabling recompute shrinks the index by over 90%, saving significant disk space.
Build Speed: The initial indexing speed is nearly identical, as the main workload (embedding generation) is the same in both modes.
The "Recompute" feature on macOS offers a clear trade-off: enable it to save a large amount of disk space at the cost of much slower searches, or disable it for near-instant results with a larger storage footprint.
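The trade-off can be illustrated with a toy index that either keeps every embedding or discards them after the build and regenerates them at query time. This is only a conceptual sketch: the `embed` function is a character-count stand-in for a real model, `ToyIndex` is a hypothetical class, and none of this reflects how LEANN actually implements recompute.

```python
import math

DIM = 16  # Toy embedding size; real models use hundreds of dimensions.


def embed(text):
    """Toy stand-in for an embedding model: normalized character-bucket counts."""
    vec = [0.0] * DIM
    for ch in text.lower():
        vec[ord(ch) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity for unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


class ToyIndex:
    def __init__(self, texts, recompute):
        self.texts = list(texts)
        # Both modes embed everything at build time, so build cost is the same.
        vectors = [embed(t) for t in self.texts]
        # Recompute mode throws the vectors away to keep the index small.
        self.vectors = None if recompute else vectors

    def stored_floats(self):
        """Number of embedding values kept in the index."""
        return 0 if self.vectors is None else len(self.vectors) * DIM

    def search(self, query, k=1):
        q = embed(query)
        # Recompute mode pays the embedding cost again on every query.
        vectors = self.vectors
        if vectors is None:
            vectors = [embed(t) for t in self.texts]
        ranked = sorted(
            range(len(self.texts)),
            key=lambda i: cosine(q, vectors[i]),
            reverse=True,
        )
        return [self.texts[i] for i in ranked[:k]]
```

Both modes return the same ranking; they differ only in how many embedding values are stored and how much work each query performs.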
monkeSearch is an open source project. If you find it useful, please consider starring our repository on GitHub to show your support!
https://github.com/monkesearch/monkeSearch