- Fast
- In Memory
- No Server Process
- Zero-configuration - No setup or administration needed.
- Small code footprint.
- Simple, easy to use API.
- Well-commented source code with 100% branch test coverage.
- Self-contained: no external dependencies.
- ranked searching -- best results returned first
- many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
- fielded searching (e.g. title, author, contents)
- sorting by any field
- multiple-index searching with merged results
- allows simultaneous update and searching
- flexible faceting, highlighting, joins and result grouping
- fast, memory-efficient and typo-tolerant suggesters
- pluggable ranking models, including the Vector Space Model and Okapi BM25
- configurable storage engine (codecs)
SQLite has built-in full-text search (FTS), but I would have to build a bunch of stuff around it.
Xapian is a full-text search engine, but as with SQLite I'd have to build stuff around it.
Elasticsearch would work, but it's an external process that I'd have to run, and that's an awful lot of overhead.
A hosted Elasticsearch or Solr provider would work too, but again that's a lot of overhead, it's not free, and I'd then be dependent on their uptime.
Whistlepig is a small text search index written in C. Small as in not very many features and not much code, but the features it does have are perfect for my needs:
- Full query language
- In-memory, in-process
- Arbitrary number of indexes for the same document
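require 'rubygems'
require 'whistlepig'
require 'test/unit/assertions'
include Test::Unit::Assertions  # provides assert_equal outside a test case

document = "Hi there"
index = Whistlepig::Index.new "index"

# Build an entry with a single "body" field and add it to the index
entry = Whistlepig::Entry.new
entry.add_string "body", document
docid = index.add_entry entry

# "body" is the default field for the query string "hi"
query = Whistlepig::Query.new("body", "hi")
result = index.search(query)
assert_equal docid, result[0]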
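
To make "in-memory, in-process" a bit more concrete, here is a toy inverted index in plain Ruby. This is only a sketch of the general idea, not how Whistlepig is implemented: the `TinyIndex` class and its `add` and `search` methods are names I made up for illustration, and there is no query language, ranking, or persistence.

```ruby
# Toy in-memory inverted index; an illustration only, not Whistlepig's code.
class TinyIndex
  def initialize
    # Maps [field, term] to the list of document ids containing that term.
    @postings = Hash.new { |hash, key| hash[key] = [] }
    @next_id = 0
  end

  # Index a document given as { field_name => text }; returns its doc id.
  def add(fields)
    @next_id += 1
    fields.each do |field, text|
      tokenize(text).uniq.each { |term| @postings[[field, term]] << @next_id }
    end
    @next_id
  end

  # Return ids of documents whose field contains every term in the query,
  # in the order the documents were added.
  def search(field, query)
    terms = tokenize(query)
    return [] if terms.empty?
    lists = terms.map { |term| @postings.fetch([field, term], []) }
    lists.reduce(:&)
  end

  private

  # Lowercase the text and split on anything that is not a letter or digit.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end

index = TinyIndex.new
first = index.add("body" => "Hi there")
second = index.add("body" => "hi again", "title" => "Greeting")

p index.search("body", "hi")         # prints [1, 2]
p index.search("title", "greeting")  # prints [2]
p index.search("body", "hi there")   # prints [1]
```

Everything lives in one Ruby hash inside the calling process, so there is nothing to install, configure, or keep running. That is the trade-off I'm after, at the cost of losing the index when the process exits.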
This controller suffers from nested if-else:
http://valve.github.io/blog/2014/02/22/rails-developer-guide-to-full-text-search-with-solr/
I also came across lunr.js, a simple JavaScript library for full-text search in the browser.
Comes with a standalone command-line interface (CLI) client that can be used to administer SQLite databases.
Sources are in the public domain. Use for any purpose.