Lucene in 5 Minutes
Now updated for Lucene 2.9.1!
Lucene makes it easy to add full-text search capability to your application. In fact, it's so easy, I'm going to show you how in 5 minutes!
1. Index
For this simple case, we're going to create an in-memory index from some strings.
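// the analyzer tokenizes text; the same one is reused for searching
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
Directory index = new RAMDirectory();
// the boolean arg means: create a new index, overwriting any existing one
IndexWriter w = new IndexWriter(index, analyzer, true,
    IndexWriter.MaxFieldLength.UNLIMITED);
addDoc(w, "Lucene in Action");
addDoc(w, "Lucene for Dummies");
addDoc(w, "Managing Gigabytes");
addDoc(w, "The Art of Computer Science");
w.close();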
addDoc() takes a string and adds it to the index:
private static void addDoc(IndexWriter w, String value) throws IOException {
    Document doc = new Document();
    // store the title so it can be shown in results, and analyze it so its words are searchable
    doc.add(new Field("title", value, Field.Store.YES, Field.Index.ANALYZED));
    w.addDocument(doc);
}
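To get a feel for what Field.Index.ANALYZED buys you, here is a toy stand-in written in plain Java. It is only a sketch under loud assumptions: ToyIndex, tokenize(), addDoc() and search() are made-up names, and this is not how Lucene is implemented. It just illustrates the idea of breaking each title into lowercased terms and remembering which documents contain which term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only. NOT Lucene code: every name here is hypothetical.
class ToyIndex {
    // stored titles, addressed by document id (position in the list)
    private final List<String> titles = new ArrayList<String>();
    // term -> ids of the documents whose title contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<String, Set<Integer>>();

    // Crude "analyzer": lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Stores the title and records each of its terms against the new document id.
    void addDoc(String title) {
        int docId = titles.size();
        titles.add(title);
        for (String term : tokenize(title)) {
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
    }

    // Looks up only the first term of the query and returns matching titles
    // in insertion order. No scoring, no multi-term queries.
    List<String> search(String query) {
        List<String> hits = new ArrayList<String>();
        List<String> queryTerms = tokenize(query);
        if (queryTerms.isEmpty()) {
            return hits;
        }
        Set<Integer> ids = postings.get(queryTerms.get(0));
        if (ids != null) {
            for (int id : ids) {
                hits.add(titles.get(id));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDoc("Lucene in Action");
        index.addDoc("Lucene for Dummies");
        index.addDoc("Managing Gigabytes");
        index.addDoc("The Art of Computer Science");
        // prints [Lucene in Action, Lucene for Dummies]
        System.out.println(index.search("lucene"));
    }
}
```

Because both the titles and the query pass through the same tokenize() step, a lowercase query still finds the capitalized word in the titles. That is the same reason the real app reuses one analyzer for indexing and searching.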
2. Query
We read the query from the command line (falling back to "lucene" if none is given), parse it, and build a Lucene Query out of it.
String querystr = args.length > 0 ? args[0] : "lucene";
Query q = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer).parse(querystr);
3. Search
We create an IndexSearcher over the index, then instantiate a TopScoreDocCollector to collect the top 10 scoring hits for the Query.
int hitsPerPage = 10;
IndexSearcher searcher = new IndexSearcher(index, true);
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
searcher.search(q, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
4. Display
Now that we have results from our search, we display the results to the user.
System.out.println("Found " + hits.length + " hits.");
for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println((i + 1) + ". " + d.get("title"));
}
Here's the app in its entirety. Download HelloLucene.java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

import java.io.IOException;

public class HelloLucene {
    public static void main(String[] args) throws IOException, ParseException {
        // 0. Specify the analyzer for tokenizing text.
        //    The same analyzer should be used for indexing and searching
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

        // 1. create the index
        Directory index = new RAMDirectory();

        // the boolean arg in the IndexWriter ctor means to
        // create a new index, overwriting any existing index
        IndexWriter w = new IndexWriter(index, analyzer, true,
            IndexWriter.MaxFieldLength.UNLIMITED);
        addDoc(w, "Lucene in Action");
        addDoc(w, "Lucene for Dummies");
        addDoc(w, "Managing Gigabytes");
        addDoc(w, "The Art of Computer Science");
        w.close();

        // 2. query
        String querystr = args.length > 0 ? args[0] : "lucene";

        // the "title" arg specifies the default field to use
        // when no field is explicitly specified in the query.
        Query q = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        IndexSearcher searcher = new IndexSearcher(index, true);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println((i + 1) + ". " + d.get("title"));
        }

        // searcher can only be closed when there
        // is no need to access the documents any more.
        searcher.close();
    }

    private static void addDoc(IndexWriter w, String value) throws IOException {
        Document doc = new Document();
        doc.add(new Field("title", value, Field.Store.YES, Field.Index.ANALYZED));
        w.addDocument(doc);
    }
}
To use this app from the command line, type java HelloLucene <query>
If you're new to Java...
Try this:
curl "http://svn.apache.org/viewvc/lucene/solr/tags/release-1.4.0/lib/lucene-core-2.9.1.jar?view=co" > lucene-core-2.9.1.jar
curl "http://www.lucenetutorial.com/uploads/images/HelloLucene.java" > HelloLucene.java
javac -classpath .:lucene-core-2.9.1.jar HelloLucene.java
java -classpath .:lucene-core-2.9.1.jar HelloLucene
This should give you:
Found 2 hits.
1. Lucene in Action
2. Lucene for Dummies
Erik, a helpful reader, explains:
The compilation went smoothly. However I did not manage to run the code. After some online searching and experimenting I found out that...including both . and the Lucene jarfile in the class path was crucial to get things working. This could be a useful tip for other beginners who, like me, do not have much experience with setting the Java class path.
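If you run into the same class path trouble, a small diagnostic like the one below may help. It is not part of the tutorial app: ClasspathCheck and describe() are hypothetical names chosen for this sketch. It prints each entry of the class path the JVM was actually started with and says whether that file or directory exists, which makes a mistyped jar name or a missing "." easy to spot.

```java
import java.io.File;
import java.util.regex.Pattern;

// Diagnostic helper only; ClasspathCheck and describe() are made-up names.
class ClasspathCheck {
    // Splits a class path string on the platform separator and reports,
    // one line per entry, whether that entry exists on disk.
    static String describe(String classPath) {
        StringBuilder out = new StringBuilder();
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // nothing to check for an empty entry
            }
            File f = new File(entry);
            out.append(entry)
               .append(" -> ")
               .append(f.exists() ? "found" : "MISSING")
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // java.class.path holds the class path this JVM is running with
        System.out.print(describe(System.getProperty("java.class.path")));
    }
}
```

Compile it with javac ClasspathCheck.java and run it with the same -classpath argument you use for HelloLucene. Note that the entry separator is ":" on Linux and Mac OS X but ";" on Windows, so on Windows the argument would read .;lucene-core-2.9.1.jar instead.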
Installing Lucene
PS: It's come to my attention that some visitors have difficulty installing Lucene in the first place.
First download Lucene and extract it to a directory you use for Java or other programming work. If you're using Netbeans, you can either:
- Follow the directions here
- Use these steps:
- Add the jar file to Netbeans as an external library by choosing 'Tools' on the menu bar and then selecting 'Library Manager'.
- Go to the project. Right-click the project you need to use Lucene for and select 'Properties'.
- In the dialogue box, select 'Libraries' and then select the 'Add Jar/Folder' option.
- Navigate to the directory which was created from lucene-[version].tar.gz. Select lucene-core-[version].jar.
- Click 'OK' in the dialogue box. The jar file has now been added to your project.