Lucene in 5 minutes
Now updated for Lucene 4.0!
Lucene makes it easy to add full-text search capability to your application. In fact, it's so easy, I'm going to show you how in 5 minutes!
1. Index
For this simple case, we're going to create an in-memory index from some strings.
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
IndexWriter w = new IndexWriter(index, config);
addDoc(w, "Lucene in Action", "193398817");
addDoc(w, "Lucene for Dummies", "55320055Z");
addDoc(w, "Managing Gigabytes", "55063554A");
addDoc(w, "The Art of Computer Science", "9900333X");
w.close();
addDoc() is what actually adds documents to the index:
private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
    Document doc = new Document();
    doc.add(new TextField("title", title, Field.Store.YES));
    doc.add(new StringField("isbn", isbn, Field.Store.YES));
    w.addDocument(doc);
}
Note the use of TextField for content we want tokenized, and StringField for id fields and the like, which we don't want tokenized.
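To make the tokenized/untokenized distinction concrete, here is a toy sketch in plain Java. It is not how Lucene implements fields, and the class and method names (FieldKindSketch, tokenizedTerms, exactTerms) are made up for illustration. It only mimics the effect: a tokenized value becomes several searchable terms, while an untokenized value stays a single term that must match exactly. A real analyzer may do more (for example, drop stop words), which this sketch ignores.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model only: mimics the *effect* of tokenized vs. untokenized fields.
// None of these names are part of Lucene.
class FieldKindSketch {

    // Tokenized (TextField-like): split on non-word characters and lowercase,
    // so each word becomes its own searchable term.
    static Set<String> tokenizedTerms(String value) {
        Set<String> terms = new HashSet<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Untokenized (StringField-like): the whole value is one term, unchanged.
    static Set<String> exactTerms(String value) {
        return Collections.singleton(value);
    }

    public static void main(String[] args) {
        // A single word from the title matches the tokenized field.
        System.out.println(tokenizedTerms("Lucene in Action").contains("lucene"));
        // The id field matches only on the complete value.
        System.out.println(exactTerms("55320055Z").contains("55320055Z"));
        System.out.println(exactTerms("55320055Z").contains("55320055"));
    }
}
```

This is why an id such as an ISBN belongs in an untokenized field: a partial or re-cased value should not match it.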
2. Query
We read the query from the command line (falling back to "lucene" if no argument is given), parse it, and build a Lucene Query out of it.
Query q = new QueryParser(Version.LUCENE_40, "title", analyzer).parse(querystr);
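The second constructor argument, "title", is the default field: a query term with no explicit field is searched there, while the query syntax also allows a field:term form. The sketch below is a deliberately simplified illustration of that fallback rule in plain Java. It is not Lucene's parser, and DefaultFieldSketch and resolve are hypothetical names.

```java
// Toy illustration of the "default field" rule. Not Lucene's QueryParser;
// all names here are hypothetical.
class DefaultFieldSketch {

    // Returns {field, term}. A clause of the form "field:term" names its
    // field explicitly; anything else falls back to defaultField.
    static String[] resolve(String defaultField, String clause) {
        int colon = clause.indexOf(':');
        if (colon > 0) {
            return new String[] { clause.substring(0, colon), clause.substring(colon + 1) };
        }
        return new String[] { defaultField, clause };
    }

    public static void main(String[] args) {
        // No field given: the term is looked up in the default field.
        System.out.println(String.join(" -> ", resolve("title", "lucene")));
        // Explicit field: the default is ignored for this clause.
        System.out.println(String.join(" -> ", resolve("title", "isbn:193398817")));
    }
}
```

The real parser additionally runs each term through the analyzer and supports operators, phrases, and more, none of which this sketch attempts.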
3. Search
Using the Query, we create an IndexSearcher to search the index. Then a TopScoreDocCollector is instantiated to collect the top 10 scoring hits.
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
searcher.search(q, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
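For intuition about what "collecting the top 10 scoring hits" involves, here is a rough sketch of the general bounded-heap technique in plain Java. It is an illustration only, not Lucene's actual collector code, and TopNSketch, Hit, and topN are hypothetical names. The idea is to keep at most n hits in a min-heap so the weakest retained hit can be evicted cheaply whenever a better one arrives.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustration of top-N collection with a bounded min-heap.
// Hypothetical names; this is not Lucene's implementation.
class TopNSketch {

    // Minimal stand-in for a scored hit (document id plus score).
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Returns the n highest-scoring hits, best first.
    static List<Hit> topN(List<Hit> candidates, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        Comparator<Hit> byScore = (a, b) -> Float.compare(a.score, b.score);
        // Min-heap: the lowest-scoring retained hit sits at the head.
        PriorityQueue<Hit> heap = new PriorityQueue<>(n, byScore);
        for (Hit hit : candidates) {
            if (heap.size() < n) {
                heap.add(hit);
            } else if (hit.score > heap.peek().score) {
                heap.poll();   // evict the current weakest hit
                heap.add(hit);
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(byScore.reversed()); // highest score first
        return result;
    }

    public static void main(String[] args) {
        List<Hit> all = new ArrayList<>();
        all.add(new Hit(0, 0.5f));
        all.add(new Hit(1, 2.0f));
        all.add(new Hit(2, 1.0f));
        all.add(new Hit(3, 0.1f));
        for (Hit h : topN(all, 2)) {
            System.out.println(h.doc + " " + h.score);
        }
    }
}
```

Because the heap never grows past n entries, memory stays bounded no matter how many documents match the query.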
4. Display
Now that we have results from our search, we display the results to the user.
for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println((i + 1) + ". " + d.get("isbn") + "\t" + d.get("title"));
}
Here's the app in its entirety. Download HelloLucene.java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import java.io.IOException;
public class HelloLucene {
    public static void main(String[] args) throws IOException, ParseException {
        // 0. Specify the analyzer for tokenizing text.
        //    The same analyzer should be used for indexing and searching
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Lucene in Action", "193398817");
        addDoc(w, "Lucene for Dummies", "55320055Z");
        addDoc(w, "Managing Gigabytes", "55063554A");
        addDoc(w, "The Art of Computer Science", "9900333X");
        w.close();

        // 2. query
        String querystr = args.length > 0 ? args[0] : "lucene";
        // the "title" arg specifies the default field to use
        // when no field is explicitly specified in the query.
        Query q = new QueryParser(Version.LUCENE_40, "title", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println((i + 1) + ". " + d.get("isbn") + "\t" + d.get("title"));
        }

        // reader can only be closed when there
        // is no need to access the documents any more.
        reader.close();
    }

    private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("title", title, Field.Store.YES));
        // use a string field for isbn because we don't want it tokenized
        doc.add(new StringField("isbn", isbn, Field.Store.YES));
        w.addDocument(doc);
    }
}
Where to from here?
- Check out one of the books about Lucene below.
- Should you consider using Apache Solr instead of Apache Lucene?
- Learn more about basic Lucene concepts
Popular books related to Lucene and search
Mavenized GitHub repo
Courtesy of Mac Luq, a GitHub repo with Mavenized source is available here: https://github.com/macluq/helloLucene.
Get it with:
PS: If you're new to Java...
Try this:
wget http://repo1.maven.org/maven2/org/apache/lucene/lucene-core/4.0.0/lucene-core-4.0.0.jar
wget http://repo1.maven.org/maven2/org/apache/lucene/lucene-analyzers-common/4.0.0/lucene-analyzers-common-4.0.0.jar
wget http://repo1.maven.org/maven2/org/apache/lucene/lucene-queryparser/4.0.0/lucene-queryparser-4.0.0.jar
wget http://www.lucenetutorial.com/code/HelloLucene.java
javac -classpath .:lucene-core-4.0.0.jar:lucene-analyzers-common-4.0.0.jar:lucene-queryparser-4.0.0.jar HelloLucene.java
java -classpath .:lucene-core-4.0.0.jar:lucene-analyzers-common-4.0.0.jar:lucene-queryparser-4.0.0.jar HelloLucene
This should give you:
Found 2 hits.
1. 193398817	Lucene in Action
2. 55320055Z	Lucene for Dummies
Erik, a helpful reader, explains:
The compilation went smoothly. However I did not manage to run the code. After some online searching and experimenting I found out that...including both . and the Lucene jarfile in the class path was crucial to get things working. This could be a useful tip for other beginners who, like me, do not have much experience with setting the Java class path.
Installing Lucene
PS: It's come to my attention that some visitors have difficulty installing Lucene in the first place.
You should first download Lucene and extract it to a directory you use for Java or programming. If you're using NetBeans, you can either:
- Follow the directions here
- Use these steps:
- Add the jar file to NetBeans as an external library by choosing 'Tools' on the menu bar and then selecting 'Library Manager'.
- Go to the project. Right click on the project you need to use Lucene for. Select 'Properties'.
- In the dialogue box, select 'Libraries' and then select the 'Add Jar/Folder' option.
- Navigate to the directory which was created from lucene-[version].tar.gz. Select lucene-core-[version].jar.
- Click 'OK' in the dialogue box. The jar file has now been added to your project.