Lucene in 5 minutes

Now updated for Lucene 3.5.0!

Lucene makes it easy to add full-text search capability to your application. In fact, it's so easy, I'm going to show you how in 5 minutes!

1. Index

For this simple case, we're going to create an in-memory index from some strings.

StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);

IndexWriter w = new IndexWriter(index, config);
addDoc(w, "Lucene in Action");
addDoc(w, "Lucene for Dummies");
addDoc(w, "Managing Gigabytes");
addDoc(w, "The Art of Computer Science");
w.close();

addDoc() takes a string and adds it to the index:

private static void addDoc(IndexWriter w, String value) throws IOException {
    Document doc = new Document();
    doc.add(new Field("title", value, Field.Store.YES, Field.Index.ANALYZED));
    w.addDocument(doc);
}
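
If you're wondering what "adding it to the index" buys you, here is a rough sketch of the idea behind a full-text index, using only the JDK. It is a toy and not how Lucene actually stores anything: the class ToyInvertedIndex and its methods are made up for this illustration, and the tokenizer is a crude stand-in for an analyzer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: hypothetical class, NOT Lucene's index format.
class ToyInvertedIndex {
    // term -> IDs of the documents that contain it
    private final Map<String, List<Integer>> postings = new LinkedHashMap<>();
    // document ID -> original title (the "stored" value)
    private final List<String> titles = new ArrayList<>();

    // Lowercase and split on anything that is not a letter or digit;
    // a crude stand-in for what an analyzer does.
    static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
    }

    void addDoc(String title) {
        int docId = titles.size();
        titles.add(title);
        for (String term : tokenize(title)) {
            if (term.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            List<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new ArrayList<>();
                postings.put(term, ids);
            }
            // record each document at most once per term
            if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                ids.add(docId);
            }
        }
    }

    // Look up a single term and return the titles of matching documents.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        List<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids != null) {
            for (int id : ids) {
                hits.add(titles.get(id));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDoc("Lucene in Action");
        index.addDoc("Lucene for Dummies");
        index.addDoc("Managing Gigabytes");
        index.addDoc("The Art of Computer Science");
        System.out.println(index.search("lucene")); // [Lucene in Action, Lucene for Dummies]
    }
}
```

The point is only that a search looks up terms rather than scanning every title, which is why the same analysis has to be applied to both the documents and the query.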

 

2. Query

We read the query from the command line (defaulting to "lucene" if none is given), parse it and build a Lucene Query out of it.

String querystr = args.length > 0 ? args[0] : "lucene";
Query q = new QueryParser(Version.LUCENE_35, "title", analyzer).parse(querystr);
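
The "title" argument passed to QueryParser is the default field: it is used for any part of the query that doesn't name a field itself, while a query like title:lucene names the field explicitly. The sketch below shows just that fallback rule in plain Java. It is a deliberately crude illustration, not the real parser (which handles phrases, boolean operators and much more); the class ToyDefaultField and the "author" field are made up for the example.

```java
// Toy illustration only: hypothetical class, NOT Lucene's QueryParser.
class ToyDefaultField {
    // Returns {field, term}. A clause without a "field:" prefix
    // falls back to the default field.
    static String[] resolve(String clause, String defaultField) {
        int colon = clause.indexOf(':');
        if (colon <= 0) {
            return new String[] { defaultField, clause };
        }
        return new String[] { clause.substring(0, colon), clause.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String[] plain = resolve("lucene", "title");
        System.out.println(plain[0] + " / " + plain[1]); // title / lucene

        // "author" is a hypothetical field; the index in this tutorial only has "title".
        String[] fielded = resolve("author:hatcher", "title");
        System.out.println(fielded[0] + " / " + fielded[1]); // author / hatcher
    }
}
```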

 

3. Search

To run the Query, we open an IndexReader on the index and wrap it in an IndexSearcher. We then instantiate a TopScoreDocCollector to collect the top 10 scoring hits.

int hitsPerPage = 10;
IndexReader reader = IndexReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
searcher.search(q, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
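
To get a feel for what "collect the top 10 scoring hits" means, here is a small JDK-only sketch of keeping the best n hits with a bounded min-heap. This is an assumption about the general technique, written for illustration; it is not Lucene's collector code, and the class ToyTopHits and its Hit type are invented here (Hit loosely mirrors the doc and score fields of ScoreDoc).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy illustration only: hypothetical class, NOT Lucene's TopScoreDocCollector.
class ToyTopHits {
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Orders hits from lowest to highest score.
    private static final Comparator<Hit> BY_SCORE = new Comparator<Hit>() {
        @Override
        public int compare(Hit a, Hit b) {
            return Float.compare(a.score, b.score);
        }
    };

    // Keep at most n hits in a min-heap, evicting the lowest score
    // whenever the heap grows past n. Returns the survivors, best first.
    static List<Hit> top(List<Hit> all, int n) {
        PriorityQueue<Hit> heap = new PriorityQueue<>(Math.max(1, n), BY_SCORE);
        for (Hit h : all) {
            heap.offer(h);
            if (heap.size() > n) {
                heap.poll(); // drop the current lowest-scoring hit
            }
        }
        List<Hit> best = new ArrayList<>(heap);
        Collections.sort(best, Collections.reverseOrder(BY_SCORE));
        return best;
    }

    public static void main(String[] args) {
        List<Hit> all = Arrays.asList(
                new Hit(0, 0.2f), new Hit(1, 0.9f), new Hit(2, 0.5f), new Hit(3, 0.7f));
        for (Hit h : top(all, 2)) {
            System.out.println("doc " + h.doc + " score " + h.score);
        }
    }
}
```

With this approach memory stays proportional to hitsPerPage rather than to the number of matching documents.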

 

4. Display

Now that we have the search results, we display them to the user.

System.out.println("Found " + hits.length + " hits.");
for(int i=0;i<hits.length;++i) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println((i + 1) + ". " + d.get("title"));
}



Here's the app in its entirety. Download HelloLucene.java

 

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

import java.io.IOException;

public class HelloLucene {
  public static void main(String[] args) throws IOException, ParseException {
    // 0. Specify the analyzer for tokenizing text.
    //    The same analyzer should be used for indexing and searching
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);

    // 1. create the index
    Directory index = new RAMDirectory();

    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);

    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Lucene in Action");
    addDoc(w, "Lucene for Dummies");
    addDoc(w, "Managing Gigabytes");
    addDoc(w, "The Art of Computer Science");
    w.close();

    // 2. query
    String querystr = args.length > 0 ? args[0] : "lucene";

    // the "title" arg specifies the default field to use
    // when no field is explicitly specified in the query.
    Query q = new QueryParser(Version.LUCENE_35, "title", analyzer).parse(querystr);

    // 3. search
    int hitsPerPage = 10;
    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
    searcher.search(q, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;
   
    // 4. display results
    System.out.println("Found " + hits.length + " hits.");
    for(int i=0;i<hits.length;++i) {
      int docId = hits[i].doc;
      Document d = searcher.doc(docId);
      System.out.println((i + 1) + ". " + d.get("title"));
    }

    // searcher and reader can only be closed when there
    // is no need to access the documents any more.
    searcher.close();
    reader.close();
  }

  private static void addDoc(IndexWriter w, String value) throws IOException {
    Document doc = new Document();
    doc.add(new Field("title", value, Field.Store.YES, Field.Index.ANALYZED));
    w.addDocument(doc);
  }
}

To use this app from the command line, type: java HelloLucene <query>

Where to from here?

  1. Check out one of the books about Lucene below.
  2. Should you consider using Apache Solr instead of Apache Lucene?
  3. Learn more about basic Lucene concepts

Popular books related to Lucene and search

PS: If you're new to Java...

Try this:

wget http://repo1.maven.org/maven2/org/apache/lucene/lucene-core/3.5.0/lucene-core-3.5.0.jar
wget http://www.lucenetutorial.com/code/HelloLucene.java
javac -classpath .:lucene-core-3.5.0.jar HelloLucene.java
java -classpath .:lucene-core-3.5.0.jar HelloLucene

This should give you:

Found 2 hits.
1. Lucene in Action
2. Lucene for Dummies

Erik, a helpful reader, explains:

The compilation went smoothly. However I did not manage to run the code. After some online searching and experimenting I found out that...including both . and the Lucene jarfile in the class path was crucial to get things working. This could be a useful tip for other beginners who, like me, do not have much experience with setting the Java class path.
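
If you run into the same problem, a tiny diagnostic like the one below can help. It prints each class path entry the JVM was started with and checks whether a Lucene core class can be loaded. The class name ClasspathCheck is made up for this sketch; org.apache.lucene.util.Version is the same class HelloLucene imports. Note that the entries are separated by : on Linux and Mac but by ; on Windows.

```java
import java.io.File;

// Hypothetical helper for diagnosing class path problems.
class ClasspathCheck {
    // Returns true if the named class can be found on the current class path.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        for (String entry : classPath.split(File.pathSeparator)) {
            System.out.println("class path entry: " + entry);
        }
        System.out.println("Lucene core found: "
                + isOnClasspath("org.apache.lucene.util.Version"));
    }
}
```

Run it with the same -classpath argument you use for HelloLucene. If the last line says false, the Lucene jar is missing from the class path; if the JVM can't find ClasspathCheck itself, the . entry is missing.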

Installing Lucene

PS: It's come to my attention that some visitors have difficulty installing Lucene in the first place.

You should first download Lucene and extract it to a directory you use for Java or programming.

If you're using NetBeans you can either:

- Follow the directions here

- Use these steps:

  1. Add the jar file to NetBeans as an external library by choosing 'Tools' on the menu bar and then selecting 'Library Manager'.
  2. Go to the project. Right click on the project you need to use Lucene for. Select 'Properties'.
  3. In the dialogue box, select 'Libraries' and then select the 'Add Jar/Folder' option.
  4. Navigate to the directory which was created from lucene-[version].tar.gz. Select lucene-core-[version].jar.
  5. Click 'OK' in the dialogue box. The jar file has now been added to your project.
