Lucene suggestions (behavior of the SUGGEST_MORE_POPULAR flag)

Date: 2016-08-10 16:44:13

Tags: java lucene

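I want to use Lucene's suggestion mechanism to help the end user notice when he has mistyped something.

Lucene's SpellChecker has a method suggestSimilar that takes a SuggestMode flag. With SuggestMode.SUGGEST_MORE_POPULAR, I would expect it to suggest only words that occur more frequently in the current index than the word being checked.

The following code seems to contradict that assumption: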

import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.search.spell.SuggestMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;
import java.util.LinkedList;
import java.util.List;

public class SuggestTest {

    public static void main(String[] args) throws IOException {

        final String NAME_FIELD = "NAME";

        Directory directory = new RAMDirectory();
        IndexWriter writer = new IndexWriter(directory,
                new IndexWriterConfig(new SimpleAnalyzer()));
        writer.deleteAll();
        writer.commit();

        List<String> list = new LinkedList<>();

        for (int i = 0; i < 1000; i++)
            list.add("wafa");

        list.add("waffa");

        for (String name : list) {
            Document doc = new Document();
            doc.add(new TextField(NAME_FIELD, name, Field.Store.YES));
            writer.addDocument(doc);
        }

        writer.close();
        DirectoryReader directoryReader = DirectoryReader.open(directory);


        LuceneDictionary nameDictionary = new LuceneDictionary(directoryReader, NAME_FIELD);

        IndexWriterConfig config = new IndexWriterConfig(new SimpleAnalyzer());

        SpellChecker spellChecker = new SpellChecker(directory);
        spellChecker.indexDictionary(nameDictionary, config, true);

        for (String s : new String[]{"wafa", "waffa", "wala"}) {
            // Note: the IndexReader and field arguments are both null in this call.
            String[] suggestions = spellChecker.suggestSimilar(s, 10, null, null, SuggestMode.SUGGEST_MORE_POPULAR);
            System.out.println("Suggestions for " + s);
            for (String suggestion : suggestions)
                System.out.println(" -" + suggestion);
        }
    }
}

When I look up "wafa" (which occurs 1000 times in the index!), I would not expect the code above to suggest "waffa".
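
To make the expectation concrete, here is a minimal sketch in plain Java of the filtering I would expect from SUGGEST_MORE_POPULAR. It is not Lucene's implementation: the class MorePopularSketch, the helper morePopular, and the hard-coded frequency map are all hypothetical, and the sketch assumes that "more popular" means "at least as frequent as the queried word", which is how I read the SuggestMode Javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MorePopularSketch {

    // Hypothetical helper (not a Lucene API): keeps only candidates that
    // exist in the index and are at least as frequent as the queried word.
    public static List<String> morePopular(String query,
                                           List<String> candidates,
                                           Map<String, Integer> docFreq) {
        int queryFreq = docFreq.getOrDefault(query, 0);
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            int freq = docFreq.getOrDefault(candidate, 0);
            // Skip the word itself, unknown words, and less frequent words.
            if (candidate.equals(query) || freq < 1 || freq < queryFreq) {
                continue;
            }
            kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same frequencies as in the test program: 1000 x "wafa", 1 x "waffa".
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("wafa", 1000);
        docFreq.put("waffa", 1);

        // Assumed to be the similar-looking terms found in the index.
        List<String> candidates = Arrays.asList("wafa", "waffa");

        for (String s : new String[]{"wafa", "waffa", "wala"}) {
            System.out.println("Suggestions for " + s + ": "
                    + morePopular(s, candidates, docFreq));
        }
        // Suggestions for wafa: []
        // Suggestions for waffa: [wafa]
        // Suggestions for wala: [wafa, waffa]
    }
}
```

One thing I have not ruled out yet: the Javadoc for suggestSimilar notes that when the IndexReader or the field argument is null, the suggest mode is overridden with SuggestMode.SUGGEST_ALWAYS, and my call passes null for both. Passing directoryReader and NAME_FIELD instead may be what is needed for the frequency check to apply.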

0 answers:

No answers yet.