Lucene - Highlighter throws an exception when using * in a search

Asked: 2014-03-31 16:39:40

Tags: lucene lucene-highlighter

I am using Lucene 4.6.1 and Highlighter 4.6.0. Since indexing works fine, I will only show my search code:

    // ... code to get all the fields' names/values, numDocs, etc.
    // ...
    // Create the query and search

    try {
        TopScoreDocCollector collector = TopScoreDocCollector.create(numDocs, true);
        Query q = MultiFieldQueryParser.parse(Version.LUCENE_40, searchTerms, fields, analyzer);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        Highlighter highlighter = new Highlighter(new QueryScorer(q));
        highlighter.setTextFragmenter(new SimpleFragmenter(40));
        int maxNumFragmentsRequired = 2;

        System.out.println("Found " + hits.length + " hits.");
        for(int i=0;i<hits.length;++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            for(int j=0; j<fields.length; j++) {
                if(d.get(fields[j]) != null) {
                    String fieldText = d.get(fields[j]).trim();
                    TokenStream tokenStream = analyzer.tokenStream(fields[j], new StringReader(fieldText));

                    // Create String without the highlighted term
                    String unhighlighted = (i + 1) + ". "+fields[j]+ " "+ d.get(fields[j]).trim() + "<br>";

                    // Create the highlighted term
                    String highlighted = highlighter.getBestFragments(tokenStream, fieldText, maxNumFragmentsRequired, "...");

                    // If the highlighted term really exists
                    if(!highlighted.equals("")) 
                        unhighlighted = (i + 1) + ". "+fields[j]+ " "+ highlighted + "<br>";

                    response += unhighlighted;
                }
            }
        }

    } catch (Exception e) {
        System.out.println("Error searching " + searchTerm + " : " + e.getMessage());
    }

    System.out.println(response);
}

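For example: my index contains many documents named "Process 001", "Process 002", "Process 003", and so on. If I search for: process, I retrieve all of the processes (this works perfectly!). The problem occurs when I search for: proc*, or: pr*, or something similar. Here is the error: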

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/queries/CommonTermsQuery
        at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:149)
        at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:99)
        at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:474)
        at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:217)
        at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:186)
        at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:197)
        at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:156)
        at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:460)
        at freedom.lucene.service.LuceneTestApplication.search(LuceneTestApplication.java:406)
        at freedom.lucene.service.LuceneTestApplication.main(LuceneTestApplication.java:75)
    Caused by: java.lang.ClassNotFoundException: org.apache.lucene.queries.CommonTermsQuery
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        ... 10 more

The exception occurs on this line:

    String highlighted = highlighter.getBestFragments(tokenStream, fieldText, maxNumFragmentsRequired, "...");

If I remove the highlighter code, searching with * works fine.

2 Answers:

Answer 0 (score: 1)

Add lucene-queries-4.6.1.jar to your classpath.

CommonTermsQuery is not included in the lucene-core jar.
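
One way to confirm this before changing the build is to check at startup whether the class named in the stack trace can be loaded. The following is only a minimal sketch, not part of the original answer: `ClasspathCheck` and `isOnClasspath` are hypothetical names introduced for illustration, and the two class names are taken from the stack trace in the question.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be found by this class's class loader.
     * The class is located but not initialized (second argument is false).
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the class itself is missing.
            // LinkageError (e.g. NoClassDefFoundError): one of its dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace in the question.
        String[] required = {
            "org.apache.lucene.search.highlight.Highlighter",
            "org.apache.lucene.queries.CommonTermsQuery"
        };
        for (String name : required) {
            System.out.println(name + " : " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If `CommonTermsQuery` is reported as missing, the lucene-queries jar is not on the runtime classpath. Note also that `NoClassDefFoundError` is an `Error`, not an `Exception`, so the `catch (Exception e)` block in the question does not catch it; that is why the trace propagates all the way out of `main`.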

Answer 1 (score: 0)

Add Relatedness to your external libraries.