How do I search for an exact phrase using the Lucene API?

Time: 2014-04-17 05:37:53

Tags: java lucene

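Search term entered: Adil Shahi dynasty

  1. Adil Shahi dynasty
  2. Qutb Shahi dynasty
  3. Gohar Shahi templates

When I enter "Adil Shahi dynasty", it returns many results. I am using the Lucene API and want to match content against the exact phrase only. Code, for creating the index: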

    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
        PhraseQuery query = new PhraseQuery();
        Directory index = FSDirectory.open(new File("/ttlfiles/indexes/category_labels_en"));
        BufferedReader br = new BufferedReader(
                new InputStreamReader(System.in));
        String querystr = br.readLine();
        while (!querystr.equals("q")) {
            Query q = new QueryParser(Version.LUCENE_47, "spa", analyzer).parse(querystr);

            // 3. search
            int hitsPerPage = 10;
            IndexReader reader = DirectoryReader.open(index);
            IndexSearcher searcher = new IndexSearcher(reader);
            TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
            searcher.search(q, collector);
            ScoreDoc[] hits = collector.topDocs().scoreDocs;

            // 4. display results
            System.out.println("Found " + hits.length + " hits.");
            for (int i = 0; i < hits.length; ++i) {
                int docId = hits[i].doc;
                Document d = searcher.doc(docId);
                System.out.println((i + 1) + ". " + d.get("spa"));
            } // end of for loop
            querystr = br.readLine();
        } // end of while loop
    }
    

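3 Answers:

Answer 0 (score: 2)

@Gimby: The user may have picked the wrong code for searching content with Lucene. You have to create a Lucene index first; only then can you search the content.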
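
To make the index-first requirement concrete, here is a minimal sketch of a toy inverted index in plain Java. It is not the Lucene API: the class `ToyIndexSketch` and its methods `add` and `searchAnyTerm` are hypothetical names made up for illustration, and the tokenizer only roughly approximates what an analyzer such as `StandardAnalyzer` does. It shows that searching before anything is indexed finds nothing, and that matching on any single term returns all three sample labels from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy inverted index for illustration only; hypothetical, not Lucene. */
final class ToyIndexSketch {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new TreeMap<>();
    // document id -> original text
    private final List<String> stored = new ArrayList<>();

    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexing step: has to run before a search can return anything.
    void add(String text) {
        int docId = stored.size();
        stored.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(docId);
        }
    }

    // OR search: a document matches if it contains at least one query term.
    List<String> searchAnyTerm(String query) {
        Set<Integer> ids = new TreeSet<>();
        for (String term : tokenize(query)) {
            ids.addAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        List<String> hits = new ArrayList<>();
        for (int id : ids) {
            hits.add(stored.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndexSketch index = new ToyIndexSketch();
        // Nothing has been indexed yet, so this prints an empty list.
        System.out.println(index.searchAnyTerm("Adil Shahi dynasty"));

        index.add("Adil Shahi dynasty");
        index.add("Qutb Shahi dynasty");
        index.add("Gohar Shahi templates");
        // Every label shares the term "shahi", so all three are returned.
        System.out.println(index.searchAnyTerm("Adil Shahi dynasty"));
    }
}
```

In Lucene itself the indexing step is normally done with an `IndexWriter` before an `IndexSearcher` is opened on the same directory.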

Answer 1 (score: 2)

Here is code you can use as a reference for searching the content:

    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
        Directory index = FSDirectory.open(new File("/media/New Volume/ttlindexes"));
        // Open the reader and searcher once instead of on every loop iteration.
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        BufferedReader br = new BufferedReader(
                new InputStreamReader(System.in));
        String querystr = br.readLine();
        // Stop on "q" or at end of input (readLine() returns null).
        while (querystr != null && !querystr.equals("q")) {
            QueryParser parser = new QueryParser(Version.LUCENE_47, "spo", analyzer);
            parser.setDefaultOperator(QueryParser.Operator.OR);
            //parser.setPhraseSlop(0);
            // Treat the whole input as one phrase on field "spo"
            // instead of parsing it into separate terms.
            Query query = parser.createPhraseQuery("spo", querystr);
            //Query q = new QueryParser(Version.LUCENE_47, "spa", analyzer).parse(querystr);
            if (query == null) {
                // The analyzer produced no terms (for example, blank input).
                System.out.println("Found 0 hits.");
                querystr = br.readLine();
                continue;
            }

            // 3. search
            int hitsPerPage = 1000000;
            TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
            searcher.search(query, collector);
            ScoreDoc[] hits = collector.topDocs().scoreDocs;

            // 4. display results
            System.out.println("Found " + hits.length + " hits.");
            for (int i = 0; i < hits.length; ++i) {
                int docId = hits[i].doc;
                Document d = searcher.doc(docId);
                System.out.println((i + 1) + ". " + d.get("spo"));
            } // end of for loop
            querystr = br.readLine();
        } // end of while loop
        reader.close();
    }
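
To illustrate why a phrase query narrows the results, here is a minimal plain-Java sketch that contrasts "any term" matching with exact phrase matching (slop 0) on the three labels from the question. It does not use Lucene: `PhraseMatchSketch`, `containsAnyTerm` and `containsPhrase` are hypothetical names, and the simple lowercase tokenizer is an assumption standing in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Plain-Java illustration of OR matching versus phrase matching; not Lucene. */
final class PhraseMatchSketch {
    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // OR semantics: true if the text shares at least one token with the query.
    static boolean containsAnyTerm(String text, String query) {
        return !Collections.disjoint(tokenize(text), tokenize(query));
    }

    // Phrase semantics with slop 0: the query tokens must appear in the text
    // as one contiguous run, in the same order.
    static boolean containsPhrase(String text, String phrase) {
        List<String> docTokens = tokenize(text);
        List<String> phraseTokens = tokenize(phrase);
        return !phraseTokens.isEmpty()
                && Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList(
                "Adil Shahi dynasty", "Qutb Shahi dynasty", "Gohar Shahi templates");
        String input = "Adil Shahi dynasty";
        for (String label : labels) {
            // All three labels match on "any term"; only the first matches as a phrase.
            System.out.println(label
                    + " | any term: " + containsAnyTerm(label, input)
                    + " | phrase: " + containsPhrase(label, input));
        }
    }
}
```

With Lucene's classic `QueryParser`, wrapping the input in double quotes before calling `parse` is another way to request a phrase match.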

Answer 2 (score: 1)

@Aadil: Thanks for your guidance. I used it after changing how I index the DBpedia ttl files. You can download the Turtle (.ttl) files from this link: http://wiki.dbpedia.org/Downloads39
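
As a rough sketch of how a label might be pulled out of such a file before indexing, the plain-Java snippet below extracts the quoted literal from one triple per line. The line layout (`<subject> <predicate> "label"@en .`), the sample subject URI, and the names `TtlLabelSketch` and `extractLabel` are all assumptions for illustration; a real Turtle file can use syntax this regular expression does not handle, in which case an RDF parser is the safer choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the literal from a simple one-line triple. */
final class TtlLabelSketch {
    // Assumed layout: <subject> <predicate> "literal"@lang .
    private static final Pattern LABEL_TRIPLE = Pattern.compile(
            "^<[^>]*>\\s+<[^>]*>\\s+\"((?:[^\"\\\\]|\\\\.)*)\"(?:@[A-Za-z-]+)?\\s*\\.\\s*$");

    // Returns the raw literal (escape sequences are left as they are),
    // or an empty Optional if the line does not have the assumed layout.
    static Optional<String> extractLabel(String line) {
        Matcher m = LABEL_TRIPLE.matcher(line);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample line written for this sketch, not copied from a real download.
        String line = "<http://dbpedia.org/resource/Category:Adil_Shahi_dynasty> "
                + "<http://www.w3.org/2000/01/rdf-schema#label> "
                + "\"Adil Shahi dynasty\"@en .";
        System.out.println(extractLabel(line).orElse("(no label)"));
        // Comment lines do not match the assumed layout and are skipped.
        System.out.println(extractLabel("# a comment line").orElse("(no label)"));
    }
}
```

Each extracted label could then be added to the index as the stored text field that the search code reads back.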