How to rank documents using TFIDFSimilarity in Lucene

Time: 2016-08-27 14:30:50

Tags: java lucene

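In some basic code that builds an index and runs a search query, I want to rank the retrieved documents with TFIDFSimilarity. However, I get the error "Cannot instantiate the type TFIDFSimilarity". My code is as follows: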

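import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.similarities.TFIDFSimilarity;
import org.apache.lucene.store.RAMDirectory;

public class TFIDF_T {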

private static Document createDocument(String id, String tb) {
    Document doc = new Document();
    doc.add(new Field("id", id, TextField.TYPE_STORED));
    doc.add(new Field("tb", tb, TextField.TYPE_STORED));
    return doc;
}

private static void search(IndexSearcher searcher, String queryString, Analyzer analyzer) throws IOException, ParseException {
    QueryParser parser = new QueryParser("tb", analyzer);
    Query query = parser.parse(queryString);
    ScoreDoc[] hits = searcher.search(query, 20).scoreDocs;
    int hitCount = hits.length;
    for (int i = 0; i < hitCount; i++) {
        Document doc = searcher.doc(hits[i].doc);
        System.out.println(doc.get("id"));
    }
}

public static void run(String path) throws IOException, ParseException {
    Analyzer analyzer = new StandardAnalyzer();
    RAMDirectory directory = new RAMDirectory();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter iwriter = new IndexWriter(directory, config);
    // Index a single sample document.
    String id = "101";
    String tb = "How to rank documents using tfidfsimilairty";
    iwriter.addDocument(createDocument(id, tb));
    iwriter.close();
    DirectoryReader ireader = DirectoryReader.open(directory);
    IndexSearcher isearcher = new IndexSearcher(ireader);
    // The reported compile error occurs on the next line:
    // "Cannot instantiate the type TFIDFSimilarity"
    isearcher.setSimilarity(new TFIDFSimilarity());
    String tb1 = "How to rank documents using tfidfsimilairty";
    search(isearcher, tb1, analyzer);
    ireader.close();
    directory.close();
}
}

1 answer:

Answer 0 (score: 1)

TFIDFSimilarity is an abstract class, so you cannot instantiate it.

ClassicSimilarity is a concrete implementation of it (assuming you are on Lucene 5.4 or later; on earlier versions it is DefaultSimilarity). Use that instead.
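
To illustrate why the compiler rejects the original call and what a concrete subclass supplies, here is a minimal plain-Java sketch with no Lucene dependency. The names TfIdfBase, ClassicLike and SimilarityDemo are hypothetical stand-ins, not Lucene classes. The formulas (tf as the square root of the term frequency, idf as 1 + ln((docCount + 1) / (docFreq + 1))) are assumed to mirror what ClassicSimilarity does in recent Lucene versions, and the score is deliberately simplified (no norms or boosts), so check the Javadoc for your version before relying on the details.

```java
// Hypothetical stand-in for an abstract TF-IDF base class: it fixes how the
// factors are combined but leaves tf and idf to subclasses.
abstract class TfIdfBase {
    abstract double tf(double freq);
    abstract double idf(long docFreq, long docCount);

    // Simplified score for one term in one document (assumption: no norms, no boosts).
    double score(double freq, long docFreq, long docCount) {
        return tf(freq) * idf(docFreq, docCount);
    }
}

// Hypothetical concrete subclass, playing the role ClassicSimilarity plays in Lucene.
class ClassicLike extends TfIdfBase {
    @Override
    double tf(double freq) {
        return Math.sqrt(freq);
    }

    @Override
    double idf(long docFreq, long docCount) {
        return 1.0 + Math.log((docCount + 1) / (double) (docFreq + 1));
    }
}

class SimilarityDemo {
    public static void main(String[] args) {
        // TfIdfBase sim = new TfIdfBase(); // does not compile: the class is abstract
        TfIdfBase sim = new ClassicLike();  // a concrete subclass can be instantiated
        // Term occurs 4 times in the only document of a 1-document index.
        System.out.println(sim.score(4, 1, 1)); // prints 2.0
    }
}
```

In the question's code, the analogous change would be isearcher.setSimilarity(new ClassicSimilarity()); with an import of org.apache.lucene.search.similarities.ClassicSimilarity. It is probably also worth calling config.setSimilarity(new ClassicSimilarity()) on the IndexWriterConfig so that index-time norms are computed with the same similarity used at search time.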