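The search method of Lucene's IndexSearcher returns no results: the number of documents returned for a query is always 0. I built the index with the following code: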
void buildIndex(File indexDir, File trainDir, HashMap<String,Integer> dictionary)
        throws IOException, FileNotFoundException {
    Directory fsDir = FSDirectory.open(indexDir);
    IndexWriterConfig iwConf
        = new IndexWriterConfig(VERSION, mAnalyzer);
    iwConf.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
    IndexWriter indexWriter
        = new IndexWriter(fsDir, iwConf);
    File file = trainDir;
    String csvFilename = "/home/serene/Downloads/IndustryClassification/Train/Training.csv";
    CSVReader csvReader = new CSVReader(new FileReader(csvFilename), '\t');
    String[] row = null;
    while ((row = csvReader.readNext()) != null) {
        Document d = new Document();
        String companyname = row[1];
        String NAICSID = row[2];
        //System.out.println(NAICSID);
        String description = row[4];
        d.add(new TextField("company", companyname, Store.YES));
        d.add(new StringField("category", NAICSID, Store.YES));
        dictionary.put(NAICSID, 1);
        d.add(new TextField("description", description, Store.NO));
        //System.out.println(d.toString());
        indexWriter.addDocument(d);
    }
    csvReader.close();
    int numDocs = indexWriter.numDocs();
    indexWriter.forceMerge(1);
    indexWriter.commit();
    indexWriter.close();
    System.out.println("index=" + indexDir.getName());
    System.out.println("num docs=" + numDocs);
}
When I run test queries with the following code, I get no category output at all: scoreDocs.length is always 0, so the body of the for loop never executes.
void testIndex(File indexDir, File testDir, Set<String> NEWSGROUPS)
        throws IOException, FileNotFoundException, ParseException {
    Directory fsDir = FSDirectory.open(indexDir);
    DirectoryReader reader = DirectoryReader.open(fsDir);
    IndexSearcher searcher = new IndexSearcher(reader);
    Analyzer analyzer = new StandardAnalyzer(VERSION);
    System.out.print("inside testIndex");
    int[][] confusionMatrix
        = new int[NEWSGROUPS.size()][NEWSGROUPS.size()];
    String csvFilename = "/home/serene/Downloads/IndustryClassification/Test/Test.csv";
    CSVReader csvReader = new CSVReader(new FileReader(csvFilename), '\t');
    String[] row = null;
    while ((row = csvReader.readNext()) != null) {
        String companyname = row[1];
        String NAICSID = row[2];
        String description = row[4];
        Query query = new QueryParser(Version.LUCENE_44, "contents", analyzer).parse(QueryParser.escape(description));
        System.out.print(query + "\n");
        TopDocs hits = searcher.search(query, 3);
        ScoreDoc[] scoreDocs = hits.scoreDocs;
        System.out.println(hits.totalHits);
        for (int n = 0; n < scoreDocs.length; ++n) {
            ScoreDoc sd = scoreDocs[n];
            int docId = sd.doc;
            Document d = searcher.doc(docId);
            String category = d.get("category");
            System.out.println(category);
        }
    }
    csvReader.close();
}
Answer 0 (score: 0)
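Replace "contents" with one of the fields you actually indexed (company, ...). Your documents only have the fields "company", "category", and "description", so a QueryParser whose default field is "contents" searches a field that does not exist in the index and matches nothing. Since the query text comes from the description column, "description" is probably the field you want.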
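To see why the field name matters, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class FieldScopedIndex and its add and search methods are hypothetical names made up for illustration, and the sample documents are invented. It only mimics the idea that terms are stored per field, so a lookup in a field that was never indexed finds nothing, no matter what the text is.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of a per-field inverted index (hypothetical, not part of Lucene).
public class FieldScopedIndex {
    // field name -> term -> ids of the documents containing that term in that field
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

    // Split the text into lowercase words and record them under the given field.
    public void add(int docId, String field, String text) {
        Map<String, Set<Integer>> terms =
            postings.computeIfAbsent(field, f -> new HashMap<>());
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                terms.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Look up a single term in one field; an unknown field yields an empty set.
    public Set<Integer> search(String field, String term) {
        return postings
            .getOrDefault(field, Collections.emptyMap())
            .getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        FieldScopedIndex index = new FieldScopedIndex();
        index.add(0, "company", "Acme Steel");
        index.add(0, "description", "Produces rolled steel sheets");
        index.add(1, "description", "Develops accounting software");

        // "contents" was never indexed, so there is nothing to match.
        System.out.println(index.search("contents", "steel").size()); // 0
        // The same term in an indexed field finds document 0.
        System.out.println(index.search("description", "steel"));     // [0]
    }
}
```

In the question's code the equivalent change would be to pass "description" instead of "contents" as the second argument of the QueryParser constructor.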