I am trying to match the text "Config migration from ASA5505 8.2 to ASA5516" in the TITLE column.
My program looks like this:
Directory directory = FSDirectory.open(indexDir);
MultiFieldQueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_35,new String[] {"TITLE"}, new StandardAnalyzer(Version.LUCENE_35));
IndexReader reader = IndexReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
queryParser.setPhraseSlop(0);
queryParser.setLowercaseExpandedTerms(true);
Query query = queryParser.parse("TITLE:Config migration from ASA5505 8.2 to ASA5516");
System.out.println(query); // print the parsed query
TopDocs topDocs = searcher.search(query,100);
System.out.println(topDocs.totalHits);
ScoreDoc[] hits = topDocs.scoreDocs;
System.out.println(hits.length + " Record(s) Found");
for (int i = 0; i < hits.length; i++) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println("\"Title :\" " + d.get("TITLE"));
}
But it returns:
"Title :" Config migration from ASA5505 8.2 to ASA5516
"Title :" Firewall migration from ASA5585 to ASA5555
"Title :" Firewall migration from ASA5585 to ASA5555
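The last two results are unexpected. What changes are needed so that only the exact text "Config migration from ASA5505 8.2 to ASA5516" is matched?
My indexing code looks like this: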
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Date;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Lucene {
    public static final String INDEX_DIR = "./Lucene";
    private static final String JDBC_DRIVER = "oracle.jdbc.OracleDriver";
    private static final String CONNECTION_URL = "jdbc:oracle:thin:xxxxxxx";
    private static final String USER_NAME = "localhost";
    private static final String PASSWORD = "localhost";
    private static final String QUERY = "select * from TITLE_TABLE";

    public static void main(String[] args) throws Exception {
        File indexDir = new File(INDEX_DIR);
        Lucene indexer = new Lucene();
        try {
            Date start = new Date();
            Class.forName(JDBC_DRIVER).newInstance();
            Connection conn = DriverManager.getConnection(CONNECTION_URL, USER_NAME, PASSWORD);
            // Note: the index is built with SimpleAnalyzer
            SimpleAnalyzer analyzer = new SimpleAnalyzer(Version.LUCENE_35);
            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_35, analyzer);
            IndexWriter indexWriter = new IndexWriter(FSDirectory.open(indexDir), indexWriterConfig);
            System.out.println("Indexing to directory '" + indexDir + "'...");
            int indexedDocumentCount = indexer.indexDocs(indexWriter, conn);
            indexWriter.close();
            System.out.println(indexedDocumentCount + " records have been indexed successfully");
            System.out.println("Total Time:" + (new Date().getTime() - start.getTime()) / (1000));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Adds one document per row of TITLE_TABLE and returns the number of rows indexed
    int indexDocs(IndexWriter writer, Connection conn) throws Exception {
        String sql = QUERY;
        Statement stmt = conn.createStatement();
        stmt.setFetchSize(100000);
        ResultSet rs = stmt.executeQuery(sql);
        int i = 0;
        while (rs.next()) {
            System.out.println("Adding Doc No:" + i);
            Document d = new Document();
            System.out.println(rs.getString("TITLE"));
            d.add(new Field("TITLE", rs.getString("TITLE"), Field.Store.YES, Field.Index.ANALYZED));
            d.add(new Field("NAME", rs.getString("NAME"), Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(d);
            i++;
        }
        return i;
    }
}
Answer 0 (score: 0)
Try PhraseQuery as follows:
BooleanQuery mainQuery = new BooleanQuery();
String searchTerm = "config migration from asa5505 8.2 to asa5516";
String[] strArray = searchTerm.split(" ");
for (int index = 0; index < strArray.length; index++) {
    PhraseQuery query1 = new PhraseQuery();
    query1.add(new Term("TITLE", strArray[index]));
    mainQuery.add(query1, BooleanClause.Occur.MUST);
}
Then execute mainQuery.
Take a look at this Stack Overflow thread; it may help you use PhraseQuery for exact searches.
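Answer 1 (score: 0)
PVR is right that a phrase query is probably the correct solution, but they missed how to use the PhraseQuery class. You are already using QueryParser, so just use the query parser syntax and enclose the search text in quotes: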
Query query = queryParser.parse("TITLE:\"Config migration from ASA5505 8.2 to ASA5516\"");
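To see why the quotes matter, here is a minimal, library-free sketch in plain Java. It is not Lucene's implementation: the class MatchModes, its methods, and the whitespace tokenizer are all hypothetical stand-ins, assumed only for illustration. It contrasts the parser's default handling of unquoted text (any term may match), an all-terms-required query, and a phrase with slop 0.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class MatchModes {
    // Naive tokenizer (an assumption for illustration): lowercase, split on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Unquoted query with the default OR operator: ANY shared term is enough.
    static boolean matchesAny(String title, String query) {
        List<String> doc = tokens(title);
        for (String term : tokens(query)) {
            if (doc.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Every term is required, but order and adjacency are ignored.
    static boolean matchesAll(String title, String query) {
        return tokens(title).containsAll(tokens(query));
    }

    // Quoted query with slop 0: the terms must appear consecutively, in order.
    static boolean matchesPhrase(String title, String query) {
        return Collections.indexOfSubList(tokens(title), tokens(query)) >= 0;
    }

    public static void main(String[] args) {
        String query = "Config migration from ASA5505 8.2 to ASA5516";
        String[] titles = {
            "Config migration from ASA5505 8.2 to ASA5516",
            "Firewall migration from ASA5585 to ASA5555",
            "ASA5516 to 8.2 ASA5505 from migration Config" // made-up reordered title
        };
        for (String title : titles) {
            System.out.println(title
                + " | any=" + matchesAny(title, query)
                + " all=" + matchesAll(title, query)
                + " phrase=" + matchesPhrase(title, query));
        }
    }
}
```

In this sketch only the first title passes the phrase check. The Firewall title still passes the any-term check because it shares migration, from, and to with the query, which mirrors the unexpected hits in the question. The reordered title passes the all-terms check but not the phrase check.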
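Based on your update, you are using different analyzers at index time and at query time. SimpleAnalyzer and StandardAnalyzer do not do the same thing: SimpleAnalyzer splits on every non-letter character, so ASA5505 is indexed as just asa, while StandardAnalyzer keeps asa5505 as a single token. Unless you have a very good reason not to, you should analyze the same way when indexing and when querying.
So change the analyzer in your indexing code to StandardAnalyzer and rebuild the index (or, conversely, use SimpleAnalyzer at query time), and you should see better results.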
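The mismatch can be illustrated with a rough, library-free approximation in plain Java. This is a sketch under stated assumptions, not the real analyzers: AnalyzerSketch, letterTokens, and wordTokens are hypothetical names, letterTokens imitates SimpleAnalyzer's "lowercase runs of letters" behavior, and wordTokens only approximates StandardAnalyzer for simple whitespace-separated input like this title, using a small subset of the English stop words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnalyzerSketch {
    // A small subset of English stop words, assumed for illustration only.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    // SimpleAnalyzer-like: emit lowercase runs of letters, drop everything else.
    static List<String> letterTokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    // StandardAnalyzer-like approximation: split on whitespace, lowercase, drop stop words.
    static List<String> wordTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String title = "Config migration from ASA5505 8.2 to ASA5516";
        // prints [config, migration, from, asa, to, asa]
        System.out.println(letterTokens(title));
        // prints [config, migration, from, asa5505, 8.2, asa5516]
        System.out.println(wordTokens(title));
    }
}
```

Under these assumptions, a query term such as asa5505 never appears in an index built from letter-only tokens, so a phrase query analyzed one way cannot line up with titles indexed the other way.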
Answer 2 (score: 0)
Here is what I wrote for you; it works perfectly:
1. Create the index
public static void main(String[] args)
{
    IndexWriter writer = getIndexWriter();
    // Three sample documents with the titles from the question
    Document doc = new Document();
    Document doc1 = new Document();
    Document doc2 = new Document();
    doc.add(new Field("TITLE", "Config migration from ASA5505 8.2 to ASA5516", Field.Store.YES, Field.Index.ANALYZED));
    doc1.add(new Field("TITLE", "Firewall migration from ASA5585 to ASA5555", Field.Store.YES, Field.Index.ANALYZED));
    doc2.add(new Field("TITLE", "Firewall migration from ASA5585 to ASA5555", Field.Store.YES, Field.Index.ANALYZED));
    try
    {
        writer.addDocument(doc);
        writer.addDocument(doc1);
        writer.addDocument(doc2);
        writer.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

// Opens (creating the directory if needed) an IndexWriter that uses StandardAnalyzer
public static IndexWriter getIndexWriter()
{
    IndexWriter indexWriter = null;
    try
    {
        File file = new File("D://index//");
        if (!file.exists())
            file.mkdir();
        IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_34, new StandardAnalyzer(Version.LUCENE_34));
        Directory directory = FSDirectory.open(file);
        indexWriter = new IndexWriter(directory, conf);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return indexWriter;
}
2. Search string
queryParser.parse("\"Config migration from ASA5505 8.2 to ASA5516\"");