I want a method that stores and returns Lucene documents so that I can use them in another application.
My class file has two methods:

1. resultSet, which returns an array of Document objects for the search results. It uses the following code:
public Document[] resultSet() throws IOException, Exception {
    /********** MAJOR PROCESSING: CALL THE WRITER AND SEARCHER **********/
    TopDocs hits = null;
    // Array of documents to return; sized once we know how many hits we have
    Document[] resultSet = {};
    // Before we run the searcher, check the content of docPath
    if (docPath != null && docPath.length() > 4) {
        // Before we run the index writer, check the content of indexPath
        // (the null check must come first, or a null path throws an NPE)
        if (indexPath != null && indexPath.length() > 4) {
            // Check whether we have an instruction to index or not
            if (nio == 1) {
                IndexFiles indexFile = new IndexFiles(indexPath, docPath, xfields, create);
                // Collect the index-run parameters for the process logger
                indexStart = indexFile.start; // index start date
                indexEnd = indexFile.end;     // index end date
                message = indexFile.message;  // message log
                // Close the indexer (this also releases the write lock)
                indexFile.close();
            } // End of index option check
        }
        // Instantiate the search class
        searchStart = new Date(); // search start date
        SearchFiles searches = new SearchFiles(indexPath, toParam);
        searchEnd = new Date();   // search end date
        BufferedReader in = null;
        boolean checkQ = false;
        // Check whether the query is a file
        File cfile = new File(queryX);
        if (cfile.isFile()) {
            // Process queryX as a file
            in = Files.newBufferedReader(Paths.get(queryX), StandardCharsets.UTF_8);
            checkQ = true;
        }
        // Select the data we use as the query line
        String line = !checkQ ? queryX : in.readLine();
        if (line == null) {
            line = ""; // empty query file; avoid an NPE on trim()
        }
        line = line.trim();
        // Search the index
        hits = searches.search(line);
        totalHits = hits.totalHits;
        /********** PROCESS HITS INTO DOCUMENTS **********/
        ScoreDoc[] searched = searchFetched(hits);
        // Size the result array to the number of hits; the original zero-length
        // initializer would throw ArrayIndexOutOfBoundsException on assignment
        resultSet = new Document[searched.length];
        int increment = 0;
        for (ScoreDoc scoreDoc : searched) {
            resultSet[increment] = searches.getDocument(scoreDoc);
            increment++;
        } // End of loop
        // Close the searcher and the query reader
        searches.close();
        if (in != null) {
            in.close();
        }
    } // End of docPath check
    return resultSet;
} // End of method
2. searchFetched, which returns the ScoreDocs used by resultSet:
private ScoreDoc[] searchFetched(TopDocs hits) throws IOException, Exception {
    // Return the score docs held by the TopDocs
    return hits.scoreDocs;
}
This is my main method, where I try to display the returned documents stored in the array:
public static void main(String[] args) {
    /***** PROCESS THE METHODS IN THE CLASS *****/
    // Setting object variables
    String xFiles = "{indexDir:cores/core/testData/indexdir,docDir:cores/core/testData/datadir,nio:1}";
    String xParams = "{update:false,xfields:sender*receiver*subject,queryX:Job openings,[f>subject-h>10-m>100-n>0-r>true]}";
    // Construct the search handler
    SearchHandle handles = new SearchHandle(xFiles, xParams);
    // Now we can call the other methods in the search handler class
    try {
        // Fetch the data
        Document[] rows = handles.resultSet();
        System.out.println(Arrays.toString(rows));
        System.out.println(handles.totalHits);
        // Loop to display the search results
        for (Document row : rows) {
            System.out.println("File: " + row.get("path"));
        } // End of loop
    } catch (Exception e) {
        e.printStackTrace();
    }
}
I am no longer getting an error; the problem now is that I am not getting any hits even though the documents are indexed. I also found a writer.lock file in the index directory. What could be the cause of the zero hits?
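Before debugging the query side, it may help to confirm that documents actually made it into the index. This is a minimal sketch, assuming lucene-core on the classpath and the indexdir path from main (adjust the path to yours); it is not part of the question's code:

```java
import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.FSDirectory;

public class IndexCheck {
    public static void main(String[] args) throws Exception {
        // Open the index read-only and count the live documents
        try (DirectoryReader reader = DirectoryReader.open(
                FSDirectory.open(Paths.get("cores/core/testData/indexdir")))) {
            // If this prints 0, the problem is on the indexing side, not the query side
            System.out.println("Documents in index: " + reader.numDocs());
        }
    }
}
```

If the count is nonzero, the next thing to check is whether the fields you query were actually added to each document.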
Edit (with current results):

There are no more errors; my IndexFiles works and indexes the documents. The problem is that I get no hits when I search the indexed documents. This is my IndexFiles code:
package com.***.***.handlers.searchHandler;
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Arrays;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.*;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
/** Index all text files under a directory.
*
* This is a universal text index java application that can be used on Djade
* And other software related application
*/
public class IndexFiles {
// Creating public variables to use
public Date start;
public Date end;
public String message="";
private IndexWriter writer;
private static String docType;
// Now Construct the class
public IndexFiles(String indexPath, String xdocs, String xfields, boolean create) {
// Lets declare local variable
String docsPath="";
String xType="";
String xValues="";
/************ HERE WE PROCESS THE XDOCS STRING TO KNOW THE TYPE OF DATA **********/
String[] xArray=xdocs.split("@");
// Lets get count
int xCount=xArray.length;
// NOW LETS CHECK COUNT TO LOOP
if(xCount>1){ // Need both a type and a value; a bare count>0 check lets xArray[1] throw
// We the assign values to each and check
xType=xArray[0];
xValues=xArray[1];
// Now We assign file string to the docsPath
docsPath=xValues;
// Now we check Xtype value to assign type appropriately
if(xType.equals("as")){
// We set type to array-string file
docType="arrayFile";
}
else if(xType.equals("of")){
// We set type to normal file
docType="normalFile";
}
} // End of count check
final Path docDir = Paths.get(docsPath);
if (!Files.isReadable(docDir)) {
message+="Document directory '" +docDir.toAbsolutePath()+ "' does not exist or is not readable, please check the path \n";
System.exit(1);
}
start = new Date();
try {
message+="Indexing to directory '" + indexPath + "'... \n";
Directory dir = FSDirectory.open(Paths.get(indexPath));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
if (create) {
// Create a new index in the directory, removing any
// previously indexed documents:
iwc.setOpenMode(OpenMode.CREATE);
} else {
// Add new documents to an existing index:
iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
}
// Optional: for better indexing performance, if you
// are indexing many documents, increase the RAM
// buffer. But if you do this, increase the max heap
// size to the JVM (eg add -Xmx512m or -Xmx1g):
//
// iwc.setRAMBufferSizeMB(256.0);
writer = new IndexWriter(dir, iwc);
indexDocs(writer, docDir, xfields);
// NOTE: if you want to maximize search performance,
// you can optionally call forceMerge here. This can be
// a terribly costly operation, so generally it's only
// worth it when your index is relatively static (ie
// you're done adding documents to it):
//
// writer.forceMerge(1);
end = new Date();
message+=end.getTime() - start.getTime() + " total milliseconds \n";
} catch (IOException e) {
message+=" caught a " + e.getClass() +
"\n with message: " + e.getMessage()+" \n";
}
}
/** Index all text files under a directory. */
public void close() throws IOException{
writer.close();
}
/**
* Indexes the given file using the given writer, or if a directory is given,
* recurses over files and directories found under the given directory.
*
* NOTE: This method indexes one document per input file. This is slow. For good
* throughput, put multiple documents into your input file(s). An example of this is
* in the benchmark module, which can create "line doc" files, one document per line,
* using the
* <a href="../../../../../contrib-benchmark/org/apache/lucene/benchmark/byTask/tasks/WriteLineDocTask.html"
* >WriteLineDocTask</a>.
*
* @param writer Writer to the index where the given file/dir info will be stored
* @param path The file to index, or the directory to recurse into to find files to index
* @throws IOException If there is a low-level I/O error
*/
static void indexDocs(final IndexWriter writer, Path path, String fields) throws IOException {
if (Files.isDirectory(path)) {
Files.walkFileTree(path, new SimpleFileVisitor<Path>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
try {
indexDoc(writer, file, fields, attrs.lastModifiedTime().toMillis());
} catch (IOException ignore) {
// don't index files that can't be read.
}
return FileVisitResult.CONTINUE;
}
});
} else {
indexDoc(writer, path, fields, Files.getLastModifiedTime(path).toMillis());
}
}
/** Indexes a single document */
static void indexDoc(IndexWriter writer, Path file, String fields, long lastModified) throws IOException {
try (InputStream stream = Files.newInputStream(file)) {
// make a new, empty document
Document doc = new Document();
// Creating a string array
String[] contentArray = null;
String[] prefixArray = null;
// Array list variable
List<String> prefixList=new ArrayList<String>();
List<String> contentList=new ArrayList<String>();
// Other variable parts
String[] fieldArray;
String[] fieldValidType={"pdf", "xml", "html"};
String data="";
BufferedReader fin = null;
String fLine="";
// Checking if field is string of a file
File field=new File(fields);
String meta="";
String metaType="";
String typeVal="";
String[] metaData;
String[] typeSplit;
String ffields="";
// Add the path of the file as a field named "path". Use a
// field that is indexed (i.e. searchable), but don't tokenize
// the field into separate words and don't index term frequency
// or positional information:
Field pathField = new StringField("path", file.toString(), Field.Store.YES);
doc.add(pathField);
// Add the last modified date of the file a field named "modified".
// Use a LongPoint that is indexed (i.e. efficiently filterable with
// PointRangeQuery). This indexes to milli-second resolution, which
// is often too fine. You could instead create a number based on
// year/month/day/hour/minutes/seconds, down the resolution you require.
// For example the long value 2011021714 would mean
// February 17, 2011, 2-3 PM. System.out.println(lastModified);
doc.add(new LongPoint("modified", lastModified));
// Add the contents of the file to a field named "contents". Specify a Reader,
// so that the text of the file is tokenized and indexed, but not stored.
// Note that FileReader expects the file to be in UTF-8 encoding.
// If that's not the case searching for special characters will fail.
// WE READ AND STORE FILE IN DATA BEFORE STORING
BufferedReader br=new BufferedReader(new InputStreamReader(stream));
String strLine;
String contentData="";
// Now lets loop
while((strLine=br.readLine())!=null){
// Now lets now
contentData+="\n"+strLine;
}
// Now lets read line of content
doc.add(new TextField("contents", contentData, Field.Store.YES));
/************ HERE WE TRY TO ADD A UNIQUE FIELDS SENT THROUGH THE XFIELD IF XFIELD IS
* NOT NULL AND WE MAKE IT ALL A TEXTFIELD FIELD TYPE
*/
if(fields!=null){
// THEN WE ARE TO CREATE DYNAMIC FIELDS
// NOTE: the InputStream was already consumed into contentData above, so
// wrapping it in a second reader would read nothing (and create no dynamic
// fields); re-read the buffered content string instead
BufferedReader fileData=new BufferedReader(new java.io.StringReader(contentData));
// LETS CHECK THE DOCTYPE VARIABLE
if("arrayFile".equals(docType)){ // constant first also guards against a null docType
/******** We process as an array file to add fields ******/
// Now lets try to convert file data to array again
while((data=fileData.readLine())!=null){
// NOW LETS READ FILE DATA TO CONVERT TO ARRAY
Pattern pat = Pattern.compile("([^<]+)?(<as:(.*?)s>)?");
// Calling the matcher
Matcher m = pat.matcher(data);
while (m.find()) {
String contents = m.group(1);
String prefix = m.group(3);
if (prefix != null) { prefixList.add(prefix); }
if (contents != null) { contentList.add(contents); }
} // End of while loop
/********* NOW LETS COMPOSE INTO AN ARRAY ***************/
contentArray=new String[contentList.size()];
prefixArray=new String[prefixList.size()];
// Now lets compose to array
contentArray=contentList.toArray(contentArray);
prefixArray=prefixList.toArray(prefixArray);
} // End of while loop
/************ NOW WE CAN CREATE DYNAMIC FIELDS *************/
// Checking
if(field.isFile()){
// We read the field file to get all the fields
fin=Files.newBufferedReader(Paths.get(fields), StandardCharsets.UTF_8);
// Now lets get file data line by line
fLine=fin.readLine();
/******* Now we can process the field data *****/
fieldArray=fLine.split(";");
// Lets check count
if(fieldArray.length>1){ // need both the meta part and the fields part
// We keep processing
meta=fieldArray[0];
ffields=fieldArray[1];
// Now lets validate the field data file
// We get the meta type
metaData=meta.split(",");
// Now lets get type
metaType=metaData[1];
// Now lets get the type value fieldValidType
typeSplit=metaType.split("-");
// NOW LETS CHECK IF TYPE IS IN ARRAY
typeVal=typeSplit[1];
/********* Now lets check if type exists in array **********/
if(Arrays.asList(fieldValidType).contains(typeVal)){
// ARRAY CONTAINS TYPE SO LETS PROCEED
String[] fieldsData=ffields.split(":");
// We further split fields data by comma
String fDatas=fieldsData[1];
// Further split
String[] fd=fDatas.split(",");
/***** Lets loop field array create the fields ******/
if(fd.length>0){
for(int i=0; i<fd.length; i++){ // '<' not '<=': the original bound ran past the last index
/*********** We do a bit inner loop to check if field matches *********/
for(String prefix:prefixArray){
// Now lets check before we create
if(fd[i].equals(prefix)){ // compare string contents; '==' compares references
// NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
Field dynamicField = new TextField(fd[i], contentArray[i], Field.Store.YES);
doc.add(dynamicField);
} // End of if
} // End of foreach loop
} // End of loop
} // End of check
}
else{
// WHEN TYPE DOESNT EXIST WE LOG MESSAGE
// Just do nothing here
}
}
}
else{
// We assume that field is a string so we process as a string
// WE PROCESS FIELD STRING TO GET VALUES
int fieldIndex=fields.indexOf("*"); // Setting index value
if(fieldIndex>=0){
// Now lets split
fieldArray=fields.split("\\*");
// Lets check count and loop
if(fieldArray.length>0){
// We loop individual fields
for(int i=0; i<fieldArray.length; i++){ // '<' not '<=': the original bound ran past the last index
// Now lets further process
/*********** We do a bit inner loop to check if field matches *********/
for(String prefix:prefixArray){
// Now lets check before we create
if(fieldArray[i].equals(prefix)){ // compare string contents; '==' compares references
// NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
Field dynamicField = new TextField(fieldArray[i], contentArray[i], Field.Store.YES);
doc.add(dynamicField);
} // End of if
} // End of foreach loop
} // End of for loop
} // End of count check
}
else{
// Setting a counter
int counter=0;
// We handle the values straight without loop
for(String prefix:prefixArray){
// Now lets check before we create
if(fields.equals(prefix)){ // compare string contents; '==' compares references
// NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
Field dynamicField = new TextField(fields, contentArray[counter], Field.Store.YES);
doc.add(dynamicField);
counter++; // Increment counter
} // End of if
} // End of foreach loop
}
}
}
else if("normalFile".equals(docType)){
/******** We process as a normal file to add fields ******/
// WE PROCESS FILE TO GET EACH LINES
// Now lets try to convert file data to array again
while((data=fileData.readLine())!=null){
// We check if there there
fieldAdder(data, doc, fields);
} // end of while loop
}
}
if (writer.getConfig().getOpenMode() == OpenMode.CREATE) {
// New index, so we just add the document (no old document can be there):
// System.out.println("adding " + file);
writer.addDocument(doc);
} else {
// Existing index (an old copy of this document may have been indexed) so
// we use updateDocument instead to replace the old one matching the exact
// path, if present:
// System.out.println("updating " + file);
writer.updateDocument(new Term("path", file.toString()), doc);
}
}
}
/** CREATING A METHOD FOR CREATING DYNAMIC FIELDS **/
private static void fieldAdder(String fileContent, Document doc, String fields){
/************* CREATING VARIABLES FOR THIS METHOD *******************/
try{
// Other variable parts
String[] fieldArray;
String[] fieldValidType={"pdf", "xml", "html"};
BufferedReader fin = null;
String fLine="";
// Checking if field is string of a file
File field=new File(fields);
String meta="";
String metaType="";
String typeVal="";
String[] metaData;
String[] typeSplit;
String ffields="";
int indexOnContent=0;
// Checking
if(field.isFile()){
// We read the field file to get all the fields
fin=Files.newBufferedReader(Paths.get(fields), StandardCharsets.UTF_8);
// Now lets get file data line by line
fLine=fin.readLine();
/******* Now we can process the field data *****/
fieldArray=fLine.split(";");
// Lets check count
if(fieldArray.length>1){ // need both the meta part and the fields part
// We keep processing
meta=fieldArray[0];
ffields=fieldArray[1];
// Now lets validate the field data file
// We get the meta type
metaData=meta.split(",");
// Now lets get type
metaType=metaData[1];
// Now lets get the type value fieldValidType
typeSplit=metaType.split("-");
// NOW LETS CHECK IF TYPE IS IN ARRAY
typeVal=typeSplit[1];
/********* Now lets check if type exists in array **********/
if(Arrays.asList(fieldValidType).contains(typeVal)){
// ARRAY CONTAINS TYPE SO LETS PROCEED
String[] fieldsData=ffields.split(":");
// We further split fields data by comma
String fDatas=fieldsData[1];
// Further split
String[] fd=fDatas.split(",");
/***** Lets loop field array create the fields ******/
if(fd.length>0){
for(int i=0; i<fd.length; i++){ // '<' not '<=': the original bound ran past the last index
/*********** Check if index exist *********/
indexOnContent=fileContent.indexOf(fd[i]);
// Now lets check before we create
if(indexOnContent>=0){ // indexOf returns 0 for a match at the start of the line
// NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
Field dynamicField = new TextField(fd[i], fileContent, Field.Store.YES);
doc.add(dynamicField);
} // End of if
} // End of loop
} // End of check
}
else{
// WHEN TYPE DOESNT EXIST WE LOG MESSAGE
// Just do nothing here
}
}
}
else{
// We assume that field is a string so we process as a string
// WE PROCESS FIELD STRING TO GET VALUES
int fieldIndex=fields.indexOf("*"); // Setting index value
if(fieldIndex>=0){
// Now lets split
fieldArray=fields.split("\\*");
// Lets check count and loop
if(fieldArray.length>0){
// We loop individual fields
for(int i=0; i<fieldArray.length; i++){
// Now lets further process
/*********** Check if index exist *********/
indexOnContent=fileContent.indexOf(fieldArray[i]);
// Now lets check before we create
if(indexOnContent>=0){ // We create appropriately
// NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
Field dynamicField = new TextField(fieldArray[i], fileContent, Field.Store.YES);
doc.add(dynamicField);
} // End of if
} // End of for loop
} // End of count check
}
else{
// We handle the values straight without loop
indexOnContent=fileContent.indexOf(fields);
// Now lets check before we create
if(indexOnContent>=0){ // indexOf returns 0 for a match at the start of the line
// NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
Field dynamicField = new TextField(fields, fileContent, Field.Store.YES);
doc.add(dynamicField);
} // End of if
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// END OF METHOD
}
// END OF CLASS
}
After indexing, I found a writer.lock file in the index directory. I don't know whether it is the cause of the problem. Everything else seems fine; I just don't know what is causing the zero hits.
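For what it's worth, a leftover writer.lock usually just means an IndexWriter was not closed cleanly; Lucene removes the lock file when close() succeeds. A minimal sketch of the indexing step using try-with-resources, loosely based on the constructor above (indexPath, docDir, xfields, and indexDocs are taken from the question's code; this is not a drop-in replacement):

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// ... inside the indexing method:
Directory dir = FSDirectory.open(Paths.get(indexPath));
IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
try (IndexWriter writer = new IndexWriter(dir, iwc)) {
    indexDocs(writer, docDir, xfields); // same helper as in the question
    writer.commit(); // flush segments so a newly opened reader can see the documents
} // close() runs here even on exceptions, releasing writer.lock
```

With the current design, close() only happens when the caller remembers to invoke IndexFiles.close(); if an exception is thrown first, the lock stays behind.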
Answer 0 (score: 1):

Look at the line that throws the NullPointerException; you are the only one who has the complete code and the line numbers. In searchFetched, clone() the array or copy it with Arrays.copyOf(T[], int) before returning it.
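The Arrays.copyOf suggestion can be shown with plain JDK types (String stands in for ScoreDoc here so the sketch runs without Lucene):

```java
import java.util.Arrays;

public class CopyDemo {
    public static void main(String[] args) {
        String[] hits = {"doc1", "doc2", "doc3"};
        // Defensive copy: callers can modify the copy without touching the original
        String[] copy = Arrays.copyOf(hits, hits.length);
        copy[0] = "changed";
        System.out.println(hits[0]);               // doc1
        System.out.println(Arrays.toString(copy)); // [changed, doc2, doc3]
    }
}
```

hits.clone() would behave the same way here; both make a shallow copy of the array.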