Lucene: no hits returned from the Lucene index even though the documents were indexed successfully

Asked: 2017-06-12 23:03:29

Tags: java eclipse lucene

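I want to store and return Lucene documents from a method so that I can use them in another application.

I have two methods in my class file:

  1. The resultSet method, which returns the search results as an array of Document objects, using the following code: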

public Document[] resultSet() throws IOException, Exception{
        /********** HERE WE DO MAJOR PROCESSING CALL OF THE WRITER AND SEARCHER ************/
        TopDocs hits = null;
        System.out.println("Am ahere");
        // We set array of the document we returned
        Document[] resultSet={};
        // PROCESSING THE SEARCH FILES
        // Before we process the index searcher we check
        // The content of the docPath
        if(docPath!=null && docPath.length()>4){
        // PROCESSING THE INDEX WRITER
        // Before we process the index writer we check
        // The content of the indexPath
        if(indexPath!=null && indexPath.length()>4){ // Null check first, then ensure it's a path or directory string
        // Lets check if we have instruction to index or not
        if(nio==1){
        IndexFiles indexFile=new IndexFiles(indexPath, docPath, xfields, create);
        // Here we get all Index File parameters and log it to our process logger method
        indexStart=indexFile.start; // index Start Date
        indexEnd=indexFile.end; // index End Date
        message=indexFile.message; // Message log
        // LETS CLOSE INDEXER
        indexFile.close();
        } // End of index option check
        }
        // NOW LETS CALL THE SEARCH FILES CLASS TO INSTANTIATE IT
        searchStart=new Date(); // Search Start Date
        SearchFiles searches=new SearchFiles(indexPath, toParam);
        searchEnd=new Date(); // Search End Date
        // BufferedReader
        BufferedReader in = null;
        boolean checkQ=false;
        // Lets check if query is a file
        File cfile=new File(queryX);
        // Now lets check
        if(cfile.isFile()){
        // We process queryX as a file
        in = Files.newBufferedReader(Paths.get(queryX), StandardCharsets.UTF_8);
        checkQ=true;
        }
        else{
        checkQ=false;
        }


        // Here we are going to select the data we use for line
        String line = checkQ != true ? queryX : in.readLine();
        // Now lets trim the line

        line = line.trim();

        // Now lets search the index
        hits=searches.search(line);
        // NOW LETS GET THE TOTAL HITS
        totalHits=hits.totalHits;

        /*************** WE TRY TO PROCESS HITS INTO DOCUMENTS ***************/
        ScoreDoc[] searched=searchFetched(hits);
        int increment=0;
        // Now we call the Document to get document
        for(ScoreDoc scoreDoc:searched){
        // Get document 
        Document doc=searches.getDocument(scoreDoc);
        // Now lets add to resultset
        resultSet[increment]=doc;
        increment++;
        } // End of loop

        // LETS CLOSE THE SEARCHER
        searches.close();
        // End of DocPath Check
        }

        // NOW LETS RETURN THE HITS
        return resultSet;

     // End of method     
    }
  2. The searchFetched method, which returns the ScoreDocs used by resultSet:

    private ScoreDoc[] searchFetched(TopDocs hits) throws IOException, Exception{
    // Lets set the array to hold our scores
    
    // NOW LETS RETURN SCORES
    return hits.scoreDocs;
    

    }

  3. This is my main method, where I try to print the returned documents stored in the array:

    public static void main(String[] args){
                /***** HERE WE PROCESS THE METHODS IN THE CLASS ********/
                // Setting Object Variables
                String xFiles="{indexDir:cores/core/testData/indexdir,docDir:cores/core/testData/datadir,nio:1}";
                String xParams="{update:false,xfields:sender*receiver*subject,queryX:Job openings,[f>subject-h>10-m>100-n>0-r>true]}";
                // Setting new constructor of this method
                SearchHandle handles=new SearchHandle(xFiles, xParams);
                // Now we can call other methods in the Search handler class
                try {
                    // Now lets fetch data
                    Document[] rows=handles.resultSet();
                    System.out.println(Arrays.toString(rows));
                    System.out.println(handles.totalHits);
                    // Now we can loop to display the result of the searched
                    for(Document row:rows){
                        // Now we make use of scoreDoc
                        System.out.println("File: " +row.get("path"));
                    } // End of loop
                } catch (Exception e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
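
A side note on resultSet: the array is declared as `Document[] resultSet={}`, which has length zero, so `resultSet[increment]=doc` would throw ArrayIndexOutOfBoundsException as soon as a search returns at least one hit. Below is a minimal sketch of collecting results into a growable list instead. It uses plain strings in place of Lucene Document objects so that it runs without Lucene; the class name ResultCollector and the method collect are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ResultCollector {

    // Hypothetical stand-in for the loop in resultSet(): each "hit" is a
    // plain String here instead of a Lucene Document.
    static String[] collect(String[] hits) {
        List<String> rows = new ArrayList<>();
        for (String hit : hits) {
            rows.add(hit); // the list grows as needed, unlike a zero-length array
        }
        // Convert back to an array only once the final size is known.
        return rows.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] rows = collect(new String[] {"a.txt", "b.txt"});
        System.out.println(rows.length); // prints 2
    }
}
```

The same idea should carry over to `List<Document>` in the real method, or the array could be allocated with `new Document[searched.length]` once the hits are known.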
    

    I am no longer getting an error. The problem now is that I am not getting any hits even though the documents are indexed. I also found writer.lock in the index directory. What could be the cause of the zero hits?

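    Edit with current results: there are no more errors. My indexFile works and indexes the documents. The problem is that when I search the indexed documents, I get no hits. Here is my indexFile code: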

    package com.***.***.handlers.searchHandler;
    
    
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.FileVisitResult;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.SimpleFileVisitor;
    import java.nio.file.attribute.BasicFileAttributes;
    import java.util.Arrays;
    import java.util.Date;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import java.util.*;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.LongPoint;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    
    /** Index all text files under a directory.
     *
     * This is a universal text index java application that can be used on Djade 
     * And other software related application 
     */
    public class IndexFiles {
      // Creating public variables to use
        public Date start;
        public Date end;
        public String message="";
        private IndexWriter writer;
        private static String docType;
     // Now Construct the class 
      public IndexFiles(String indexPath, String xdocs, String xfields, boolean create) {
    
          // Lets declare local variable
          String docsPath="";
          String xType="";
          String xValues="";
          /************ HERE WE PROCESS THE XDOCS STRING TO KNOW THE TYPE OF DATA **********/
          String[] xArray=xdocs.split("@");
          // Lets get count
          int xCount=xArray.length;
          // NOW LETS CHECK COUNT TO LOOP
          if(xCount>0){
              // We the assign values to each and check
              xType=xArray[0];
              xValues=xArray[1];
              // Now We assign file string to the docsPath
              docsPath=xValues;
              // Now we check Xtype value to assign type appropriately
              if(xType.equals(new String("as"))){
                  // We set type to array String
                  docType="arrayFile";
              }
              else if(xType.equals(new String("of"))){
                 // We set type to normal file
                  docType="normalFile";
              }
          } // End of count check
    
            final Path docDir = Paths.get(docsPath);
            if (!Files.isReadable(docDir)) {
                message+="Document directory '" +docDir.toAbsolutePath()+ "' does not exist or is not readable, please check the path \n";
              System.exit(1);
            }
    
            start = new Date();
            try {
                message+="Indexing to directory '" + indexPath + "'... \n";
    
              Directory dir = FSDirectory.open(Paths.get(indexPath));
              Analyzer analyzer = new StandardAnalyzer();
              IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
    
              if (create) {
                // Create a new index in the directory, removing any
                // previously indexed documents:
                iwc.setOpenMode(OpenMode.CREATE);
              } else {
                // Add new documents to an existing index:
                iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
              }
    
              // Optional: for better indexing performance, if you
              // are indexing many documents, increase the RAM
              // buffer.  But if you do this, increase the max heap
              // size to the JVM (eg add -Xmx512m or -Xmx1g):
              //
              // iwc.setRAMBufferSizeMB(256.0);
    
              writer = new IndexWriter(dir, iwc);
              indexDocs(writer, docDir, xfields);
    
              // NOTE: if you want to maximize search performance,
              // you can optionally call forceMerge here.  This can be
              // a terribly costly operation, so generally it's only
              // worth it when your index is relatively static (ie
              // you're done adding documents to it):
              //
              // writer.forceMerge(1);
    
    
              end = new Date();
              message+=end.getTime() - start.getTime() + " total milliseconds \n";
    
            } catch (IOException e) {
                message+=" caught a " + e.getClass() +
               "\n with message: " + e.getMessage()+" \n";
            }
      }
    
      /** Closes the underlying IndexWriter. */
      public void close() throws IOException{ 
          writer.close();
      }
    
      /**
       * Indexes the given file using the given writer, or if a directory is given,
       * recurses over files and directories found under the given directory.
       * 
       * NOTE: This method indexes one document per input file.  This is slow.  For good
       * throughput, put multiple documents into your input file(s).  An example of this is
       * in the benchmark module, which can create "line doc" files, one document per line,
       * using the
       * <a href="../../../../../contrib-benchmark/org/apache/lucene/benchmark/byTask/tasks/WriteLineDocTask.html"
       * >WriteLineDocTask</a>.
       *  
       * @param writer Writer to the index where the given file/dir info will be stored
       * @param path The file to index, or the directory to recurse into to find files to index
       * @throws IOException If there is a low-level I/O error
       * System.out.println(file);
       */
      static void indexDocs(final IndexWriter writer, Path path, String fields) throws IOException {
        if (Files.isDirectory(path)) {
          Files.walkFileTree(path, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
              try {
                indexDoc(writer, file, fields, attrs.lastModifiedTime().toMillis());
              } catch (IOException ignore) {
                // don't index files that can't be read.
              }
              return FileVisitResult.CONTINUE;
            }
          });
        } else {
          indexDoc(writer, path, fields, Files.getLastModifiedTime(path).toMillis());
        }
      }
    
      /** Indexes a single document */
      static void indexDoc(IndexWriter writer, Path file, String fields, long lastModified) throws IOException {
        try (InputStream stream = Files.newInputStream(file)) {
          // make a new, empty document
          Document doc = new Document();
          // Creating a string array
          String[] contentArray = null;
          String[] prefixArray = null;
          // Array list variable
          List<String> prefixList=new ArrayList<String>();
          List<String> contentList=new ArrayList<String>();
    
          // Other variable parts
          String[] fieldArray;
          String[] fieldValidType={"pdf", "xml", "html"};
          String data="";
          BufferedReader fin = null;
          String fLine="";
    
          // Checking if field is string of a file
          File field=new File(fields);
          String meta="";
          String metaType="";
          String typeVal="";
          String[] metaData;
          String[] typeSplit;
          String ffields="";
    
          // Add the path of the file as a field named "path".  Use a
          // field that is indexed (i.e. searchable), but don't tokenize 
          // the field into separate words and don't index term frequency
          // or positional information:
          Field pathField = new StringField("path", file.toString(), Field.Store.YES);
          doc.add(pathField);
    
          // Add the last modified date of the file a field named "modified".
          // Use a LongPoint that is indexed (i.e. efficiently filterable with
          // PointRangeQuery).  This indexes to milli-second resolution, which
          // is often too fine.  You could instead create a number based on
          // year/month/day/hour/minutes/seconds, down the resolution you require.
          // For example the long value 2011021714 would mean
          // February 17, 2011, 2-3 PM. System.out.println(lastModified);
          doc.add(new LongPoint("modified", lastModified));
    
          // Add the contents of the file to a field named "contents".  The text is read
          // into a String, so it is tokenized, indexed, and also stored (Field.Store.YES).
          // Note that InputStreamReader without a charset uses the platform default encoding.
          // If the file is not in that encoding, searching for special characters will fail.
          // WE READ AND STORE FILE IN DATA BEFORE STORING
          BufferedReader br=new BufferedReader(new InputStreamReader(stream));
          String strLine;
          String contentData="";
          // Now lets loop
          while((strLine=br.readLine())!=null){
              // Now lets now
              contentData+="\n"+strLine;
          }
          // Now lets read line of content
    
          doc.add(new TextField("contents", contentData, Field.Store.YES));
    
          /************ HERE WE TRY TO ADD A UNIQUE FIELDS SENT THROUGH THE XFIELD IF XFIELD IS
           *  NOT NULL AND WE MAKE IT ALL A TEXTFIELD FIELD TYPE
           */
          if(fields!=null){
            // THEN WE ARE TO CREATE DYNAMIC FIELDS
            // Lets process the stream data
              BufferedReader fileData=new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
              // LETS CHECK THE DOCTYPE VARIABLE
              if(docType.equals(new String("arrayFile"))){
                  /******** We process as an array file to add fields ******/
                  // Now lets try to convert file data to array again
                  while((data=fileData.readLine())!=null){
                      // NOW LETS READ FILE DATA TO CONVERT TO ARRAY
                      Pattern pat = Pattern.compile("([^<]+)?(<as:(.*?)s>)?");
                      // Calling the matcher
                        Matcher m = pat.matcher(data);
    
                        while (m.find()) {
                            String contents = m.group(1);
                            String prefix = m.group(3);
    
                            if (prefix != null) { prefixList.add(prefix); }
                            if (contents != null) { contentList.add(contents); }
                        } // End of while loop
    
                 /********* NOW LETS COMPOSE INTO AN ARRAY ***************/
                    contentArray=new String[contentList.size()];
                    prefixArray=new String[prefixList.size()];
                    // Now lets compose to array
                    contentArray=contentList.toArray(contentArray);
                    prefixArray=prefixList.toArray(prefixArray);
    
                  } // End of while loop
    
                  /************ NOW WE CAN CREATE DYNAMIC FIELDS *************/
    
                  // Checking
                  if(field.isFile()){
                      // We read the field file to get all the fields
                      fin=Files.newBufferedReader(Paths.get(fields), StandardCharsets.UTF_8);
                      // Now lets get file data line by line
                      fLine=fin.readLine();
                      /******* Now we can process the field data *****/
                      fieldArray=fLine.split(";");
                      // Lets check count
                      if(fieldArray.length>0){
                          // We keep processing
                          meta=fieldArray[0];
                          ffields=fieldArray[1];
                          // Now lets validate the field data file
                          // We get the meta type
                          metaData=meta.split(",");
                          // Now lets get type
                          metaType=metaData[1];
                          // Now lets get the type value fieldValidType
                          typeSplit=metaType.split("-");
                         // NOW LETS CHECK IF TYPE IS IN ARRAY
                          typeVal=typeSplit[1];
    
                          /********* Now lets check if type exists in array **********/
                          if(Arrays.asList(fieldValidType).contains(typeVal)){
                              // ARRAY CONTAINS TYPE SO LETS PROCEED
                              String[] fieldsData=ffields.split(":");
                              // We further split fields data by comma
                              String fDatas=fieldsData[1];
                              // Further split
                              String[] fd=fDatas.split(",");
                              /***** Lets loop field array create the fields ******/
                              if(fd.length>0){
                                for(int i=0; i<=fd.length; i++){
                                    /*********** We do a bit inner loop to check if field matches *********/
                                    for(String prefix:prefixArray){
                                    // Now lets check before we create
                                        if(fd[i]==prefix){ // We create appropriately
                                    // NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
                                    Field dynamicField = new TextField(fd[i], contentArray[i], Field.Store.YES);
                                    doc.add(dynamicField);
                                        } // End of if
                                    } // End of foreach loop
                                } // End of loop
                              } // End of check
                          }
                          else{
                              // WHEN TYPE DOESNT EXIST WE LOG MESSAGE
                              // Just do nothing here
                          }
    
                      }
    
                  }
                  else{
                      // We assume that field is a string so we process as a string
                      // WE PROCESS FIELD STRING TO GET VALUES
                      int fieldIndex=fields.indexOf("*"); // Setting index value
                      if(fieldIndex>=0){
                      // Now lets split
                      fieldArray=fields.split("\\*");
                      // Lets check count and loop
                      if(fieldArray.length>0){
                          // We loop individual fields
                          for(int i=0; i<=fieldArray.length; i++){
                              // Now lets further process
                              /*********** We do a bit inner loop to check if field matches *********/
                                for(String prefix:prefixArray){
                                // Now lets check before we create
                                    if(fieldArray[i]==prefix){ // We create appropriately
                                // NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
                                Field dynamicField = new TextField(fieldArray[i], contentArray[i], Field.Store.YES);
                                doc.add(dynamicField);
                                    } // End of if
                                } // End of foreach loop
                          } // End of for loop
                      } // End of count check
                      }
                      else{
                          // Setting a counter
                          int counter=0;
                        // We handle the values straight without loop
                          for(String prefix:prefixArray){
                                // Now lets check before we create
                                if(fields==prefix){ // We create appropriately
                                // NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
                                Field dynamicField = new TextField(fields, contentArray[counter], Field.Store.YES);
                                doc.add(dynamicField);
                                counter++; // Increment counter
                                    } // End of if
                                } // End of foreach loop  
                      }
                  }
    
              }
              else if(docType.equals(new String("normalFile"))){
                  /******** We process as a normal file to add fields ******/
                  // WE PROCESS FILE TO GET EACH LINES
                // Now lets try to convert file data to array again
                  while((data=fileData.readLine())!=null){
                // We check if there there  
                      fieldAdder(data, doc, fields);
                  } // end of while loop
              }
          }
    
          if (writer.getConfig().getOpenMode() == OpenMode.CREATE) {
            // New index, so we just add the document (no old document can be there):
            // System.out.println("adding " + file);
            writer.addDocument(doc);
          } else {
            // Existing index (an old copy of this document may have been indexed) so 
            // we use updateDocument instead to replace the old one matching the exact 
            // path, if present:
            // System.out.println("updating " + file);
            writer.updateDocument(new Term("path", file.toString()), doc);
          }
        }
      }
    
      /** CREATING A METHOD FOR CREATING DYNAMIC FIELDS **/
      private static void fieldAdder(String fileContent, Document doc, String fields){
        /************* CREATING VARIABLES FOR THIS METHOD *******************/
          try{
          // Other variable parts
          String[] fieldArray;
          String[] fieldValidType={"pdf", "xml", "html"};
          BufferedReader fin = null;
          String fLine="";
    
          // Checking if field is string of a file
          File field=new File(fields);
          String meta="";
          String metaType="";
          String typeVal="";
          String[] metaData;
          String[] typeSplit;
          String ffields="";
          int indexOnContent=0;
    
        // Checking
          if(field.isFile()){
              // We read the field file to get all the fields
              fin=Files.newBufferedReader(Paths.get(fields), StandardCharsets.UTF_8);
              // Now lets get file data line by line
              fLine=fin.readLine();
              /******* Now we can process the field data *****/
              fieldArray=fLine.split(";");
              // Lets check count
              if(fieldArray.length>0){
                  // We keep processing
                  meta=fieldArray[0];
                  ffields=fieldArray[1];
                  // Now lets validate the field data file
                  // We get the meta type
                  metaData=meta.split(",");
                  // Now lets get type
                  metaType=metaData[1];
                  // Now lets get the type value fieldValidType
                  typeSplit=metaType.split("-");
                 // NOW LETS CHECK IF TYPE IS IN ARRAY
                  typeVal=typeSplit[1];
    
                  /********* Now lets check if type exists in array **********/
                  if(Arrays.asList(fieldValidType).contains(typeVal)){
                      // ARRAY CONTAINS TYPE SO LETS PROCEED
                      String[] fieldsData=ffields.split(":");
                      // We further split fields data by comma
                      String fDatas=fieldsData[1];
                      // Further split
                      String[] fd=fDatas.split(",");
                      /***** Lets loop field array create the fields ******/
                      if(fd.length>0){
                        for(int i=0; i<=fd.length; i++){
                            /*********** Check if index exist *********/
                            indexOnContent=fileContent.indexOf(fd[i]);
                            // Now lets check before we create
                                if(indexOnContent>0){ // We create appropriately
                            // NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
                            Field dynamicField = new TextField(fd[i], fileContent, Field.Store.YES);
                            doc.add(dynamicField);
                                } // End of if
                        } // End of loop
                      } // End of check
                  }
                  else{
                      // WHEN TYPE DOESNT EXIST WE LOG MESSAGE
                      // Just do nothing here
                  }
    
              }
    
          }
          else{
              // We assume that field is a string so we process as a string
              // WE PROCESS FIELD STRING TO GET VALUES
              int fieldIndex=fields.indexOf("*"); // Setting index value
              if(fieldIndex>0){
              // Now lets split
              fieldArray=fields.split("\\*");
              // Lets check count and loop
              if(fieldArray.length>0){
                  // We loop individual fields
                  for(int i=0; i<=((fieldArray.length)-1); i++){
                      // Now lets further process
                      /*********** Check if index exist *********/
                        indexOnContent=fileContent.indexOf(fieldArray[i]);
                        // Now lets check before we create
                        if(indexOnContent>=0){ // We create appropriately
                        // NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
                        Field dynamicField = new TextField(fieldArray[i], fileContent, Field.Store.YES);
                        doc.add(dynamicField);
                            } // End of if
                  } // End of for loop
              } // End of count check
              }
              else{
                // We handle the values straight without loop
                  indexOnContent=fileContent.indexOf(fields);
                    // Now lets check before we create
                    if(indexOnContent>0){ // We create appropriately
                        // NOW LET US CREATE INDIVIDUAL FIELDS FROM ARRAY LOOP
                        Field dynamicField = new TextField(fields, fileContent, Field.Store.YES);
                        doc.add(dynamicField);
                            } // End of if  
              }
          }
          } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
          // END OF METHOD
      }
    
      // END OF CLASS
    }
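
One thing that may be worth checking in indexDoc: the method wraps the same InputStream in two readers. The first loop (`br`) reads the stream to the end to build `contentData`, so the second reader (`fileData`) has nothing left to read, and the dynamic fields from xfields may never be added. If the search then targets one of those fields, zero hits would be the expected result. The sketch below shows the effect with an in-memory stream; StreamReadTwice and countLines are hypothetical names, not part of the original code.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamReadTwice {

    // Counts the lines that a fresh reader can still pull from the stream.
    static int countLines(InputStream stream) {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
        int lines = 0;
        try {
            while (reader.readLine() != null) {
                lines++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] file = "sender: a\nreceiver: b\nsubject: Job openings\n"
            .getBytes(StandardCharsets.UTF_8);
        InputStream stream = new ByteArrayInputStream(file);
        System.out.println(countLines(stream)); // first pass reads all 3 lines
        System.out.println(countLines(stream)); // second pass prints 0: stream is at end-of-file
    }
}
```

If this turns out to be the cause, reusing the already-built `contentData` string (for example by splitting it into lines) instead of reading the stream a second time would be one way around it.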
    

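    After indexing, I found a writer.lock file in the index directory. I don't know whether it is the cause of the problem.

    Everything seems fine. I just don't know what is causing the zero hits.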
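
A possible contributor in the indexer, offered as a guess rather than a diagnosis: field names are compared with `==` (for example `fd[i]==prefix`), which in Java tests whether two references point to the same object, not whether the strings contain the same characters. Strings produced by split() or read from a file are separate objects, so such a comparison is normally false and the dynamic field is skipped. The loops also run with `i<=fd.length`, which steps one past the last index. A small sketch, with FieldNameMatch and indexOfPrefix as illustrative names:

```java
public class FieldNameMatch {

    // Returns the index of the first prefix whose text equals fieldName, or -1.
    static int indexOfPrefix(String[] prefixes, String fieldName) {
        for (int i = 0; i < prefixes.length; i++) { // '<', not '<=', to stay in bounds
            if (prefixes[i].equals(fieldName)) {    // compares content, not identity
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // split() creates new String objects, as in fields.split("\\*")
        String[] fieldArray = "sender*receiver*subject".split("\\*");
        // Stands in for a prefix parsed from a file at runtime
        String parsedPrefix = new String("subject");

        System.out.println(fieldArray[2] == parsedPrefix);           // false: different objects
        System.out.println(fieldArray[2].equals(parsedPrefix));      // true: same characters
        System.out.println(indexOfPrefix(fieldArray, parsedPrefix)); // 2
    }
}
```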

1 answer:

Answer 0 (score: 1)

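  1. The error message shows the line in the code where the NullPointerException occurs. And you are the only one who has the complete code and the line numbers...
  2. However, it is clear that you forgot to initialize the array in the searchFetched method.
  3. I don't understand why you create a new array containing the same objects.
  4. In Java, you can clone an array with clone() or copy it with Arrays.copyOf(T[], int).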
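
As a quick illustration of the last point, here is a small sketch of both ways to copy an array. It uses a String array as a stand-in for ScoreDoc[] so that it runs without Lucene; ArrayCopyDemo and its method names are made up for the example. Both calls make a shallow copy: the new array is independent, but its elements are the same objects as in the source.

```java
import java.util.Arrays;

public class ArrayCopyDemo {

    // Copies the array with the clone() method that every Java array has.
    static String[] copyWithClone(String[] source) {
        return source.clone();
    }

    // Copies the array with Arrays.copyOf, keeping the original length.
    static String[] copyWithCopyOf(String[] source) {
        return Arrays.copyOf(source, source.length);
    }

    public static void main(String[] args) {
        String[] scores = {"doc1", "doc2", "doc3"};
        String[] a = copyWithClone(scores);
        String[] b = copyWithCopyOf(scores);

        a[0] = "changed"; // only the copy is modified

        System.out.println(Arrays.toString(scores)); // [doc1, doc2, doc3]
        System.out.println(Arrays.toString(a));      // [changed, doc2, doc3]
        System.out.println(Arrays.toString(b));      // [doc1, doc2, doc3]
    }
}
```

When no copy is needed at all, as in searchFetched, returning `hits.scoreDocs` directly is enough.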