ArrayIndexOutOfBoundsException when adding documents to a Lucene index (v5.1.0)

Time: 2015-07-09 16:41:57

Tags: lucene

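After adding a few documents to the index, I keep getting an ArrayIndexOutOfBoundsException. I have a single thread that adds documents to indexes located on multiple machines. I'm not sure whether this is related to how I add fields to the Document. The "text" field can be of any length.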

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;

public static void CreateIndex() {
    int count = 0;
    try {
        IndexWriters.initIndexWriters(); //these are located on different machines on the network

        int whichIndex = 0;

        FileInputStream fis = new FileInputStream(Constants.MY_XML);
        BufferedReader t = new BufferedReader(new InputStreamReader(fis, "UTF-8"));

        String title = "";
        String articleId = "";
        String text = "";
        String str;

        // Each record is read as three consecutive lines: <title>, <id>, <text>
        while ((str = t.readLine()) != null)
        {
            title = str.replace("<title>", "").replace("</title>", "").trim();
            str = t.readLine();
            articleId = str.replace("<id>", "").replace("</id>", "").trim();
            str = t.readLine();
            text = str.replace("<text>", "").replace("</text>", "").trim();

            // Build the Lucene document; all three fields are tokenized and stored
            Document lDoc = new Document();

            Field idField = new TextField("id", articleId, Field.Store.YES);
            Field titleField = new TextField("title", title, Field.Store.YES);
            Field textField = new TextField("text", text, Field.Store.YES);
            lDoc.add(idField);
            lDoc.add(titleField);
            lDoc.add(textField);

            IndexWriters.indW[whichIndex].addDocument(lDoc);  //exception thrown here after adding couple of documents

            // Round-robin over the writers
            whichIndex = (whichIndex+1)%(Constants.NUMNODES*Constants.PER_NODE_INDEXES_COUNT);
            ++count;
        }
        IndexWriters.closeWriters();

    } catch (IOException e) {
        e.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

The stack trace is as follows:

java.lang.ArrayIndexOutOfBoundsException: 32768
at org.apache.lucene.util.StringHelper.murmurhash3_x86_32(StringHelper.java:188)
at org.apache.lucene.util.BytesRefHash.doHash(BytesRefHash.java:460)
at org.apache.lucene.util.BytesRefHash.rehash(BytesRefHash.java:432)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:323)
at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:154)
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:657)
at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:344)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:458)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1350)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1138)

1 Answer:

Answer 0: (score: 0)

Well, the assumption in your code is that:

Constants.NUMNODES*Constants.PER_NODE_INDEXES_COUNT is equal to (or less than) IndexWriters.indW.length

My guess is that this assumption is incorrect.

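(The code as posted also contained a number of extraneous } characters, but if it compiles, I'll assume those were just left over from editing it for posting here.)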
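
One way to test that assumption is to check it once at startup and to derive the round-robin position from the array length itself rather than from the constants. The following is only a minimal sketch: WriterRoundRobin, checkSlotCount and next are hypothetical names that appear in neither Lucene nor the question's code, and plain strings stand in for the IndexWriter instances so the snippet runs without Lucene on the classpath.

```java
// Hypothetical helper for illustration; not part of Lucene or of the asker's code.
final class WriterRoundRobin {
    private WriterRoundRobin() {}

    /** Fails fast if the configured slot count exceeds the number of writers that exist. */
    static void checkSlotCount(int numNodes, int perNodeIndexes, int writerCount) {
        int expected = numNodes * perNodeIndexes;
        if (expected > writerCount) {
            throw new IllegalStateException(
                "Configured " + expected + " index slots but only "
                    + writerCount + " writers exist");
        }
    }

    /** Advances the round-robin position using the actual array length. */
    static int next(int current, int writerCount) {
        if (writerCount <= 0) {
            throw new IllegalArgumentException("writerCount must be positive");
        }
        return (current + 1) % writerCount;
    }

    public static void main(String[] args) {
        // Stand-ins for IndexWriters.indW (assumption: an array with one writer per slot)
        String[] writers = {"w0", "w1", "w2"};

        // Stand-ins for Constants.NUMNODES and Constants.PER_NODE_INDEXES_COUNT
        checkSlotCount(1, 3, writers.length);

        int which = 0;
        for (int doc = 0; doc < 5; doc++) {
            System.out.println("doc " + doc + " -> " + writers[which]);
            which = next(which, writers.length);
        }
    }
}
```

In the question's loop this would amount to calling the check once after IndexWriters.initIndexWriters() and replacing the modulo expression with one based on IndexWriters.indW.length. If the check passes and the exception still occurs, the mismatch described above is probably not the cause.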