批量加载期间的Solr错误 - org.apache.lucene.index.MergePolicy $ MergeException

时间:2014-07-29 16:45:40

标签: solr datastax-enterprise datastax

我通过sstableloader批量加载数百万条记录时遇到了很多以下异常:

ERROR [Lucene Merge Thread #132642] 2014-07-29 00:35:01,252 CassandraDaemon.java (line 199) Exception in thread Thread[Lucene Merge Thread #132642,6,main]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.IllegalStateException: failed
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.IllegalStateException: failed
        at org.apache.lucene.util.packed.DirectPackedReader.get(DirectPackedReader.java:93)
        at org.apache.lucene.util.packed.BlockPackedReader.get(BlockPackedReader.java:86)
        at org.apache.lucene.util.LongValues.get(LongValues.java:35)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesProducer$5.getOrd(Lucene45DocValuesProducer.java:459)
        at org.apache.lucene.codecs.DocValuesConsumer$4$1.setNext(DocValuesConsumer.java:389)
        at org.apache.lucene.codecs.DocValuesConsumer$4$1.hasNext(DocValuesConsumer.java:352)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesConsumer.addNumericField(Lucene45DocValuesConsumer.java:141)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesConsumer.addSortedField(Lucene45DocValuesConsumer.java:350)
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedField(PerFieldDocValuesFormat.java:116)
        at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedField(DocValuesConsumer.java:305)
        at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:197)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4058)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
Caused by: java.io.EOFException: Read past EOF (resource: BBIndexInput(name=_13ms5_Lucene45_0.dvd))
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.switchCurrentBuffer(ByteBufferIndexInput.java:188)
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:129)
        at org.apache.lucene.store.DataInput.readShort(DataInput.java:77)
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.readShort(ByteBufferIndexInput.java:89)
        at org.apache.lucene.util.packed.DirectPackedReader.get(DirectPackedReader.java:64)
        ... 15 more

我从异常跟踪中看到它与Long值和EOF有关。但是,我不知道是什么触发了错误。我尝试导入的SSTable文件是由使用 org.apache.cassandra.io.sstable.CQLSSTableWriter 的Java程序(由我编写)生成的。

可以在此处找到CF模式,Solr模式和SSTable生成器代码:https://www.dropbox.com/sh/1rpo3ixmz1bg9y2/AAA3aqlfzWEsNIwy79G9dASba

PS:

  • 我最近从DSE 4.1.3升级到4.5.1。我不记得在升级之前看到此错误
  • 生成器类路径中包含的cassandra库是2.0.8版。在DSE升级之前,它使用版本2.0.5库
  • DSE拓扑:1个DC,6个solr节点(禁用vnode),RF 2
  • 其他DSE配置:LeveledCompaction,LZ4压缩,GossipingPropertyFileSnitch
  • 机器规格:CentOS 6.5 x64,JDK 1.7.0_55,hexacore,堆大小120gb(我们有特定的查询需要这个),128gb总ram

我最初在6个节点中的3个中遇到了错误。我重新启动了所有这些,我能够导入超过150万条记录而没有错误。但是,当我在睡觉时无人看管地离开导入时,错误在6个节点中的1个中重新出现。

我现在非常惊慌,因为每个节点中的索引记录数(根据Solr管理员UI)与Cassandra行数相比减少了大约60,000条记录(根据nodetool cfstats)

更新

仍然继续体验这一点。索引文档(Solr)和存储文档(Cassandra cfstats)的数量之间的差异日益扩大

更新(2014-08-13):

按照Rock Brain的建议更改了目录工厂;但是在通过sstableloader持续导入的几个小时内重新发生错误

更新(2014-08-14):

有趣的是,我注意到我实际上得到了两个类似的异常(差异只是最后一个&#34的堆栈跟踪;由#34引起:

例外1:

ERROR [Lucene Merge Thread #24937] 2014-08-14 06:20:32,270 CassandraDaemon.java (line 199) Exception in thread Thread[Lucene Merge Thread #24937,6,main]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.IllegalStateException: failed
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.IllegalStateException: failed
        at org.apache.lucene.util.packed.DirectPackedReader.get(DirectPackedReader.java:93)
        at org.apache.lucene.util.packed.BlockPackedReader.get(BlockPackedReader.java:86)
        at org.apache.lucene.util.LongValues.get(LongValues.java:35)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesProducer$5.getOrd(Lucene45DocValuesProducer.java:459)
        at org.apache.lucene.codecs.DocValuesConsumer$4$1.setNext(DocValuesConsumer.java:389)
        at org.apache.lucene.codecs.DocValuesConsumer$4$1.hasNext(DocValuesConsumer.java:352)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesConsumer.addNumericField(Lucene45DocValuesConsumer.java:141)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesConsumer.addSortedField(Lucene45DocValuesConsumer.java:350)
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedField(PerFieldDocValuesFormat.java:116)
        at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedField(DocValuesConsumer.java:305)
        at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:197)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4058)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
Caused by: java.io.EOFException: Read past EOF (resource: BBIndexInput(name=_67nex_Lucene45_0.dvd))
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.switchCurrentBuffer(ByteBufferIndexInput.java:188)
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:129)
        at org.apache.lucene.util.packed.DirectPackedReader.get(DirectPackedReader.java:64)
        ... 15 more

异常2(与本文顶部的原始异常完全相同):

ERROR [Lucene Merge Thread #24936] 2014-08-14 06:20:34,694 CassandraDaemon.java (line 199) Exception in thread Thread[Lucene Merge Thread #24936,6,main]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.IllegalStateException: failed
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.IllegalStateException: failed
        at org.apache.lucene.util.packed.DirectPackedReader.get(DirectPackedReader.java:93)
        at org.apache.lucene.util.packed.BlockPackedReader.get(BlockPackedReader.java:86)
        at org.apache.lucene.util.LongValues.get(LongValues.java:35)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesProducer$5.getOrd(Lucene45DocValuesProducer.java:459)
        at org.apache.lucene.codecs.DocValuesConsumer$4$1.setNext(DocValuesConsumer.java:389)
        at org.apache.lucene.codecs.DocValuesConsumer$4$1.hasNext(DocValuesConsumer.java:352)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesConsumer.addNumericField(Lucene45DocValuesConsumer.java:141)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesConsumer.addSortedField(Lucene45DocValuesConsumer.java:350)
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedField(PerFieldDocValuesFormat.java:116)
        at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedField(DocValuesConsumer.java:305)
        at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:197)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4058)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
Caused by: java.io.EOFException: Read past EOF (resource: BBIndexInput(name=_67fvk_Lucene45_0.dvd))
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.switchCurrentBuffer(ByteBufferIndexInput.java:188)
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:129)
        at org.apache.lucene.store.DataInput.readShort(DataInput.java:77)
        at com.datastax.bdp.search.lucene.store.bytebuffer.ByteBufferIndexInput.readShort(ByteBufferIndexInput.java:89)
        at org.apache.lucene.util.packed.DirectPackedReader.get(DirectPackedReader.java:64)
        ... 15 more

更新第2部分(2014-08-14):

示例RELOAD警告:

 WARN [http-8983-2] 2014-08-14 08:31:28,828 CassandraCoreContainer.java (line 739) Too much waiting for new searcher...
 WARN [http-8983-2] 2014-08-14 08:31:28,831 SolrCores.java (line 375) Tried to remove core myks.mycf from pendingCoreOps and it wasn't there.
 INFO [http-8983-2] 2014-08-14 08:31:28,832 StorageService.java (line 2644) Starting repair command #3, repairing 0 ranges for keyspace solr_admin
 INFO [http-8983-2] 2014-08-14 08:31:28,835 SolrDispatchFilter.java (line 672) [admin] webapp=null path=/admin/cores params={slave=true&deleteAll=false&name=myks.mycf&distributed=false&action=RELOAD&reindex=false&core=myks.mycf&wt=javabin&version=2} status=0 QTime=61640

更新(2014-08-23):

重新执行建议的解决方法后,我无法重现异常

1 个答案:

答案 0 :(得分:2)

更新所有核心的solrconfig.xml:将directoryFactorycom.datastax.bdp.cassandra.index.solr.DSENRTCachingDirectoryFactory交换为solr.MMapDirectoryFactory

此外,正在使用的操作系统,JVM版本,CPU数量,堆大小,总可用内存。加载错误发生了多少分钟/小时。