A large number of indices in ElasticSearch causes OOM

Time: 2014-05-01 23:37:22

Tags: python indexing elasticsearch

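I have been struggling with this problem for a while, so I have decided to post a question on the topic and ask for advice.

I am testing an environment for indexing a large amount of data over time. Basically, every day I will index logs and related documents from various websites.

I would like one index per website, for a better logical separation and so that I can filter queries and get faster responses.

I expect roughly 1 GB of traffic per index per day.

I am testing the environment on an AWS instance with 2x80 GB SSDs. CPU vendor: Intel; CPU model: Xeon (2500 MHz); total logical CPU cores: 8.

I have 1 node, x indices, and 1 shard per index (for the test environment), and I push documents taken from Wikipedia articles, distributed at random across 200 or more indices.

I am indexing documents from a single client at a rate of about 200/s. This is only for testing. Once I have solved the OOM problem, I will increase the number of clients and the number of nodes.

I tried 500k documents with 200 indices and that worked fine. With 500k documents and 300 indices it throws a nasty OOM exception. 1m documents with 100 indices also throws an OOM error. I am now running a test with 1m documents and 200 indices.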

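I tried changing the number of shards, and I also tried playing with these parameters:
index.merge.policy.max_merged_segment: 2g
index.merge.policy.segments_per_tier: [I tried values from 5 (lowering the maximum as well) up to 25]
index.merge.policy.max_merge_at_once: 8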
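For reference, the merge-policy values above can be written down as a settings body. This is only a sketch: the function name is hypothetical, and how the settings were actually applied (elasticsearch.yml or the index settings API) is not stated in the question.

```python
import json


def merge_policy_settings(segments_per_tier, max_merge_at_once=8):
    """Build a settings body with the merge-policy keys named in the question.

    The helper itself is hypothetical; only the three setting names and the
    values 2g / 8 come from the question.
    """
    return {
        "index.merge.policy.max_merged_segment": "2g",
        "index.merge.policy.segments_per_tier": segments_per_tier,
        "index.merge.policy.max_merge_at_once": max_merge_at_once,
    }


# One body per value tried, here the two ends of the 5..25 range.
low = merge_policy_settings(segments_per_tier=5, max_merge_at_once=5)
high = merge_policy_settings(segments_per_tier=25)
print(json.dumps(low, indent=2))
```

Keeping the body as a plain dict makes it easy to log exactly which combination was active in a given test run.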

My settings are (retrieved with the /_nodes/settings,os,process,jvm?pretty command):

{ 
  "cluster_name" : "elasticsearch", 
  "nodes" : { 
    "QBk8YzISQPu-VVMnKvhEmQ" : { 
      "name" : "", 
      "transport_address" : "", 
      "host" : "", 
      "ip" : "", 
      "version" : "1.1.0", 
      "build" : "2181e11", 
      "http_address" : "", 
      "settings" : { 
        "path" : { 
          "data" : "/mnt/db/se_data/elasticsearch/", 
          "logs" : "/mnt/db/searchengines/elasticsearch-1.1.0/logs", 
          "home" : "/mnt/db/searchengines/elasticsearch-1.1.0" 
        }, 
        "cluster" : { 
          "name" : "elasticsearch" 
        }, 
        "index" : { 
          "number_of_shards" : "1" 
        }, 
        "foreground" : "yes", 
        "name" : "", 
        "max-open-files" : "true" 
      }, 
      "os" : { 
        "refresh_interval" : 1000, 
        "available_processors" : 8, 
        "cpu" : { 
          "vendor" : "Intel", 
          "model" : "Xeon", 
          "mhz" : 2500, 
          "total_cores" : 8, 
          "total_sockets" : 8, 
          "cores_per_socket" : 32, 
          "cache_size_in_bytes" : 25600 
        }, 
        "mem" : { 
          "total_in_bytes" : 31502180352 
        }, 
        "swap" : { 
          "total_in_bytes" : 3071995904 
        } 
      }, 
      "process" : { 
        "refresh_interval" : 1000, 
        "id" : 20155, 
        "max_file_descriptors" : 64000, 
        "mlockall" : false 
      }, 
      "jvm" : { 
        "pid" : 20155, 
        "version" : "1.7.0_51", 
        "vm_name" : "Java HotSpot(TM) 64-Bit Server VM", 
        "vm_version" : "24.51-b03", 
        "vm_vendor" : "Oracle Corporation", 
        "start_time" : 1398962098305, 
        "mem" : { 
          "heap_init_in_bytes" : 268435456, 
          "heap_max_in_bytes" : 10667687936, 
          "non_heap_init_in_bytes" : 24313856, 
          "non_heap_max_in_bytes" : 136314880, 
          "direct_max_in_bytes" : 10667687936 
        }, 
        "gc_collectors" : [ "ParNew", "ConcurrentMarkSweep" ], 
        "memory_pools" : [ "Code Cache", "Par Eden Space", "Par Survivor Space", "CMS Old Gen", "CMS Perm Gen" ] 
      } 
    } 
  } 
} 

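Here are the JVM settings: -Xms256m -Xmx10g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Delasticsearch -Des.foreground=yes

The maximum number of open files is set to 65k (that was my previous problem: only 4k open files were allowed).

Checking memory consumption with bigdesk, I see the committed/used heap memory growing steadily until the point where it crashes. I assume that is what causes the OOM error, even though it crashes when consumption reaches ~4-6 GB, well short of the 10 GB maximum allocated to the Java heap.

Could it be a GC problem? Should I tune it following the settings in this article?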

If I insert a flush() call on each index, say every 10k inserted documents, would that help reduce memory usage?
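A minimal sketch of what such a periodic flush could look like. It assumes a client object exposing index(index=..., doc_type=..., body=...) and indices.flush(index=...), which is how the elasticsearch Python module of that era was called; RecordingClient and PeriodicFlusher are hypothetical names, and the stand-in client only records calls so the logic can be run without a cluster. Whether flushing actually lowers memory usage is exactly the open question; the sketch only shows the bookkeeping.

```python
class RecordingClient:
    """Hypothetical stand-in for an Elasticsearch client; it only records calls."""

    def __init__(self):
        self.calls = []
        self.indices = self  # so client.indices.flush(...) resolves to flush() below

    def index(self, index, doc_type, body):
        self.calls.append(("index", index))

    def flush(self, index):
        self.calls.append(("flush", index))


class PeriodicFlusher:
    """Index documents and flush an index after every `every` documents sent to it."""

    def __init__(self, client, every=10000):
        self.client = client
        self.every = every
        self.pending = {}  # index name -> documents indexed since the last flush

    def index(self, index, doc_type, body):
        self.client.index(index=index, doc_type=doc_type, body=body)
        self.pending[index] = self.pending.get(index, 0) + 1
        if self.pending[index] >= self.every:
            self.client.indices.flush(index=index)
            self.pending[index] = 0


client = RecordingClient()
flusher = PeriodicFlusher(client, every=3)
for i in range(7):
    flusher.index("index_0", "doc", {"n": i})

flush_calls = [c for c in client.calls if c[0] == "flush"]
print(len(flush_calls))  # 7 documents with every=3: flushed after the 3rd and the 6th
```

The counter is kept per index because the documents are spread at random over many indices, so a single global counter would flush whichever index happened to receive the 10,000th document.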

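Is the number of indices too high? Should I change approach? Putting logs from several websites into a single index seems to be the most reasonable solution, doesn't it?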
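To make the single-index alternative concrete, here is a rough sketch of how documents and queries might look if the website became a field instead of an index. The field names site, message and @timestamp are hypothetical, and the filtered query shape is the ES 1.x form (later versions use a bool query with a filter clause instead).

```python
def make_log_document(site, message, timestamp):
    """Build a log document that carries its website in a `site` field.

    Assumption: `site` is mapped as a not_analyzed string, so that a term
    filter matches the exact host name.
    """
    return {"site": site, "message": message, "@timestamp": timestamp}


def search_body_for_site(site, text):
    """Build an ES 1.x style filtered query restricted to one website."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"message": text}},
                "filter": {"term": {"site": site}},
            }
        }
    }


doc = make_log_document("example.com", "GET /index.html 200", "2014-05-01T23:37:22")
body = search_body_for_site("example.com", "index.html")
print(body["query"]["filtered"]["filter"])
```

With this layout the per-website separation is expressed by the filter, so the number of shards on the node no longer grows with the number of websites.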

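PS: I am using a Python script to test the environment, with the elasticsearch Python module (should I use pyelasticsearch instead? It should just be a more feature-rich module, right?).

If you need more information, please let me know. Thanks for your time!

EDIT: The stack trace from the ES log file follows, from an attempt to load 500k documents across 300 indices.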

[2014-05-01 00:16:49,526][INFO ][node                     ] [Captain Wings] stopping ...
[2014-05-01 00:16:49,767][WARN ][index.shard.service      ] [Captain Wings] [index_220][0] Failed to perform scheduled engine refresh
org.elasticsearch.index.engine.RefreshFailedEngineException: [index_220][0] Refresh failed
    at org.elasticsearch.index.engine.internal.InternalEngine.refresh(InternalEngine.java:725)
    at org.elasticsearch.index.shard.service.InternalIndexShard.refresh(InternalIndexShard.java:469)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineRefresher$1.run(InternalIndexShard.java:920)
    [...]
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.close(CompressingStoredFieldsWriter.java:138)
    [...]
    ... 5 more
    Suppressed: java.io.FileNotFoundException: _75.fdx
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
        at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
        at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
        at org.apache.lucene.codecs.compressing.CompressingStoredFieldsIndexWriter.close(CompressingStoredFieldsIndexWriter.java:205)
        ... 24 more
[2014-05-01 00:16:49,804][WARN ][index.merge.scheduler    ] [Captain Wings] [index_65][0] failed to merge
java.io.FileNotFoundException: _7n_es090_0.tim
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
    at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
    at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
    at org.apache.lucene.util.IOUtils.closeWhileHandlingException(IOUtils.java:81)
    at org.apache.lucene.codecs.BlockTreeTermsWriter.close(BlockTreeTermsWriter.java:1140)
    at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsConsumer.close(BloomFilterPostingsFormat.java:371)
    at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat$1.close(Elasticsearch090PostingsFormat.java:61)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsConsumerAndSuffix.close(PerFieldPostingsFormat.java:86)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:163)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.close(PerFieldPostingsFormat.java:154)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4119)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3716)
    at org.apache.lucene.index.TrackingSerialMergeScheduler.merge(TrackingSerialMergeScheduler.java:122)
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:89)
    at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:71)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1936)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1930)
    at org.elasticsearch.index.merge.Merges.maybeMerge(Merges.java:47)
    at org.elasticsearch.index.engine.internal.InternalEngine.maybeMerge(InternalEngine.java:926)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineMerger$1.run(InternalIndexShard.java:966)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
    Suppressed: java.io.FileNotFoundException: _7n_es090_0.tip
        ... 26 more
    Suppressed: java.io.FileNotFoundException: _7n_es090_0.doc
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
        at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
        at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
        at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
        at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.close(Lucene41PostingsWriter.java:587)
        ... 23 more
        Suppressed: java.io.FileNotFoundException: _7n_es090_0.pos
            ... 28 more
[2014-05-01 00:16:49,807][WARN ][index.engine.internal    ] [Captain Wings] [index_65][0] failed engine
org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: _7n_es090_0.tim
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:92)
    at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:71)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1936)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1930)
    at org.elasticsearch.index.merge.Merges.maybeMerge(Merges.java:47)
    at org.elasticsearch.index.engine.internal.InternalEngine.maybeMerge(InternalEngine.java:926)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineMerger$1.run(InternalIndexShard.java:966)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: _7n_es090_0.tim
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
    at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
    at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
    at org.apache.lucene.util.IOUtils.closeWhileHandlingException(IOUtils.java:81)
    at org.apache.lucene.codecs.BlockTreeTermsWriter.close(BlockTreeTermsWriter.java:1140)
    at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsConsumer.close(BloomFilterPostingsFormat.java:371)
    at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat$1.close(Elasticsearch090PostingsFormat.java:61)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsConsumerAndSuffix.close(PerFieldPostingsFormat.java:86)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:163)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.close(PerFieldPostingsFormat.java:154)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4119)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3716)
    at org.apache.lucene.index.TrackingSerialMergeScheduler.merge(TrackingSerialMergeScheduler.java:122)
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:89)
    ... 9 more
    Suppressed: java.io.FileNotFoundException: _7n_es090_0.tip
        ... 26 more
    Suppressed: java.io.FileNotFoundException: _7n_es090_0.doc
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
        at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
        at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
        at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
        at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.close(Lucene41PostingsWriter.java:587)
        ... 23 more
        Suppressed: java.io.FileNotFoundException: _7n_es090_0.pos
            ... 28 more
[2014-05-01 00:16:49,807][WARN ][index.shard.service      ] [Captain Wings] [index_65][0] Failed to perform scheduled engine optimize/merge
org.elasticsearch.index.engine.OptimizeFailedEngineException: [index_65][0] Optimize failed
    at org.elasticsearch.index.engine.internal.InternalEngine.maybeMerge(InternalEngine.java:936)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineMerger$1.run(InternalIndexShard.java:966)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: _7n_es090_0.tim
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:93)
    at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:71)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1936)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1930)
    at org.elasticsearch.index.merge.Merges.maybeMerge(Merges.java:47)
    at org.elasticsearch.index.engine.internal.InternalEngine.maybeMerge(InternalEngine.java:926)
    ... 4 more
Caused by: java.io.FileNotFoundException: _7n_es090_0.tim
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
    at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
    at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
    at org.apache.lucene.util.IOUtils.closeWhileHandlingException(IOUtils.java:81)
    at org.apache.lucene.codecs.BlockTreeTermsWriter.close(BlockTreeTermsWriter.java:1140)
    at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsConsumer.close(BloomFilterPostingsFormat.java:371)
    at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat$1.close(Elasticsearch090PostingsFormat.java:61)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsConsumerAndSuffix.close(PerFieldPostingsFormat.java:86)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:163)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.close(PerFieldPostingsFormat.java:154)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4119)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3716)
    at org.apache.lucene.index.TrackingSerialMergeScheduler.merge(TrackingSerialMergeScheduler.java:122)
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:89)
    ... 9 more
    Suppressed: java.io.FileNotFoundException: _7n_es090_0.tip
        ... 26 more
    Suppressed: java.io.FileNotFoundException: _7n_es090_0.doc
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
        at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
        at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
        at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
        at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.close(Lucene41PostingsWriter.java:587)
        ... 23 more
        Suppressed: java.io.FileNotFoundException: _7n_es090_0.pos
            ... 28 more
[2014-05-01 00:16:49,913][WARN ][cluster.action.shard     ] [Captain Wings] [index_65][0] sending failed shard for [index_65][0], node[Fj243PXdSNGtSm_jLNl9hQ], [P], s[STARTED], indexUUID [v6ZQJKpTRbiZZ5e6Q6ENyw], reason [engine failure, message [MergeException[java.io.FileNotFoundException: _7n_es090_0.tim]; nested: FileNotFoundException[_7n_es090_0.tim]; ]]
[2014-05-01 00:16:49,913][WARN ][cluster.action.shard     ] [Captain Wings] [index_65][0] received shard failed for [index_65][0], node[Fj243PXdSNGtSm_jLNl9hQ], [P], s[STARTED], indexUUID [v6ZQJKpTRbiZZ5e6Q6ENyw], reason [engine failure, message [MergeException[java.io.FileNotFoundException: _7n_es090_0.tim]; nested: FileNotFoundException[_7n_es090_0.tim]; ]]
[2014-05-01 00:17:03,453][WARN ][index.merge.scheduler    ] [Captain Wings] [index_221][0] failed to merge
java.io.FileNotFoundException: _7e_es090_0.tim
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
    at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
    at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
    at org.apache.lucene.util.IOUtils.closeWhileHandlingException(IOUtils.java:81)
    at org.apache.lucene.codecs.BlockTreeTermsWriter.close(BlockTreeTermsWriter.java:1140)
    at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsConsumer.close(BloomFilterPostingsFormat.java:371)
    at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat$1.close(Elasticsearch090PostingsFormat.java:61)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsConsumerAndSuffix.close(PerFieldPostingsFormat.java:86)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:163)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.close(PerFieldPostingsFormat.java:154)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4119)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3716)
    at org.apache.lucene.index.TrackingSerialMergeScheduler.merge(TrackingSerialMergeScheduler.java:122)
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:89)
    at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:71)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1936)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1930)
    at org.elasticsearch.index.merge.Merges.maybeMerge(Merges.java:47)
    at org.elasticsearch.index.engine.internal.InternalEngine.maybeMerge(InternalEngine.java:926)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineMerger$1.run(InternalIndexShard.java:966)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
    Suppressed: java.io.FileNotFoundException: _7e_es090_0.tip
        ... 26 more
    Suppressed: java.io.FileNotFoundException: _7e_es090_0.doc
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
        at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
        at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
        at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
        at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.close(Lucene41PostingsWriter.java:587)
        ... 23 more
        Suppressed: java.io.FileNotFoundException: _7e_es090_0.pos
            ... 28 more
[2014-05-01 00:17:03,454][WARN ][index.engine.internal    ] [Captain Wings] [index_221][0] failed engine
org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: _7e_es090_0.tim
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:92)
    at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:71)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1936)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1930)
    at org.elasticsearch.index.merge.Merges.maybeMerge(Merges.java:47)
    at org.elasticsearch.index.engine.internal.InternalEngine.maybeMerge(InternalEngine.java:926)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineMerger$1.run(InternalIndexShard.java:966)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: _7e_es090_0.tim
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
    at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
    at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
    at org.apache.lucene.util.IOUtils.closeWhileHandlingException(IOUtils.java:81)
    at org.apache.lucene.codecs.BlockTreeTermsWriter.close(BlockTreeTermsWriter.java:1140)
    at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsConsumer.close(BloomFilterPostingsFormat.java:371)
    at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat$1.close(Elasticsearch090PostingsFormat.java:61)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsConsumerAndSuffix.close(PerFieldPostingsFormat.java:86)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:163)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.close(PerFieldPostingsFormat.java:154)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4119)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3716)
    at org.apache.lucene.index.TrackingSerialMergeScheduler.merge(TrackingSerialMergeScheduler.java:122)
    at org.elasticsearch.index.merge.scheduler.SerialMergeSchedulerProvider$CustomSerialMergeScheduler.merge(SerialMergeSchedulerProvider.java:89)
    ... 9 more
    Suppressed: java.io.FileNotFoundException: _7e_es090_0.tip
        ... 26 more
    Suppressed: java.io.FileNotFoundException: _7e_es090_0.doc
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
        at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
        at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:611)
        at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
        at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.close(Lucene41PostingsWriter.java:587)
        ... 23 more
        Suppressed: java.io.FileNotFoundException: _7e_es090_0.pos
            ... 28 more
[2014-05-01 00:17:06,961][INFO ][node                     ] [Captain Wings] stopped
[2014-05-01 00:17:06,961][INFO ][node                     ] [Captain Wings] closing ...
[2014-05-01 00:17:06,966][INFO ][node                     ] [Captain Wings] closed

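EDIT: Another run, storing 3 million documents across 100 indices. Here is a BigDesk screenshot taken when ES crashed: BigDesk Capture

Here is the stack trace (hosted externally because it is too large to paste): http://m.uploadedit.com/b034/1399053538319.txt

It stopped at ~500k documents. I was checking the file descriptors open for the process, and there were < 5000. The JVM heap (-Xmx) was 20 GB, with 5 shards...

In any case, storing 3m documents in 1 index works fine.

EDIT: ulimit -a output: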

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 240150
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

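The open-files limit in /etc/security/limits.conf has been changed to 64000; that does not show up here. I am monitoring the number of open files in the /proc//fd/ folder, and it exceeds 1024 (around 6k with 100 indices, I think) but stays far below 64k.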
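Since ulimit -a in a shell and the limit in force for the running ES process can differ, one way to check both numbers for a given process is to read them from procfs. This is a sketch assuming a Linux procfs layout; the function names and the proc_root parameter (there so the code can be pointed at a copy of the files) are hypothetical.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count the entries under <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def max_open_files(pid, proc_root="/proc"):
    """Return the (soft, hard) 'Max open files' limits of a process as strings.

    The values are read from <proc_root>/<pid>/limits and may be 'unlimited'.
    Returns None if the line is not present.
    """
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        for line in f:
            if line.startswith("Max open files"):
                fields = line.split()
                # fields: ['Max', 'open', 'files', soft, hard, units]
                return fields[3], fields[4]
    return None


# Usage on the ES host, with the pid reported by /_nodes (20155 in the dump above):
#   print(count_open_fds(20155), max_open_files(20155))
```

Reading /proc/<pid>/limits shows the limit the process really started with, which is what matters when limits.conf has been edited but the shell still reports 1024.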

EDIT: The result of http://54.227.158.137:9200/_nodes/stats/indices?pretty just before the crash:

{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "t5FjNo1xQbCk97Qqv0w2sQ" : {
      "timestamp" : 1399395162423,
      "name" : "Dementia",
      "transport_address" : "inet[/10.69.21.196:9300]",
      "host" : "ip-10-69-21-196",
      "ip" : [ "inet[/10.69.21.196:9300]", "NONE" ],
      "indices" : {
        "docs" : { "count" : 433533, "deleted" : 0 },
        "store" : { "size_in_bytes" : 10837028821, "throttle_time_in_millis" : 1971123 },
        "indexing" : { "index_total" : 435229, "index_time_in_millis" : 1651288, "index_current" : 0, "delete_total" : 0, "delete_time_in_millis" : 0, "delete_current" : 0 },
        [...]
        "merges" : { "current" : 4, "current_docs" : 574, "current_size_in_bytes" : 10968915, "total" : 13935, "total_time_in_millis" : 9056598, "total_docs" : 635889, "total_size_in_bytes" : 15384667490 },
        "refresh" : { "total" : 161452, "total_time_in_millis" : 9013814 },
        "flush" : { "total" : 1000, "total_time_in_millis" : 730342 },
        "warmer" : { "current" : 0, "total" : 176035, "total_time_in_millis" : 18410 },
        "filter_cache" : { "memory_size_in_bytes" : 0, "evictions" : 0 },
        "id_cache" : { "memory_size_in_bytes" : 0 },
        "fielddata" : { "memory_size_in_bytes" : 0, "evictions" : 0 },
        "percolate" : { "total" : 0, "time_in_millis" : 0, "current" : 0, "memory_size_in_bytes" : -1, "memory_size" : "-1b", "queries" : 0 },
        "completion" : { "size_in_bytes" : 0 },
        "segments" : { "count" : 36091, "memory_in_bytes" : 76230702 },
        "translog" : { "operations" : 80160, "size_in_bytes" : 1070283 }
      }
    }
  }
}

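1 answer:

Answer 0 (score: 2)

Have you tuned the field data cache? It is unbounded by default, and with that many indices it would not be surprising to run out of memory. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html

Update, which apparently solved the problem: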

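The OOM is happening with memory-mapped files. This kind of memory is allocated outside the regular JVM heap and is not subject to garbage collection. Take a look at this thread for some possible approaches: groups.google.com/forum/#!topic/elasticsearch/4Nj_HUl78KA. Apparently Linux limits the available memory by default, and you should be able to remove this limit.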
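The answer does not name the limit in question. One Linux limit that bears on memory-mapped files is vm.max_map_count, the maximum number of mappings a single process may hold; treating it as the relevant one here is an assumption. Under that assumption, a rough way to see how close a process is to it is to count the lines of /proc/<pid>/maps (one line per mapping). The function names and the proc_root parameter are hypothetical.

```python
import os


def count_memory_mappings(pid, proc_root="/proc"):
    """Count the mappings of a process: one line per mapping in <pid>/maps."""
    with open(os.path.join(proc_root, str(pid), "maps")) as f:
        return sum(1 for _ in f)


def max_map_count(proc_root="/proc"):
    """Read the system-wide per-process mapping limit (sysctl vm.max_map_count)."""
    with open(os.path.join(proc_root, "sys", "vm", "max_map_count")) as f:
        return int(f.read().strip())


def mapping_headroom(pid, proc_root="/proc"):
    """Return how many more mappings the process could create before the limit."""
    return max_map_count(proc_root) - count_memory_mappings(pid, proc_root)


# Usage on the ES host (pid of the Elasticsearch JVM is an input, not known here):
#   print(mapping_headroom(es_pid))
```

If the headroom shrinks toward zero as indices are added, that would point at the mapping count; if it stays large, the limit meant in the thread is more likely a different one, such as a virtual-memory ulimit.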