Question

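I recently had to close and reopen an Elasticsearch index in order to add a custom analyzer and create a mapping. Since then I have been seeing disk space problems on all nodes, and I am not sure of the best way to deal with them.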
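For context, the sequence I mean can be sketched roughly as below. This is only an illustration: the analyzer name, type name, field name, and analyzer definition are hypothetical placeholders, not what we actually used, and the field type `string` assumes a pre-5.x Elasticsearch. The sketch only builds the request list in order and sends nothing.

```python
import json


def close_update_open_requests(index, analyzer_name, doc_type, field):
    """Return the (method, path, body) steps in order; nothing is sent."""
    # Hypothetical analyzer definition, for illustration only.
    analysis = {
        "analysis": {
            "analyzer": {
                analyzer_name: {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    }
    # Hypothetical mapping that uses the new analyzer on one field.
    mapping = {
        "properties": {field: {"type": "string", "analyzer": analyzer_name}}
    }
    return [
        ("POST", "/%s/_close" % index, None),       # analysis settings need a closed index
        ("PUT", "/%s/_settings" % index, analysis),
        ("POST", "/%s/_open" % index, None),
        ("PUT", "/%s/_mapping/%s" % (index, doc_type), mapping),
    ]


steps = close_update_open_requests("prod", "my_analyzer", "my_type", "title")
for method, path, body in steps:
    print(method, path, json.dumps(body) if body else "")
```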

Here are some selected lines from the master node's log file for that day:

[2016-03-18 01:54:46,161][INFO ][cluster.metadata] [instance name] closing indices [[prod]]
[2016-03-18 01:54:46,161][INFO ][cluster.metadata] [instance name] opening indices [[prod]]
[2016-03-18 01:54:48,493][WARN ][cluster.routing.allocation.decider] [instance name] After allocating, node [nodename] would have less than the required 0b free bytes threshold (-28916726190 bytes free), preventing allocation
[2016-03-18 01:54:48,494][WARN ][cluster.routing.allocation.decider] [instance name] After allocating, node [nodename] would have less than the required 0b free bytes threshold  (-29217364398 bytes free), preventing allocation

...one of these lines for each ES node...
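To put the negative "bytes free" figures in perspective, here is a small sketch that pulls the number out of one of the warnings above and converts it to GiB. The regular expression is my own and only assumes the message format shown in these log lines.

```python
import re

LINE = (
    "[2016-03-18 01:54:48,493][WARN ][cluster.routing.allocation.decider] "
    "[instance name] After allocating, node [nodename] would have less than "
    "the required 0b free bytes threshold (-28916726190 bytes free), "
    "preventing allocation"
)


def free_bytes_after_allocation(line):
    """Return the projected free bytes from a decider warning, or None."""
    match = re.search(r"\((-?\d+) bytes free\)", line)
    return int(match.group(1)) if match else None


projected = free_bytes_after_allocation(LINE)
projected_gib = projected / 1024 ** 3
print("projected free space: %.1f GiB" % projected_gib)
```

A negative value means that allocating the shard copy would need more space than the node has left, by roughly that amount.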

Then, for each node, something like this:

[2016-03-18 01:54:49,500][DEBUG][action.search.type] [instance name] All shards failed for phase: [query]
org.elasticsearch.transport.RemoteTransportException: instance name]][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.index.shard.IllegalIndexShardStateException: [prod][0] CurrentState[RECOVERING] operations only allowed when started/relocated
    at org.elasticsearch.index.shard.IndexShard.readAllowed(IndexShard.java:1000)
    at org.elasticsearch.index.shard.IndexShard.acquireSearcher(IndexShard.java:793)
    at org.elasticsearch.index.shard.IndexShard.acquireSearcher(IndexShard.java:789)
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:552)
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:532)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:294)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:776)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:767)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Then more lines like those in the first quoted block, and then:

[2016-03-18 01:56:39,891][INFO ][cluster.metadata] [instance name] [prod] create_mapping [mapping name]

Then two lines like this:

[2016-03-18 02:05:18,993][WARN ][cluster.action.shard] [instance name] [prod][1] received shard failed for [prod][1], node[node name], [R], s[INITIALIZING], indexUUID [index id], reason [shard failure [failed recovery][RecoveryFailedException[[prod][1]: Recovery failed from [instance name]{master=true} into [instance name]{master=false}]; nested: RemoteTransportException[[instance name][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[prod][1] Phase[1] Execution failed]; nested: RecoverFilesRecoveryException[[prod][1] Failed to transfer [0] files with total size of [0b]]; nested: IllegalStateException[try to recover [prod][1] from primary shard with sync id but number of docs differ: 217250828 (instance name, primary) vs 217250830(instance name)]; ]]

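Then a run of low disk watermark errors, followed by high disk watermark errors. These errors have been occurring ever since the close/open commands were run, and they are preventing new data from being indexed.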

When I run /_cat/shards/prod, I see:

index shard prirep state             docs  store ip          node                                    
prod  0     p      STARTED      218452373 73.5gb 
prod  0     r      STARTED      218452373 73.5gb 
prod  0     r      UNASSIGNED                                                                        
prod  1     p      STARTED      217445482 73.1gb 
prod  1     r      STARTED      217445482 73.1gb 
prod  1     r      UNASSIGNED                                                                        
prod  2     r      INITIALIZING                  
prod  2     r      INITIALIZING                 
prod  2     p      STARTED      218665090 73.2gb

and I noticed that one replica of shard 2 oscillates between the INITIALIZING and UNASSIGNED states.
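One way to read the listing above is to count the replica rows per shard. The sketch below does that on the pasted output (node and ip columns omitted, since they are blank here). It assumes the cat shards output has one row per shard copy and that the first four whitespace-separated columns are index, shard, prirep, and state.

```python
from collections import Counter, defaultdict

CAT_SHARDS = """\
prod  0     p      STARTED      218452373 73.5gb
prod  0     r      STARTED      218452373 73.5gb
prod  0     r      UNASSIGNED
prod  1     p      STARTED      217445482 73.1gb
prod  1     r      STARTED      217445482 73.1gb
prod  1     r      UNASSIGNED
prod  2     r      INITIALIZING
prod  2     r      INITIALIZING
prod  2     p      STARTED      218665090 73.2gb
"""


def summarize(text):
    """Count replica rows and shard states per (index, shard number)."""
    replicas = Counter()
    states = defaultdict(Counter)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        index, shard, prirep, state = fields[:4]
        key = (index, int(shard))
        if prirep == "r":
            replicas[key] += 1
        states[key][state] += 1
    return replicas, states


replicas, states = summarize(CAT_SHARDS)
for key in sorted(states):
    print(key, "replica rows:", replicas[key], dict(states[key]))
```

Every shard shows two replica rows next to its primary, which would be consistent with the index currently having 2 replicas configured rather than 1.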

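I'm really hoping someone can advise on the best way forward, because this problem is getting more painful every day. The best approach I can think of right now is to back up all the data, update the index settings to 0 replicas (to get rid of the unassigned shards), and then update them again to add 1 replica (since I suspect the recovery process may have inadvertently added a replica). I can't figure out how to confirm this theory. The most I can tell is that we are not overriding the defaults in elasticsearch.yml (the default is 1 replica), and that our EC2 instances don't seem large enough to hold 2 shards on the same instance. I would really like to know whether anyone has any idea how and why closing and opening an index would cause disk usage to spike. The Elasticsearch documentation mentions that closing an index can lead to this, but it doesn't give much other background.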
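As a rough sketch of that plan, the snippet below builds the settings body for each replica count and estimates total on-disk size from the primary store sizes in the shard listing above (73.5gb, 73.1gb, 73.2gb). It assumes each replica takes about as much space as its primary and that the body would be sent with PUT /prod/_settings. Nothing is sent here.

```python
import json

# Primary store sizes in GB, taken from the shard listing above.
PRIMARY_STORE_GB = [73.5, 73.1, 73.2]


def replica_settings_body(count):
    """Body for an index settings update that sets the replica count."""
    return {"index": {"number_of_replicas": count}}


def estimated_total_gb(primary_sizes_gb, replica_count):
    """Primaries plus replica_count full copies, assuming equal-sized copies."""
    return sum(primary_sizes_gb) * (1 + replica_count)


for count in (0, 1, 2):
    body = json.dumps(replica_settings_body(count))
    total = estimated_total_gb(PRIMARY_STORE_GB, count)
    print("replicas=%d  body=%s  estimated total: %.1f GB" % (count, body, total))
```

Under these assumptions, going from 1 replica to 2 adds about 220 GB across the cluster, which is in the same ballpark as the shortfalls in the allocation warnings.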

If any other information would help, I'll provide it as soon as possible. Thanks (so much) in advance!

Elasticsearch disk space issue after closing/opening an index
