Setting the refresh interval in Elasticsearch to improve io-wait?

Time: 2013-07-03 14:01:09

Tags: elasticsearch

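My cluster is showing a lot of io-wait (around 50%).

I do a lot of indexing and re-indexing.

I suspect Lucene's re-indexing may be the cause of the IO. I was thinking of increasing refresh_interval or tuning the index.translog options - is that the right approach?

My main problem is that I don't know how to find out what my current settings are.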

http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings/ lists a lot of options, but none of them show up when I run:

curl -XGET 'http://localhost:9200/my_index/_settings'

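Settings that are left at their defaults are not returned (according to kimchy's answer in this post).

I only get the number of shards and the number of replicas, which were set explicitly. The elasticsearch.yml file doesn't say what the defaults are. How can I tell whether my change took effect, and what the values are now?

Any help is much appreciated, since I couldn't find documentation on this.

Running hot_threads, I got: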

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=5'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

   50.6% (253.2ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#20]'
     10/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   32.9% (164.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#12]'
     10/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   29.1% (145.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#8]'
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:131)
       org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:533)
       org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:133)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:609)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:161)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:572)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:501)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:345)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:206)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:193)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:179)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)
     8/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   26.5% (132.7ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#11]'
     2/10 snapshots sharing following 15 elements
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:161)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:572)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:501)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:345)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:206)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:193)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:179)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)
     8/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

    4.2% (21.1ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][bulk][T#4]'
     10/10 snapshots sharing following 9 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:706)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.xfer(LinkedTransferQueue.java:615)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.take(LinkedTransferQueue.java:1109)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

Running with type=wait and type=block:

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=3&type=wait'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

    0.0% (0s out of 500ms) wait usage by thread 'Reference Handler'
     10/10 snapshots sharing following 3 elements
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:503)
       java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

    0.0% (0s out of 500ms) wait usage by thread 'Finalizer'
     10/10 snapshots sharing following 4 elements
       java.lang.Object.wait(Native Method)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
       java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

    0.0% (0s out of 500ms) wait usage by thread 'Signal Dispatcher'
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=3&type=block'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

    0.0% (0s out of 500ms) block usage by thread 'Reference Handler'
     10/10 snapshots sharing following 3 elements
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:503)
       java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

    0.0% (0s out of 500ms) block usage by thread 'Finalizer'
     10/10 snapshots sharing following 4 elements
       java.lang.Object.wait(Native Method)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
       java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

    0.0% (0s out of 500ms) block usage by thread 'Signal Dispatcher'
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot

1 Answer:

Answer 0: (score: 15)

By default, index.refresh_interval is set to 1 second. You can increase this interval, or disable automatic refresh altogether by setting it to -1.

curl -XPUT 'localhost:9200/my_index/_settings' -d '
{
    "index" : {
        "refresh_interval" : -1
    }
}
'
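
On the question of how to tell whether a change took effect: since `_settings` returns values that were set explicitly, one option is to update the interval and read the settings straight back. The following is only a sketch under assumptions: the helper names `refresh_payload` and `set_refresh_interval`, the `ES_URL` variable, and the `30s` value are made up for illustration, and the node is assumed to be reachable at localhost:9200 as in the question.

```shell
#!/bin/sh
# Hypothetical helpers; host, index name and interval values are assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for an index.refresh_interval update.
refresh_payload() {
    printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# Apply the interval to an index, then read the settings back.
# A value that was set explicitly should appear in the _settings response.
set_refresh_interval() {
    index="$1"
    interval="$2"
    # The Content-Type header is required by newer Elasticsearch versions.
    curl -s -XPUT "$ES_URL/$index/_settings" \
        -H 'Content-Type: application/json' \
        -d "$(refresh_payload "$interval")"
    echo
    curl -s -XGET "$ES_URL/$index/_settings"
    echo
}

# Example usage (nothing is sent until you call the function):
#   set_refresh_interval my_index 30s   # refresh less often during heavy indexing
#   set_refresh_interval my_index 1s    # restore the one-second default
```

Keeping the payload builder separate from the network call makes it easy to inspect exactly what will be sent before touching the cluster.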

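However, before you start messing with the settings, I would suggest finding out the actual cause of this high I/O. Run a hot_threads request and check where the threads are spending most of their time.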
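
hot_threads output is long, so it can help to reduce it to one line per thread before reading the stack traces. A minimal sketch, assuming the output format shown in the question (a header line of the form `NN.N% (...) cpu usage by thread '...'`); the function name `summarize_hot_threads` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read hot_threads output on stdin and print
# "<percent> <thread name>" for every "usage by thread" header line.
# All other lines (stack frames, node headers) are dropped.
summarize_hot_threads() {
    sed -n "s/^ *\([0-9.]*\)% .* usage by thread '\(.*\)'\$/\1 \2/p"
}

# Example usage (assumes a node on localhost:9200):
#   curl -s 'http://localhost:9200/_nodes/hot_threads?threads=5' | summarize_hot_threads
```

The same filter works for the `type=wait` and `type=block` variants, because their header lines differ only in the word before "usage".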