Elasticsearch bulk insert with NEST returns es_rejected_execution_exception

Date: 2017-04-18 08:02:54

Tags: elasticsearch bulkinsert nest

I'm trying to do a bulk insert into Elasticsearch using the .NET API, and this is the error I get when performing the operation:

Error   {Type: es_rejected_execution_exception Reason: "rejected execution of org.elasticsearch.transport.TransportService$6@604b47a4 on EsThreadPoolExecutor[bulk, queue capacity = 50, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@51f4f734[Running, pool size = 4, active threads = 4, queued tasks = 50, completed tasks = 164]]" CausedBy: ""}   Nest.BulkError

Is this because my system is running out of space, or is the bulk insert itself not working? My NEST version is 5.0, and my Elasticsearch version is also 5.0.

The code for the bulk insert logic:

public void bulkInsert(List<BaseData> recordList, List<String> listOfIndexName) {
    BulkDescriptor descriptor = new BulkDescriptor();            
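    // every document is added to one and the same bulk request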
    foreach (var j in Enumerable.Range(0, recordList.Count)) {
        descriptor.Index<BaseData>(op => op.Document(recordList[j])
                                           .Index(listOfIndexName[j]));
    }
    var result = clientConnection.Bulk(descriptor);
}

1 Answer:

Answer (score: 5)

As Val said in the comments, you're likely sending more data at once than your cluster can handle. The error itself shows what's happening: the bulk thread pool queue (capacity 50) is full, so further bulk tasks are rejected. It looks like you may be trying to send all of your documents in a single bulk request, which won't work well for a large number of documents or for large documents.

With the _bulk API, you need to send your data to the cluster in multiple bulk requests and find the optimum number of documents to send in each one, in addition to the number of bulk requests that can be sent to the cluster concurrently.

There are no hard and fast rules for the optimum size here, because it can vary with the complexity of your documents, how they are analyzed, the cluster hardware, cluster settings, index settings, and so on.

The best thing to do is to start with a reasonable number, say 500 documents per request (or whatever number makes sense in your context), and go from there. Calculating the total size in bytes of each bulk request is also a good approach. If performance and throughput are still insufficient, increase the number of documents, the request size in bytes, or the number of concurrent requests until you start to see es_rejected_execution_exception, then back off.
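For illustration, here is a minimal sketch of the bulkInsert method from the question reworked to send documents in batches. It assumes the same clientConnection, BaseData type, and parallel index-name list as the question; the batch size of 500 and the method name are placeholders to tune from, not recommendations:

public void BulkInsertInBatches(List<BaseData> recordList, List<String> listOfIndexName, int batchSize = 500)
{
    // send one bulk request per batch instead of a single request for everything
    for (var start = 0; start < recordList.Count; start += batchSize)
    {
        var descriptor = new BulkDescriptor();
        var end = Math.Min(start + batchSize, recordList.Count);

        for (var j = start; j < end; j++)
        {
            var i = j; // copy the loop variable for the closure
            descriptor.Index<BaseData>(op => op
                .Document(recordList[i])
                .Index(listOfIndexName[i]));
        }

        var response = clientConnection.Bulk(descriptor);

        // individual operations can fail even when the request as a whole succeeds
        if (response.Errors)
        {
            // inspect response.ItemsWithErrors here and log or retry as appropriate
        }
    }
}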

NEST 5.x ships with a handy helper that makes this kind of bulk request much easier, using IObservable<T> and the Observable design pattern:
void Main()
{
    var client = new ElasticClient();

    // can cancel the operation by calling .Cancel() on this
    var cancellationTokenSource = new CancellationTokenSource();

    // set up the bulk all observable
    var bulkAllObservable = client.BulkAll(GetDocuments(), ba => ba
        // number of concurrent requests
        .MaxDegreeOfParallelism(8)
        // in case of 429 response, how long we should wait before retrying
        .BackOffTime(TimeSpan.FromSeconds(5))
        // in case of 429 response, how many times to retry before failing
        .BackOffRetries(2)
        // number of documents to send in each request
        .Size(500)
        .Index("index-name")
        .RefreshOnCompleted(),
        cancellationTokenSource.Token
    );

    var waitHandle = new ManualResetEvent(false);
    Exception ex = null;

    // what to do on each call, when an exception is thrown, and 
    // when the bulk all completes
    var bulkAllObserver = new BulkAllObserver(
        onNext: bulkAllResponse =>
        {
            // do something after each bulk request
        },
        onError: exception =>
        {
            // do something with exception thrown
            ex = exception;
            waitHandle.Set();
        },
        onCompleted: () =>
        {
            // do something when all bulk operations complete
            waitHandle.Set();
        });

    bulkAllObservable.Subscribe(bulkAllObserver);

    // wait for handle to be set.
    waitHandle.WaitOne();

    if (ex != null)
    {
        throw ex;
    }
}

// Ideally, the documents should come from a lazily enumerated collection
public static IEnumerable<Document> GetDocuments()
{
    return Enumerable.Range(1, 10000).Select(x =>
        new Document
        {
            Id = x,
            Name = $"Document {x}" 
        }
    );
}

public class Document
{
    public int Id { get; set; }
    public string Name { get; set; }
}
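
Note that the BackOffTime and BackOffRetries settings in the example above are what address the original error directly: when the bulk thread pool queue fills up, Elasticsearch responds with HTTP 429 carrying es_rejected_execution_exception, and the BulkAll helper waits and retries those requests instead of failing immediately.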