NEST Elasticsearch query works for a few hours, then stops

Date: 2017-09-12 11:45:22

Tags: elasticsearch kibana nest

I have a very strange situation with Elasticsearch 5.4 and NEST 5.4.0. I wrote a simple C# console application that queries Elasticsearch once a minute, returns the hits/documents, and stores them in a Postgres database for further processing. It works for a few hours, then starts returning queries with valid .DebugInformation but zero documents; yet I can copy the same query, run it in Kibana Dev Tools, and get results. When I stop the console application and restart it, it queries successfully, returns hits, and everything is fine. Below are code samples and log entries. I'm trying to figure out why it stops working after a while. This is a .NET Core C# console application using NEST.

I'm not sure whether .DebugInformation returns any information about ES health at the time, to see whether there was a cluster problem such as 429s. I looked at elasticsearch.log and it only shows inserts. I'm not sure where else to look for query problems.

Has anyone had NEST work fine and then stop?

Here is the query log for two runs. The first runs fine and returns 9 rows (I removed all but one from the sample because of sensitive data); then it runs again and returns zero hits. Every query after that returns no hits until I restart the C# code. With the same start and end date inputs I get real data in Elastic ....

2017-09-12 16:41:59.799 -05:00 [Information] Dates: Start 9/12/2017 4:41:00 PM End 9/12/2017 4:42:00 PM
2017-09-12 16:41:59.800 -05:00 [Debug] AlertService._queryErrors: 9/12/2017 4:41:00 PM End 9/12/2017 4:42:00 PM
2017-09-12 16:41:59.811 -05:00 [Debug] AlertService._elasticQueryLogErrors: elasticQuery {
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:41:00Z",
                                                    "lte": "2017-09-12T21:42:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
2017-09-12 16:41:59.811 -05:00 [Debug] AlertService._elasticQueryLogErrors: searchResponse 9 : Valid NEST response built from a successful low level call on POST: /filebeat-%2A/_search
# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://servername:9200/ Took: 00:00:00.0112120
# Request:
{"from":0,"query":{
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:41:00Z",
                                                    "lte": "2017-09-12T21:42:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
# Response:
{"took":7,"timed_out":false,"_shards":{"total":215,"successful":215,"failed":0},"hits":{"total":9,"max_score":0.0,"hits":[{"_index":"filebeat-2017.09.12","_type":"log","_id":"AV54Cdl2yay890uCUru4","_score":0.0,"_source":{"offset":237474,"target_url":"...url...","input_type":"log","source":"....source....","type":"log","tags":["xxx-001","beats_input_codec_plain_applied"],"@timestamp":"2017-09-12T21:41:02.000Z","@version":"1","beat":{"hostname":"xxx-001","name":"xxx-001","version":"5.4.3"},"host":"xxx-001","timestamp":"09/12/2017 16:41:02","error_data":"EXCEPTION, see detail log"}}]}}

2017-09-12 16:41:59.811 -05:00 [Debug] AlertService._queryErrors: (result) System.Collections.Generic.List`1[XX.Alerts.Core.Models.FilebeatModel]
2017-09-12 16:41:59.811 -05:00 [Information] ErrorCount: 9

2017-09-12 16:42:00.222 -05:00 [Information] Dates: Start 9/12/2017 4:42:00 PM End 9/12/2017 4:43:00 PM
2017-09-12 16:42:00.222 -05:00 [Debug] AlertService._queryErrors: 9/12/2017 4:42:00 PM End 9/12/2017 4:43:00 PM
2017-09-12 16:42:00.229 -05:00 [Debug] AlertService._elasticQueryLogErrors: elasticQuery {
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:42:00Z",
                                                    "lte": "2017-09-12T21:43:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
2017-09-12 16:42:00.229 -05:00 [Debug] AlertService._elasticQueryLogErrors: searchResponse 0 : Valid NEST response built from a successful low level call on POST: /filebeat-%2A/_search
# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://servername:9200/ Took: 00:00:00.0066742
# Request:
{"from":0,"query":{
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:42:00Z",
                                                    "lte": "2017-09-12T21:43:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
# Response:
{"took":4,"timed_out":false,"_shards":{"total":215,"successful":215,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

2017-09-12 16:42:00.229 -05:00 [Debug] AlertService._queryErrors: (result) System.Collections.Generic.List`1[Q2.Alerts.Core.Models.FilebeatModel]
2017-09-12 16:42:00.229 -05:00 [Information] ErrorCount: 0

Here is my NEST query:

    public IEnumerable<FilebeatModel> _elasticQueryLogErrors(DateTime startDate, DateTime endDate)
    {
        //var startDateString = startDate.Kind;
        //var endDateString = endDate.Kind;

        var searchQuery = @"{
                ""bool"": {
                    ""filter"":
                        [ {
                            ""range"":
                            { ""@timestamp"": { ""gte"": """ + string.Format("{0:yyyy-MM-ddTHH:mm:ssZ}", startDate.ToUniversalTime()) +
                    @""",
                                                ""lte"": """ + string.Format("{0:yyyy-MM-ddTHH:mm:ssZ}", endDate.ToUniversalTime()) + @""" }
                            }
                          },
                          {
                            ""exists"" : { ""field"" : ""error_data"" }
                          }
                        ]
                    } }";

        var searchResponse = _es.Search<FilebeatModel>(s => s
            .AllTypes()
            .From(0)
            .Query(query => query.Raw(searchQuery)));

        _logger.LogDebug("AlertService._elasticQueryLogErrors: elasticQuery " + searchQuery);

        _logger.LogDebug("AlertService._elasticQueryLogErrors: searchResponse " + searchResponse.Hits.Count + " : " + searchResponse.DebugInformation);

        foreach (var searchResponseHit in searchResponse.Hits)
        {
            searchResponseHit.Source.Id = searchResponseHit.Id;
        }

        return searchResponse.Documents.ToList();
    }
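As an aside, the same bool/filter query can be expressed with NEST's typed fluent DSL instead of hand-built raw JSON, which avoids manually formatting the timestamp bounds. This is only a sketch of an equivalent query (not the poster's code), assuming the same `_es` client and `FilebeatModel` type:

```csharp
// Sketch: equivalent query via NEST 5.x's fluent DSL. DateRange
// serializes DateTime bounds itself, so no string.Format is needed.
var searchResponse = _es.Search<FilebeatModel>(s => s
    .AllTypes()
    .From(0)
    .Query(q => q
        .Bool(b => b
            .Filter(
                f => f.DateRange(r => r
                    .Field("@timestamp")
                    .GreaterThanOrEquals(startDate.ToUniversalTime())
                    .LessThanOrEquals(endDate.ToUniversalTime())),
                f => f.Exists(e => e.Field("error_data"))))));
```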

Here is the constructor of the class that runs the code above in a loop. The loop may run for hours or days. The key to my problem may be how the connection holds up over that long a period. When I shut down and reopen the application and run queries for the missed time windows, they run fine.

    public AlertService(IOptions<ElasticConfig> elasticConfig, AlertsDbContext context, ILogger<AlertService> logger)
    {
        _logger = logger;

        _logger.LogDebug(" *** Entering AlertService");
        string elasticConnectionString = elasticConfig.Value.ConnectionString;
        string defaultIndex = elasticConfig.Value.IndexName;

        var settings = new ConnectionSettings(
                new Uri(elasticConnectionString))
            .ConnectionLimit(-1)
            .DisableDirectStreaming()
            .DefaultIndex(defaultIndex);

        _es = new ElasticClient(settings);
        _context = context;
    }

1 answer:

Answer 0 (score: 1)

I've confirmed this was a race condition of my own making: an internal timer was drifting relative to the calls to Elastic, as pointed out in the comments. It is not a bug in NEST, just my code and its timing. I've used a System.Threading.Timer to align each callback, and it works correctly. Thanks to Val for the help.
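The answer describes the fix only in outline. A minimal sketch of what aligning a `System.Threading.Timer` to whole-minute boundaries could look like (the `AlertLoop` class and method names here are illustrative, not the poster's actual code):

```csharp
using System;
using System.Threading;

static class AlertLoop
{
    // Delay until the next whole-minute boundary, so every callback
    // fires at :00 seconds and the query window stays aligned with
    // the minute being indexed.
    public static TimeSpan DelayToNextMinute(DateTime now)
    {
        var startOfMinute = now.AddTicks(-(now.Ticks % TimeSpan.TicksPerMinute));
        return startOfMinute.AddMinutes(1) - now;
    }

    // Start a timer whose first tick lands on a minute boundary and
    // which then repeats every minute. The caller keeps the returned
    // Timer referenced (so it is not garbage collected) and disposes
    // it on shutdown.
    public static Timer Start(Action<DateTime, DateTime> queryWindow)
    {
        return new Timer(_ =>
        {
            // Query the previous full minute: [start, end).
            var end = DateTime.UtcNow;
            end = end.AddTicks(-(end.Ticks % TimeSpan.TicksPerMinute));
            queryWindow(end.AddMinutes(-1), end);
        },
        null,
        DelayToNextMinute(DateTime.UtcNow),   // first tick on the boundary
        TimeSpan.FromMinutes(1));             // then every minute
    }
}
```

Usage would be something like `var timer = AlertLoop.Start((start, end) => _elasticQueryLogErrors(start, end));`, keeping `timer` alive for the life of the process. Because each window is derived from the tick time rather than accumulated from the previous iteration, the query range cannot creep out of step with the clock.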