Filtering an Elasticsearch query on a specific field

Time: 2015-12-29 16:59:05

Tags: python elasticsearch http-post elasticsearch-plugin

I have documents like this:

  {
    _index: "logstash-2015.11.30",
    _type: "hadoopgeneric",
    _id: "AVFVsF6ypMu_z_qvIUgL",
    _score: null,
    _source: {
             @timestamp: "2015-11-30T00:00:00.017Z",
             message: "selector : 48 - Element found for using multiple selectors using query .js-product-brand.product-brand",
             @version: "1",
             host: "ip-x-x-x-x",
             path: "/logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr",
             type: "hadoopgeneric",
             thread_id: "15119",
             thread_name: "MainThread",
             component_name: "Page",
             severity: "DEBUG",
             env: "STG",
             role: "spider",
             ip: "x.x.x.x",
             tags: [
                 "processed"
             ]
            },
   }

I need to filter for the documents whose path is /logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr (matching on the path field specifically).

I tried this general search query: http://localhost:9200/logstash-*/_search?pretty=true&q="/logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr"&sort=@timestamp&size=100000

It gave me results, but now I want to search only in the path field. The following query returns no results: http://localhost:9200/logstash-*/_search?pretty=true&q="path: /logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr"&sort=@timestamp&size=100000

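I have been reading the Elasticsearch documentation on the Term Query, but I don't know how to send such a query to Elasticsearch as POST parameters. I am using a Python library to make POST requests to Elasticsearch.

Here is what I have tried so far: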

import requests

esurl = "http://localhost:9200/logstash-*/_search"
r = requests.post(esurl,data={"term":{'path':'/logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr'}})
r.text

{"error":"SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[5D_RNDQPRf6xyLO1suIoCA][logstash-2015.11.30][0]: RemoteTransportException[[ip-x-x-x-x-elkstorage][inet[/x.x.x.x:9300]][indices:data/read/search[phase/query]]]; nested: SearchParseException[[logstash-2015.11.30][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [_na_]]]; nested: ElasticsearchParseException[Failed to derive xcontent]; }{[o8jLb8P5SWOfsCo78eUlHg][logstash-2015.12.01][0]: RemoteTransportException[[ip-x-x-x-x-elkstorage][inet[/x.x.x.x:9300]][indices:data/read/search[phase/query]]]; nested: SearchParseException[[logstash-2015.12.01][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [_na_]]]; nested: ElasticsearchParseException[Failed to derive xcontent];}

2 Answers:

Answer 0 (score: 2)

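The q parameter looks wrong (the " characters are in the wrong place: they should wrap only the value, not the field name). Try this: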

http://localhost:9200/logstash-*/_search?pretty=true&q=path:"/logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr"&sort=@timestamp&size=100000
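If you build this URL from Python, it may be easier to let the standard library handle the quoting and URL-encoding. The following is only a sketch: the helper name build_uri_search and the default size of 100 are my own choices, and it assumes the value contains no double quotes.

```python
from urllib.parse import urlencode

ES_SEARCH_URL = "http://localhost:9200/logstash-*/_search"
LOG_PATH = (
    "/logs/stats/container/application_1448508514184_0178/"
    "container_e06_1448508514184_0178_01_003568/stderr"
)


def build_uri_search(field, value, size=100):
    """Return a URI-search URL that looks for `value` as a phrase in `field`.

    Hypothetical helper; assumes `value` contains no double quotes.
    """
    params = {
        "pretty": "true",
        # The field name stays outside the quotes; only the value is quoted.
        "q": '{}:"{}"'.format(field, value),
        "sort": "@timestamp",
        "size": size,
    }
    # urlencode percent-encodes the quotes, slashes and '@' for us.
    return ES_SEARCH_URL + "?" + urlencode(params)


url = build_uri_search("path", LOG_PATH)
print(url)
```

The resulting URL can be opened in a browser or passed to requests.get(url).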

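On the other hand, the term query does work, but it has to go inside the query key, and the body has to be sent as JSON. For example: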

import requests
import json

esurl = "http://localhost:9200/logstash-*/_search"
r = requests.post(esurl,data=json.dumps({"query": {"term":{'path':'/logs/stats/container/application_1448508514184_0178/container_e06_1448508514184_0178_01_003568/stderr'}}}))
r.text
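As far as I can tell, the original attempt failed for two reasons: requests form-encodes a dict passed as data (so Elasticsearch did not receive JSON, hence "Failed to derive xcontent"), and the term clause was not wrapped in query. Here is a minimal sketch of building the JSON body separately, with the sort and size from the URL version carried over. The helper name build_term_search and the default size are my own choices. Also note that a term query is not analyzed, so if path is an analyzed field in your mapping it may not match; that depends on your index template, which I can't see.

```python
import json

LOG_PATH = (
    "/logs/stats/container/application_1448508514184_0178/"
    "container_e06_1448508514184_0178_01_003568/stderr"
)


def build_term_search(field, value, size=100, sort_field="@timestamp"):
    """Return a JSON string for a search body with a single term query.

    Hypothetical helper; the defaults are illustrative only.
    """
    body = {
        # The term clause must be nested under the top-level "query" key.
        "query": {"term": {field: value}},
        "sort": [{sort_field: {"order": "asc"}}],
        "size": size,
    }
    # Serialize explicitly so the request body is JSON, not form data.
    return json.dumps(body)


payload = build_term_search("path", LOG_PATH)
print(payload)
```

You would then send it the same way as above, e.g. requests.post(esurl, data=payload).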

Answer 1 (score: 0)

Building Elasticsearch DSL queries correctly is a pain, and it is easy to get them wrong. For most use cases I just use the query builder in the Head plugin or the SQL-to-ES plugin.

Both provide a simple UI for generating queries; you can convert the result to JSON and use it in your code.

They take some work to install, but if you need to write a lot of ES queries it is really worth it.
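Once a query builder has given you JSON, using it from Python can be as simple as the sketch below. The query text here is a made-up example, not output from either plugin, and load_generated_query is a hypothetical helper that only checks for the top-level query key.

```python
import json

# Made-up example of JSON text copied out of a query-builder UI.
GENERATED_QUERY = """
{
  "query": {
    "match": {"severity": "DEBUG"}
  },
  "size": 10
}
"""


def load_generated_query(text):
    """Parse generated JSON and check that it looks like a search body."""
    body = json.loads(text)
    if "query" not in body:
        raise ValueError("expected a top-level 'query' key")
    return body


search_body = load_generated_query(GENERATED_QUERY)
# Re-serialize before sending so the request body is valid JSON.
payload = json.dumps(search_body)
print(payload)
```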

head plugin - does more than just build queries.

sql plugin