Question

I have the following entries in my logs that I want to find using an Elasticsearch query:

2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}
2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"}
2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="abc123"}
2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="abc123"}
2014-07-02 20:53:00 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="def123"}
2014-07-02 20:53:00 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="def123"}
2014-07-02 20:53:00 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="def123"}
2014-07-02 20:53:00 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="def123"}

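Since each line above is represented as a single "message" in Elasticsearch, I am having a hard time querying it with a POST REST call. I tried the "must match" query below to get only line 1 of my log, but it is inconsistent: sometimes it returns multiple hits instead of just one: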

{
   "query" : {
      "constant_score" : { 
         "filter" : {
            "bool" : {
              "must" : [
                 { "match_phrase_prefix" : {"message" : "home.helloworld:"}}, 
                 { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}},
                 { "match" : {"message" : "received, {\"uuid\"=\"abc123\"}"}} 
              ]
           }
         }
      }
   }
}

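Am I doing something wrong in the Elasticsearch query above? I thought "must" equals AND, "match" is more of a CONTAINS, and "match_phrase_prefix" is STARTSWITH. Can someone show me how to correctly query logs like the ones above, which differ only by uuid, so that a single hit is returned? At first I thought the query above worked: it returned just 1 hit, then it returned 2, and then more. That is inconsistent to me. Thanks in advance!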

Answer 1

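The problem is with the third clause of the bool query. Let me give you a couple of queries that will work for you, and I will explain why they do the job.

First query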

curl -XGET http://localhost:9200/my_logs/_search -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "bool" : {
              "must" : [
                 { "match_phrase_prefix" : {"message" : "home.helloworld:"}},
                 { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}},
                 { "match" : {
                     "message" : {
                        "query": "received, {\"uuid\"=\"abc123\"", 
                        "operator": "and"
                     }
                   }
                 }
              ]
           }
         }
      }
   }
}'

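Explanation

Let's make sure we are on the same page about indexing. By default, the indexer passes your data through the standard analysis chain: it splits the text on whitespace, drops special characters, lowercases it, and so on. The index therefore holds only tokens together with their positions.

The match query is a full-text query: it takes your query text "received, {\"uuid\"=\"abc123\"" and runs it through analysis as well. By default this is the same analysis: splitting on whitespace, dropping special characters, lowercasing, and so on. The result looks roughly like this (simplified): received, uuid, abc123.

What the match query then does is combine these tokens against the message field using the default operator (or). As a logical expression, the last clause (the match query) looks like this:

message:received OR message:uuid OR message:abc123

Setting "operator": "and", as the first query does, requires all three tokens to be present instead.

That is why the first 4 log entries matched. I was able to reproduce it.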
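The difference between the default or operator and "operator": "and" can be sketched in Python. This is only a rough simulation under stated assumptions: analyze is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters (it is not Elasticsearch's real standard analyzer), match is a hypothetical helper, and messages is a shortened copy of three of the sample log lines.

```python
import re


def analyze(text):
    """Rough stand-in for the default analysis (assumption, not the real analyzer):
    lowercase the text and split it on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def match(message, query_tokens, operator="or"):
    """Simulate a match clause: 'or' needs any token present, 'and' needs all of them."""
    present = set(analyze(message))
    combine = all if operator == "and" else any
    return combine(tok in present for tok in query_tokens)


# Shortened copies of the sample log lines (timestamps omitted).
messages = [
    'home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}',
    'home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"}',
    'home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="def123"}',
]

query_tokens = analyze('received, {"uuid"="abc123"')

or_hits = [m for m in messages if match(m, query_tokens, "or")]
and_hits = [m for m in messages if match(m, query_tokens, "and")]

print(query_tokens)
print(len(or_hits), len(and_hits))
```

In this sketch or_hits contains all three sample messages, because each of them contains the token uuid, while and_hits contains only the line that has both received and abc123.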

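Second query

curl -XGET http://localhost:9200/my_logs/_search -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "bool" : {
              "must" : [
                 { "match_phrase_prefix" : {"message" : "home.helloworld:"}},
                 { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}},
                 { "match_phrase_prefix" : {"message" : "received, {\"uuid\"=\"abc123\""}}
              ]
           }
         }
      }
   }
}'

Explanation

This one is a bit simpler. Remember: the indexing process leaves tokens and their positions in the index.

What the Match Phrase Prefix query actually does is take the input query (let's use "received, {\"uuid\"=\"abc123\"" as an example) and run it through exactly the same query-text analysis steps. It then tries to find the tokens received, uuid, abc123 at adjacent positions in the index, in the same order: received -> uuid -> abc123 (almost).

The exception is the last token, abc123 in our case. To be precise, the query treats the last token as a prefix (a wildcard), i.e. received -> uuid -> abc123*.

For completeness: received -> uuid -> abc123 (i.e. without the trailing wildcard) is what the Match Phrase query does. It also takes positions in the index into account, i.e. it tries to match the 'phrase' rather than just separate tokens at arbitrary positions.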
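The positional matching described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: analyze is a hypothetical stand-in for the real analyzer (lowercase, split on non-alphanumeric characters), phrase_match is a hypothetical helper, and real Elasticsearch details such as the limit on prefix expansions are ignored.

```python
import re


def analyze(text):
    """Rough stand-in for the default analysis (assumption, not the real analyzer)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def phrase_match(message, query, prefix_last=False):
    """Return True if the query tokens occur at adjacent positions, in order.

    With prefix_last=True the last query token only needs to be a prefix of the
    document token at that position (the match_phrase_prefix idea); otherwise it
    must be equal (the match_phrase idea).
    """
    doc = analyze(message)
    terms = analyze(query)
    if not terms:
        return False
    n = len(terms)
    for start in range(len(doc) - n + 1):
        window = doc[start:start + n]
        if window[:-1] != terms[:-1]:
            continue  # leading tokens must match exactly, in order
        if prefix_last:
            if window[-1].startswith(terms[-1]):
                return True
        elif window[-1] == terms[-1]:
            return True
    return False


msg = 'home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}'

print(phrase_match(msg, 'received, {"uuid"="abc123"'))               # exact phrase
print(phrase_match(msg, 'received, {"uuid"="abc'))                   # incomplete last token
print(phrase_match(msg, 'received, {"uuid"="abc', prefix_last=True)) # last token as prefix
```

The incomplete last token abc matches only when it is treated as a prefix, and tokens given in the wrong order (for example uuid before received) do not match at all, which is the difference from the plain match query.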
