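Elasticsearch: counting values across documents

Time: 2018-08-01 10:05:01

Tags: elasticsearch

Because my documents are very large, I switched from MySQL to Elasticsearch, but I cannot find out how to run certain queries in Elasticsearch.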

The Logstash configuration file I use to load the data looks like this:

input
   {
   # read the csv file
   # also uses multiline
   file
      {
      start_position => "beginning"
      path => "property_all.csv"
      sincedb_path => "/dev/null"
      ignore_older => 0
      }
   }

filter
   {
   # parse the csv values define fields as integers and floats
   csv
      {
      columns => ["ID1","ID2","tr_00","tr_1G","tr_2G","tr_3G","tr_1S","tr_2S","tr_3S","cont","frame","activity_change"]
      convert => { "activity_change" => "float"}
      remove_field => ["message"]
      }
   grok
      {
      # try every field below instead of stopping after the first successful match
      break_on_match => false
      # one match hash: each source field maps to the pattern that extracts its sub_* value
      match =>
         {
         "tr_00" => "%{GREEDYDATA:sub_00}>>"
         "tr_1G" => "%{GREEDYDATA:sub_1G}>>"
         "tr_2G" => "%{GREEDYDATA:sub_2G}>>"
         "tr_3G" => "%{GREEDYDATA:sub_3G}>>"
         "tr_1S" => "%{GREEDYDATA:sub_1S}>>"
         "tr_2S" => "%{GREEDYDATA:sub_2S}>>"
         "tr_3S" => "%{GREEDYDATA:sub_3S}>>"
         }
      }
   }

output
   {
   elasticsearch
      {
      hosts => ["localhost:9200"]
      index => ["property_tr"]
      document_type => "_doc"
      manage_template => "false"
      }
   stdout { codec => "dots" }
   }

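This works fine. My problem is that I want to build a query that filters on how often values of the fields tr_00 and frame occur. That is, I only want to keep the rows whose tr_00 string occurs more than x times across the documents, and for which the number of distinct frame values seen with that tr_00 is greater than y. Is this possible? In SQL, I computed the occurrence counts after normalizing the data and hard-coded them into the table. In the end I want to see sub_00, tr_00, the count for the given tr_00, the distinct frame count, and activity_change.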
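
For illustration, one way this kind of filter might be expressed is a terms aggregation on tr_00 with a cardinality sub-aggregation on frame and a bucket_selector that drops buckets below the thresholds. The sketch below only builds the request body as a Python dict. The helper name build_tr00_query, the aggregation names, and the .keyword sub-fields are assumptions (the latter relies on default dynamic mapping, since the config sets manage_template to false). Note that cardinality returns an approximate distinct count, and that the terms aggregation only returns the top `size` buckets before the selector runs.

```python
import json


def build_tr00_query(x, y, size=1000):
    """Build a search body that keeps tr_00 values occurring more than x times
    and having more than y distinct frame values.

    Hypothetical helper: field names with ".keyword" assume default dynamic
    mapping of string fields; adjust them to the real index mapping.
    """
    return {
        "size": 0,  # only aggregation results are needed, not individual hits
        "aggs": {
            "by_tr_00": {
                # one bucket per distinct tr_00 value (top `size` buckets only)
                "terms": {"field": "tr_00.keyword", "size": size},
                "aggs": {
                    # approximate number of distinct frame values per tr_00
                    "distinct_frames": {
                        "cardinality": {"field": "frame.keyword"}
                    },
                    # one sample document per bucket to read sub_00 and activity_change
                    "sample": {
                        "top_hits": {
                            "size": 1,
                            "_source": ["sub_00", "activity_change"],
                        }
                    },
                    # drop buckets that do not pass both thresholds
                    "keep_frequent": {
                        "bucket_selector": {
                            "buckets_path": {
                                "docs": "_count",
                                "frames": "distinct_frames",
                            },
                            "script": {
                                "source": "params.docs > params.x && params.frames > params.y",
                                "params": {"x": x, "y": y},
                            },
                        }
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be sent to the property_tr index's _search endpoint.
    print(json.dumps(build_tr00_query(x=10, y=3), indent=2))
```

The printed JSON could be posted to the _search endpoint of the property_tr index. Each surviving bucket would carry the tr_00 value as its key, the document count as doc_count, and the distinct frame count under distinct_frames. If tr_00 has very many distinct values, a composite aggregation that pages through all buckets may be a better fit than a large `size`.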

0 answers:

No answers