How can I keep Logstash from parsing certain fields?

Time: 2017-02-08 12:48:50

Tags: json ruby string elasticsearch logstash

I have a log file that looks like this (simplified):

 { "startDate": "2015-05-27", "endDate": "2015-05-27", 
    "request" : {"requestId":"123","field2":1,"field2": 2,"field3":3, ....} }

Logstash tries to parse every field, including the "request" field. Is it possible to leave this field unparsed? I want to see the "request" field in Elasticsearch, but it should not be parsed.

Here is part of my config file:

input {
    file {
        type => "json"
        path => [
                "/var/log/service/restapi.log"
        ]
        tags => ["restapi"]
    }
}

filter {
    ruby {
        init => "require 'socket'"
        code => "
           event['host'] = Socket.gethostname.gsub(/\..*/, '')
           event['request'] = (event['request'].to_s);
        "
    }

    if "restapi" in [tags] {
        json {
            source => "message"
        }
        date {
                match => [ "date_start", "yyyy-MM-dd HH:mm:ss" ]
                target => "date_start"
         }
        date {
                match => [ "date_end", "yyyy-MM-dd HH:mm:ss" ]
                target => "date_end"
        }
        date {
                match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
                target => "date"
        }
    }
}
output {
    if "restapi" in [tags] {
        elasticsearch {
            hosts => ["......."]
            template_name => "logs"
            template => "/etc/logstash/templates/service.json"
            template_overwrite => true
            index => "service-logs-%{+YYYY.MM.dd}"
            idle_flush_time => 20
            flush_size => 500
        }
    }
}

This is my template file:

{
  "template" : "service-*",
  "settings" : {
    "index": {
            "refresh_interval": "60s",
            "number_of_shards": 6,
            "number_of_replicas": 2
        }
  },
  "mappings" : {
    "logs" : {
        "properties" : {
        "@timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
        "@version" : { "type" : "integer", "index" : "not_analyzed" },
        "message": { "type" : "string", "norms" : { "enabled" : false } },
        "method" : { "type" : "string", "index" : "not_analyzed" },
        "traffic_source" : { "type" : "string", "index" : "not_analyzed" },
        "request_path" : { "type" : "string", "index" : "not_analyzed" },
        "status" : { "type" : "integer", "index" : "not_analyzed" },
        "host_name" : { "type" : "string", "index" : "not_analyzed" },
        "environment" : { "type" : "string", "index" : "not_analyzed" },
        "action" : { "type" : "string", "index" : "not_analyzed" },
        "request_id" : { "type" : "string", "index" : "not_analyzed" },
        "date" : { "type" : "date", "format" : "dateOptionalTime" },
        "date_start" : { "type" : "date", "format" : "dateOptionalTime" },
        "date_end" : { "type" : "date", "format" : "dateOptionalTime" },
        "adnest_type" : { "type" : "string", "index" : "not_analyzed" },
        "request" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}

2 answers:

Answer 0 (score: 1)

You should be able to do this with a ruby filter:

filter {
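    ruby {
        init => "require 'socket'"
        # Use event.get/event.set: the event['field'] syntax was removed in Logstash 5.0.
        # "request" is left alone here because the block below serializes it.
        code => "
           event.set('host', Socket.gethostname.gsub(/\..*/, ''))
        "
    }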

    if "restapi" in [tags] {
        ruby {
                code => '
                    require "json"
                    event.set("request",event.get("request").to_json)'
        }
        date {
                match => [ "date_start", "yyyy-MM-dd HH:mm:ss" ]
                target => "date_start"
         }
        date {
                match => [ "date_end", "yyyy-MM-dd HH:mm:ss" ]
                target => "date_end"
        }
        date {
                match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
                target => "date"
        }
    }
}

Tested with stdin and stdout stubbed in:

input {
    stdin { codec => json }
}
# above filter {} block goes here
output {
    stdout { codec => rubydebug }
}

Test it like this:

echo '{ "startDate": "2015-05-27", "endDate": "2015-05-27", "request" : {"requestId":"123","field2":1,"field2": 2,"field3":3} }' | bin/logstash -f test.conf

Output:

{
     "startDate" => "2015-05-27",
       "endDate" => "2015-05-27",
       "request" => "{\"requestId\"=>\"123\", \"field2\"=>2, \"field3\"=>3}",
      "@version" => "1",
    "@timestamp" => "2017-02-09T14:37:02.789Z",
          "host" => "xxxx"
}

That answers your original question. If you can't work out why your template isn't working, you should ask a separate question.
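To see what the ruby filter's conversion produces, the two string conversions can be compared in plain Ruby outside Logstash. This is a minimal sketch: the `request` hash is a hand-written stand-in for the parsed field, not data taken from a real event.

```ruby
require 'json'

# Hypothetical stand-in for the parsed "request" field of the sample event.
request = { 'requestId' => '123', 'field2' => 2, 'field3' => 3 }

# Hash#to_s yields Ruby's inspect notation (with =>), which is not valid JSON.
as_inspect = request.to_s

# Hash#to_json yields a JSON document that can be parsed back later.
as_json = request.to_json

puts as_inspect
puts as_json

# Only the JSON form round-trips to the original hash.
puts JSON.parse(as_json) == request
```

`to_s` gives the `=>` notation seen in the output above, while `to_json` gives a string that any JSON parser can read back, which is usually the more useful form to store in Elasticsearch.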

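Answer 1 (score: 0)

Elasticsearch analyzes the field by default. If all you need is for the request field not to be analyzed, change how it is indexed by setting "index": "not_analyzed" in the field mapping.

More information is available in the documentation.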
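A quick way to confirm that a template really maps the field this way is to load it and inspect the mapping. The sketch below is only illustrative: the inline JSON is a trimmed, hypothetical excerpt of a template, and `field_not_analyzed?` is a helper invented for this example. Note that `string` with `not_analyzed` is the Elasticsearch 2.x form; from 5.0 the equivalent type is `keyword`.

```ruby
require 'json'

# Hypothetical excerpt of a template; a real check would read the file
# named by the `template` option instead of this inline string.
template_json = <<~TEMPLATE
  {
    "mappings": {
      "logs": {
        "properties": {
          "request": { "type": "string", "index": "not_analyzed" }
        }
      }
    }
  }
TEMPLATE

# Returns true when the named field is mapped with "index": "not_analyzed".
# Illustrative helper only, not part of any library.
def field_not_analyzed?(template, mapping_type, field)
  mapping = template.dig('mappings', mapping_type, 'properties', field)
  !mapping.nil? && mapping['index'] == 'not_analyzed'
end

template = JSON.parse(template_json)
puts field_not_analyzed?(template, 'logs', 'request')
```

A misspelled value such as `not-analyzed` would make the check return false, so it catches the hyphen/underscore mix-up before the template is uploaded.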