Logstash split filter

Posted: 2018-08-03 14:36:11

Tags: elasticsearch logstash

Recently I found that I can ingest data into Logstash directly by providing a URL. The polling input works well, but it downloads the complete document and loads it into ES as a single event.

I would like to create a new Elasticsearch record for each line. By default the whole file is loaded into the message field, which slows down Kibana, e.g. in the Discover tab.
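The per-line behavior I am after can be sketched in plain Python (a hypothetical emulation of what a line-splitting filter should do to an event, not Logstash code; the `split_event` helper and the sample document are made up for illustration):

```python
# Hypothetical sketch: turn one downloaded document into one event per
# line, the way a line-splitting filter would.
def split_event(event, field="message", terminator="\n"):
    """Clone the event once per non-empty line of event[field]."""
    clones = []
    for line in event[field].split(terminator):
        if not line:
            continue  # skip the empty string after a trailing newline
        clone = dict(event)   # shallow copy: other fields are shared
        clone[field] = line   # each clone carries a single line
        clones.append(clone)
    return clones

doc = {"message": "1.2.3.4\n5.6.7.8\n", "tags": ["c2_info"]}
events = split_event(doc)
# each resulting event has one line in "message" and keeps the other fields
```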

Kibana output:

{
  "_index": "blacklists",
  "_type": "default",
  "_id": "pf3k_2QB9sEBYW4CK4AA",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-08-03T13:05:00.569Z",
    "tags": [
      "_jsonparsefailure",
      "c2_info",
      "ipaddress"
    ],
    "@version": "1",
    "message": "#############################################################\n## Master Feed of known, active and non-sinkholed C&Cs IP \n## addresses\n## \n## HIGH-CONFIDENCE FAMILIES ONLY\n## \n## Feed generated at: 2018-08-03 12:13 \n##\n## Feed Provided By: John Bambenek of Bambenek Consulting\n## jcb@bambenekconsulting.com // http://bambenekconsulting.com\n## Use of this feed is governed by the license here: \n## http://osint.bambenekconsulting.com/license.txt",
    "client": "204.11.56.48",
    "http_poller_metadata": {
      "name": "bembenek_c2",
      "host": "node1",
      "request": {
        "method": "get",
        "url": "http://osint.bambenekconsulting.com/feeds/c2-ipmasterlist-high.txt"
      },
      "response_message": "OK",
      "runtime_seconds": 0.27404,
      "response_headers": {
        "content-type": "text/plain",
        "accept-ranges": "bytes",
        "cf-ray": "4448fe69e02197ce-FRA",
        "date": "Fri, 03 Aug 2018 13:05:05 GMT",
        "connection": "keep-alive",
        "last-modified": "Fri, 03 Aug 2018 12:13:44 GMT",
        "server": "cloudflare",
        "vary": "Accept-Encoding",
        "etag": "\"4bac-57286dbe759e4-gzip\""
      },
      "code": 200,
      "times_retried": 0
    }
  },
  "fields": {
    "@timestamp": [
      "2018-08-03T13:05:00.569Z"
    ]
  },
  "sort": [
    1533301500569
  ]
}

Logstash configuration:

input {
  http_poller {
    urls => {
      bembenek_c2 => "http://osint.bambenekconsulting.com/feeds/c2-ipmasterlist-high.txt"
      bembenek_c2dom => "http://osint.bambenekconsulting.com/feeds/c2-dommasterlist-high.txt"
      blocklists_all => "http://lists.blocklist.de/lists/all.txt"
    }
    request_timeout => 30
    codec => "json"
    tags => ["c2_info"]
    schedule => { cron => "*/10 * * * *"}
    metadata_target => "http_poller_metadata"
  }
}

filter {
  grok {
    match => { "message" => ["%{IPV4:ipaddress}"] }
    add_tag => [ "ipaddress" ]
  }
}

output {
  stdout { codec => dots }
  elasticsearch {
    hosts => ["10.0.50.51:9200"]
    index => "blacklists"
    document_type => "default"
    template_overwrite => true
  }
  file {
    path  => "/tmp/blacklists.json"
    codec => json {}
  }
}

Does anyone know how to split the loaded file on "\n"?

I have tried:

filter {
  split {
    terminator => "\n"
  }
}

Documentation and examples of how to use this filter are hard to come by.

1 answer:

Answer 0 (score: 2):

The missing filter is:

filter {
  split {
    field => "[message]"
  }
}

We do not have to specify the terminator, because according to the Logstash 6.3 documentation it defaults to "\n".
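Putting it together, the split filter should run before grok, so the IPv4 pattern is matched against each line instead of once against the whole document (a sketch based on the configuration in the question; field names unchanged):

```
filter {
  split {
    field => "[message]"
  }
  grok {
    match => { "message" => ["%{IPV4:ipaddress}"] }
    add_tag => [ "ipaddress" ]
  }
}
```

After splitting, each Elasticsearch document carries a single feed line in message, which also keeps Kibana's Discover tab responsive.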