Sending consecutive invalid JSON lines between valid JSON lines as a single Filebeat message

Time: 2017-01-04 12:19:42

Tags: logstash elastic-stack filebeat

I have a file that contains line-delimited JSON objects mixed with non-JSON data (stderr stack traces).

{"timestamp": "20170104T17:10:39", "retry": 0, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:40", "retry": 1, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}
Traceback (most recent call last):
  File "a.py", line 12, in <module>
    foo()
  File "a.py", line 10, in foo
    bar()
  File "a.py", line 4, in bar
    raise Exception("This was unexpected")
Exception: This was unexpected
{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:43", "retry": 4, "level": "info", "event": "failed to download"}

With the following configuration I get the valid JSON lines correctly, but the invalid JSON is sent one event at a time (line by line).

filebeat.yml

filebeat.prospectors:
  - input_type: log
    document_type: mytype
    json:
      message_key: event
      add_error_key: true
    paths:
        - /tmp/*.log

output:
  console:
    pretty: true

  file:
    path: "/tmp/filebeat"
    filename: filebeat

Output:

{
  "@timestamp": "2017-01-04T12:03:36.659Z",
  "beat": {
    "hostname": "...", "name": "...", "version": "5.1.1"
  },
  "input_type": "log",
  "json": {
    "event": "failed to download",
    "level": "info",
    "retry": 2,
    "timestamp": "20170104T17:10:41"
  },
  "offset": 285,
  "source": "/tmp/test.log",
  "type": "mytype"
}
{
  "@timestamp": "2017-01-04T12:03:36.659Z",
  "beat": {
    "hostname": "...", "name": "...", "version": "5.1.1"
  },
  "input_type": "log",
  "json": {
    "event": "Traceback (most recent call last):",
    "json_error": "Error decoding JSON: invalid character 'T' looking for beginning of value"
  },
  "offset": 320,
  "source": "/tmp/test.log",
  "type": "mytype"
}
  

I would like all consecutive non-JSON lines to be joined into a single new message.

Using multiline, I tried the following:

filebeat.prospectors:
  - input_type: log
    document_type: mytype
    json:
      message_key: event
      add_error_key: true
    paths:
        - /tmp/*.log
    multiline:
      pattern: '^{'
      negate: true
      match: after

output:
  console:
    pretty: true

  file:
    path: "/tmp/filebeat"
    filename: filebeat

But it does not seem to work. The multiline rules are applied to the value of the key named in json.message_key, which here is event.

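From the docs I understand why this happens. For json.message_key they say:

JSON key on which to apply the line filtering and multiline settings. This key must be top level and its value must be string, otherwise it is ignored. If no text key is defined, the line filtering and multiline features cannot be used.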

Is there any other way to join the consecutive non-JSON lines into a single message?

I want to capture the whole stack trace before it is sent to Logstash.

1 answer:

Answer 0: (score: 2)

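Filebeat applies multiline grouping after JSON parsing, so the multiline pattern cannot be based on the characters that make up a JSON object (such as {).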
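
A simplified model of that ordering, written in Python rather than taken from Filebeat's code, illustrates the problem. It assumes, based on the output shown in the question, that after the JSON step the text handed to multiline is the value of the message_key (event), or the raw line when decoding fails. The helper name text_seen_by_multiline is hypothetical.

```python
import json
import re

# A few lines from the log in the question.
lines = [
    '{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}',
    'Traceback (most recent call last):',
    '  File "a.py", line 12, in <module>',
    '{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}',
]


def text_seen_by_multiline(line, message_key="event"):
    """Hypothetical stand-in for the JSON-first step: the text multiline matches on."""
    try:
        decoded = json.loads(line)
    except ValueError:
        return line  # not JSON: assume the raw line is kept as the text
    value = decoded.get(message_key) if isinstance(decoded, dict) else None
    return value if isinstance(value, str) else line


pattern = re.compile(r'^{')  # the pattern from the question
texts = [text_seen_by_multiline(line) for line in lines]
starts_new_event = [bool(pattern.search(text)) for text in texts]

print(texts)
print(starts_new_event)
```

Under these assumptions no text begins with {, so the pattern never marks the start of a new event and cannot separate the JSON lines from the traceback.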

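There is another way to do JSON parsing in Filebeat so that it happens after multiline grouping, which lets your pattern include JSON object characters. You need Filebeat 5.2 (soon to be released), because a target field was added to the decode_json_fields processor so that you can specify where in the event the decoded JSON fields are added.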

filebeat.prospectors:
- paths: [input.txt]
  multiline:
    pattern: '^({|Traceback)'
    negate:  true
    match:   after

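processors:
# Decode the message as JSON only when it starts with '{' (skips the grouped traceback).
- decode_json_fields:
    when.regexp:
      message: '^{'
    fields: ["message"]
    # An empty target merges the decoded keys into the root of the event.
    target: ""
# Drop the raw message string from events that start with '{'.
- drop_fields:
    when.regexp:
      message: '^{'
    fields: ["message"]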

I tested the multiline pattern in the Go playground.
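
A comparable check can be sketched in Python. For this particular pattern Python's re module gives the same result as Go's regexp package, although the two engines are not identical in general. The sample lines come from the question; the variable names are only illustrative.

```python
import re

# Pattern copied from the multiline configuration above.
pattern = re.compile(r'^({|Traceback)')

samples = [
    '{"timestamp": "20170104T17:10:39", "retry": 0, "level": "info", "event": "failed to download"}',
    'Traceback (most recent call last):',
    '  File "a.py", line 12, in <module>',
    '    foo()',
    'Exception: This was unexpected',
]

# True means the line starts a new event; with negate: true and match: after,
# a line that does not match is appended to the event before it.
matches = [bool(pattern.search(line)) for line in samples]
for matched, line in zip(matches, samples):
    print(matched, line[:40])
```

Only the JSON line and the Traceback line match, so the remaining stack trace lines are folded into the Traceback event.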

Filebeat produces the following output (using the log lines you gave above as input). I used a build from the master branch.

{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":95,"retry":0,"source":"input.txt","timestamp":"20170104T17:10:39","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":190,"retry":1,"source":"input.txt","timestamp":"20170104T17:10:40","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":285,"retry":2,"source":"input.txt","timestamp":"20170104T17:10:41","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"input_type":"log","message":"Traceback (most recent call last):\n  File \"a.py\", line 12, in \u003cmodule\u003e\n    foo()\n  File \"a.py\", line 10, in foo\n    bar()\n  File \"a.py\", line 4, in bar\n    raise Exception(\"This was unexpected\")\nException: This was unexpected","offset":511,"source":"input.txt","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":606,"retry":3,"source":"input.txt","timestamp":"20170104T17:10:42","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":702,"retry":4,"source":"input.txt","timestamp":"20170104T17:10:43","type":"log"}
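
The grouping-then-decoding order can also be modelled outside Filebeat. The following is a minimal Python sketch, not Filebeat's implementation: group_multiline and decode_event are hypothetical helpers that approximate negate: true / match: after and the decode_json_fields plus drop_fields pair, and Filebeat metadata such as @timestamp and offset is left out.

```python
import json
import re

LOG = '''
{"timestamp": "20170104T17:10:39", "retry": 0, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:40", "retry": 1, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}
Traceback (most recent call last):
  File "a.py", line 12, in <module>
    foo()
  File "a.py", line 10, in foo
    bar()
  File "a.py", line 4, in bar
    raise Exception("This was unexpected")
Exception: This was unexpected
{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:43", "retry": 4, "level": "info", "event": "failed to download"}
'''


def group_multiline(lines, pattern):
    """Approximate negate: true / match: after: non-matching lines join the previous event."""
    events = []
    for line in lines:
        if pattern.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def decode_event(message):
    """Approximate the two processors: decode messages starting with '{', keep the rest."""
    if message.startswith("{"):
        try:
            return json.loads(message)  # decoded keys form the event, message is dropped
        except ValueError:
            pass  # assumption: undecodable text is simply kept as the message
    return {"message": message}


pattern = re.compile(r'^({|Traceback)')
events = [decode_event(m) for m in group_multiline(LOG.strip().splitlines(), pattern)]
for event in events:
    print(event)
```

Run against the sample log, this model yields six events: five decoded JSON objects and one event whose message holds the whole stack trace, which matches the shape of the Filebeat output above.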