Logstash - 解析和改变JSON文件

时间:2016-10-20 14:04:28

标签: json elasticsearch logstash

拥有以下JSON文件:

{
  "count": 2,
  "status": {"partial": true},
  "records": [
    {
      "info": {
        "startTime": "2016-07-17 08:42:40.212+0000",
        "endTime": "2016-07-17 08:43:47.715+0000",
        "id": "123456789"
      },
      "conversation": {
        "lines": [
          {
            "time": "2016-07-17 08:42:32.533+0000",
            "text": "Hi There",
            "user": "user A"
          },
          {
            "time": "2016-07-17 08:42:36.533+0000",
            "text": "Hello",
            "user": "user B"
          }
        ]
      }
    },
    {
      "info": {
        "startTime": "2016-07-18 08:42:40.212+0000",
        "endTime": "2016-07-18 08:43:47.715+0000",
        "id": "4567890"
      },
      "conversation": {
        "lines": [
          {
            "time": "2016-07-17 08:42:32.533+0000",
            "text": "Hi There",
            "user": "user X"
          },
          {
            "time": "2016-07-17 08:42:36.533+0000",
            "text": "Hello",
            "user": "user Y"
          }
        ]
      }
    }
  ]
}

编辑(原始格式):

{"count": 20,"status": {"partial": true},"records": [{"info": {"startTime": "2016-07-17 08:42:40.212+0000","endTime": "2016-07-17 08:43:47.715+0000","id": "123456789"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user A"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user B"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}},
{"info": {"startTime": "2016-07-18 08:42:40.212+0000","endTime": "2016-07-18 08:43:47.715+0000","id": "4567890"},"conversation": {"lines": [{"time": "2016-07-17 08:42:32.533+0000","text": "Hi There","user": "user X"},{"time": "2016-07-17 08:42:36.533+0000","text": "Hello","user": "user Y"}]}}
]}

我想使用logstash为每个记录导入conversation.lines(忽略其他信息,如info),并且可能运行一些逻辑,例如删除一些行,具体取决于时间财产。

是否可以单独使用Logstash,或者我应该预处理文件吗?

1 个答案:

答案 0 :(得分:2)

我认为最简单的方法是使用node.js。

  1. import re您的JSON文件
  2. 循环遍历require
  3. 循环遍历records
  4. 应用您的逻辑
  5. 使用JS ES客户端
  6. 将每一行发送到ES

    Logstash非常擅长逐行解析文本文件,但是如果要解析多行JSON文件,那么这不是我的首选。