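Parsing/splitting a nested single JSON array in Logstash

Date: 2017-01-18 17:21:05

Tags: arrays json nested logstash

I'm looking for a split/filter for the following JSON array. We need each value in the array to end up as an individual value in Elasticsearch.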

  

{"Mot_Temp_Test":{"INT":["0","0","0","0","0","0","0","0","0","0","0","0"]}}

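1 answer:

Answer 0 (score: 0)

(These are the results of tests I ran with Logstash 2.4; the output comes from the rubydebug codec.)

By using codec => "json" in the input, Logstash will actually treat your array as an array. I've changed your values so that they can be told apart.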

{
    "Mot_Temp_Test" => {
        "INT" => [
            [ 0] "0",
            [ 1] "1",
            [ 2] "2",
            [ 3] "3",
            [ 4] "4",
            [ 5] "5",
            [ 6] "6",
            [ 7] "7",
            [ 8] "8",
            [ 9] "9",
            [10] "10",
            [11] "11"
        ]
    },
         "@version" => "1",
       "@timestamp" => "2017-01-20T16:55:42.606Z",
             "host" => "b5963373fadd"
}

Logstash isn't great at handling arrays, but it can access their elements. So we can use the mutate filter to rename an array element to a field.

filter {
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int0" } }
}

Which gives us:

{
    "Mot_Temp_Test" => {
        "INT" => [
            [ 0] "0",
            [ 1] "0",
            [ 2] "0",
            [ 3] "0",
            [ 4] "0",
            [ 5] "0",
            [ 6] "0",
            [ 7] "0",
            [ 8] "0",
            [ 9] "0",
            [10] "0"
        ]
    },
         "@version" => "1",
       "@timestamp" => "2017-01-20T17:08:00.728Z",
             "host" => "5780e869e09f",
             "int0" => "0"
}

OK, so this should be simple...

filter {
    mutate { 
        rename => { "[Mot_Temp_Test][INT][0]" => "int0" } 
        rename => { "[Mot_Temp_Test][INT][1]" => "int1" } 
        rename => { "[Mot_Temp_Test][INT][2]" => "int2" } 
        rename => { "[Mot_Temp_Test][INT][3]" => "int3" } 
        rename => { "[Mot_Temp_Test][INT][4]" => "int4" } 
        rename => { "[Mot_Temp_Test][INT][5]" => "int5" } 
        rename => { "[Mot_Temp_Test][INT][6]" => "int6" } 
    }
}

But wait: these operations are processed one at a time, and after an element is removed the array closes the gap, so we get:

{
    "Mot_Temp_Test" => {
        "INT" => [
            [0] "1",
            [1] "3",
            [2] "5",
            [3] "7",
            [4] "9",
            [5] "11"
        ]
    },
         "@version" => "1",
       "@timestamp" => "2017-01-20T18:48:31.875Z",
             "host" => "a802749c44fe",
             "int0" => "0",
             "int1" => "2",
             "int2" => "4",
             "int3" => "6",
             "int4" => "8",
             "int5" => "10"
}
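
The shifting can be reproduced outside Logstash. The following is a plain-Ruby model of the behavior seen above, not Logstash's own code: `rename_fixed_indices` is a hypothetical helper that removes element i from the array and stores it as `int<i>`, one step at a time.

```ruby
# Hypothetical helper (not part of Logstash): models renaming
# [INT][0], [INT][1], ... one after another, where each rename
# removes the element and the rest of the array shifts left.
def rename_fixed_indices(array, count)
  fields = {}
  count.times do |i|
    next if i >= array.length               # nothing left at that position
    fields["int#{i}"] = array.delete_at(i)  # removal shifts later elements
  end
  fields
end

ints = (0..11).map(&:to_s)            # same test values as in the answer
fields = rename_fixed_indices(ints, 7)

p fields  # every second value was skipped
p ints    # the skipped values are still in the array
```

Under this model the seventh rename finds nothing, because only six elements remain by then, which would explain why no int6 appears in the output above.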

Trying to account for that:

filter {
    mutate { 
        rename => { "[Mot_Temp_Test][INT][0]" => "int0" } 
        rename => { "[Mot_Temp_Test][INT][0]" => "int1" } 
        rename => { "[Mot_Temp_Test][INT][0]" => "int2" } 
        rename => { "[Mot_Temp_Test][INT][0]" => "int3" } 
        rename => { "[Mot_Temp_Test][INT][0]" => "int4" } 
        rename => { "[Mot_Temp_Test][INT][0]" => "int5" } 
        rename => { "[Mot_Temp_Test][INT][0]" => "int6" } 
    }
}

That doesn't quite work:

{
                                                           "Mot_Temp_Test" => {
        "INT" => [
            [ 0] "1",
            [ 1] "2",
            [ 2] "3",
            [ 3] "4",
            [ 4] "5",
            [ 5] "6",
            [ 6] "7",
            [ 7] "8",
            [ 8] "9",
            [ 9] "10",
            [10] "11"
        ]
    },
                                                                "@version" => "1",
                                                              "@timestamp" => "2017-01-20T18:56:32.608Z",
                                                                    "host" => "d5b81003f43b",
    "\"int0\", \"int1\", \"int2\", \"int3\", \"int4\", \"int5\", \"int6\"" => "0"
}

To make this work, we need to use a bunch of separate mutate filters:

filter {
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int0" } }
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int1" } }
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int2" } }
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int3" } }
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int4" } }
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int5" } }
    mutate { rename => { "[Mot_Temp_Test][INT][0]" => "int6" } }
}

Success:

{
    "Mot_Temp_Test" => {
        "INT" => [
            [0] "7",
            [1] "8",
            [2] "9",
            [3] "10",
            [4] "11"
        ]
    },
         "@version" => "1",
       "@timestamp" => "2017-01-20T18:21:06.488Z",
             "host" => "882832d1dd43",
             "int0" => "0",
             "int1" => "1",
             "int2" => "2",
             "int3" => "3",
             "int4" => "4",
             "int5" => "5",
             "int6" => "6"
}
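
Why always renaming index 0 works can be seen in the same kind of plain-Ruby model (again an illustration, not Logstash internals): each step takes whatever is currently at the front of the array. `rename_from_front` is a hypothetical helper name.

```ruby
# Hypothetical helper (not part of Logstash): models a chain of separate
# mutate filters that each rename [INT][0], i.e. take the current first element.
def rename_from_front(array, count)
  fields = {}
  count.times do |i|
    break if array.empty?             # stop once the array is used up
    fields["int#{i}"] = array.shift   # remove and return the first element
  end
  fields
end

ints = (0..11).map(&:to_s)            # same test values as in the answer
fields = rename_from_front(ints, 7)

p fields  # int0..int6 hold "0".."6" in order
p ints    # "7".."11" remain in the array
```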

In summary, arrays are something Logstash is not good at.