How do I filter 'reduce inputs' over a large stream of objects?

Asked: 2016-08-25 02:50:10

Tags: filtering jq

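I am using the following to accumulate a map of unique keys whose values are an aggregated count and a duration total. At the moment it runs over every input via 'reduce inputs'.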

reduce inputs as $r
({};
("Pipeline:" + $r.m."topic.type") as $topic
| ("Channel:" + $r.channel) as $channel
| ("Campaign:" + $r.campaign) as $campaign
| ("Cellcode:" + $r.cellcode) as $cellcode
| ("Tracking:" + $r.tracking) as $tracking
| ("Template:" + $r.m."template.id") as $template
| ("Event:" + $r.name) as $event
| ("Reason:" + $r.reason) as $reason
| ($r.duration|tonumber) as $duration
| (($topic + ":" + $channel + ":" + $campaign + ":" + $cellcode + ":" + $tracking + ":" + $template + ":" + $event + ":" + $reason) as $key
  | .[$key][0] += 1 | .[$key][1] += $duration)
)

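I can't figure out where to put a select() filter so that the reduction only runs over entries that pass a 'select($r.type == "AUDIT_CHANNEL")' check, which would skip the 2 "type":"AUDIT_SYSTEM" events in this test data: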

{"type":"AUDIT_CHANNEL","name":"DROPPED","reason":"INVALID_MAIL_META_DATA","start":"1472083067058","duration":"91","end":"1472083067149","dc":"dev","pool":"raptor-app","host.name":"L-SEA-10002721","host.ip":"10.236.67.80","rlogid":"tfsqiu.dvw9%3FJ*P%40G*25671246-156befd00b2-0x293","channel":"EMAIL","m":{"audited":"1472083067058","created":"1472083066974","enabled":"true","entity.common.version":"1","template.id":"2840df6d-d9e8-4f27-e8b5-918c122d4561","template.version":"17","topic.curname":"eddude-default-topic","topic.curtype":"DEFAULT","topic.dc":"LVS","topic.name":"eddude-default-topic","topic.part":"5","topic.type":"DEFAULT"},"id":"0AEC4350-1C6E2FC9B80-0156BEF9ED92-0000000000000003","campaign":"999","contract":"a5872a5c-8912-dd63-583f-61fa8db3efde","user":1276847275,"cellcode":"","age":"175"}

{"type":"AUDIT_SYSTEM","name":"ROTATED","start":"1472083081033","duration":"0","end":"1472083081033","dc":"dev","pool":"raptor-app","host.name":"L-SEA-10002721","host.ip":"10.236.67.80","rlogid":"tfsqiu.dvw9%3FJ*P%40G*25671246-156befd3749-0xce"}

{"type":"AUDIT_SYSTEM","name":"ROTATED","start":"1472083141034","duration":"0","end":"1472083141034","dc":"dev","pool":"raptor-app","host.name":"L-SEA-10002721","host.ip":"10.236.67.80","rlogid":"tfsqiu.dvw9%3FJ*P%40G*25671246-156befe21aa-0xce"}

{"type":"AUDIT_CHANNEL","name":"RECEIVED","start":"1472083158860","duration":"109","end":"1472083158969","dc":"dev","pool":"raptor-app","host.name":"L-SEA-10002721","host.ip":"10.236.67.80","rlogid":"tfsqiu.dvw9%3FJ*P%40G*25671246-156befe674c-0x10f","channel":"EMAIL","m":{"audited":"1472083158860","created":"1472083158860","enabled":"true","entity.common.version":"1","template.id":"2840df6d-d9e8-4f27-e8b5-918c122d4561","template.version":"17","topic.curname":"eddude-default-topic","topic.curtype":"DEFAULT","topic.dc":"LVS","topic.name":"eddude-default-topic","topic.part":"5","topic.type":"DEFAULT"},"id":"0AEC4350-1C6E2FC9B80-0156BEF9ED92-0000000000000004","campaign":"999","contract":"a5872a5c-8912-dd63-583f-61fa8db3efde","user":1276847275,"cellcode":"","age":"109"}

I have tried putting it before the reduce, inside the reduce, and so on, but I don't get the desired output:

{
  "Pipeline:DEFAULT:Channel:EMAIL:Campaign:999:Cellcode::Tracking::Template:2840df6d-d9e8-4f27-e8b5-918c122d4561:Event:DROPPED:Reason:INVALID_MAIL_META_DATA": [
    1,
    91
  ],
  "Pipeline:DEFAULT:Channel:EMAIL:Campaign:999:Cellcode::Tracking::Template:2840df6d-d9e8-4f27-e8b5-918c122d4561:Event:RECEIVED:Reason:": [
    1,
    109
  ]
}

Do I have to do the filtering entirely outside the reduce run, or do I just not know how to do this with a single filter-and-reduce?

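By the way, assume this input is a huge stream of millions of records, with only a few hundred unique keys for the counts to accumulate into.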

1 Answer:

Answer 0 (score: 1)

inputs produces one result for each input. You want to filter those inputs by type, so that is where the filter goes:

reduce (inputs | select(.type == "AUDIT_CHANNEL")) as $r ...
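To see the placement in isolation, here is a minimal sketch that only counts records and sums durations. The three records are abbreviated, hypothetical stand-ins for the test data, and the [count, total] accumulator is a simplification of the keyed map. It assumes jq is run with -n, so that jq does not consume the first record as its own input before inputs gets to read it.

```shell
# Three abbreviated, hypothetical records, one JSON object per line.
# -n: do not read the first record as `.`, so `inputs` sees every record.
# -c: print the result compactly on one line.
result=$(printf '%s\n' \
  '{"type":"AUDIT_CHANNEL","duration":"91"}' \
  '{"type":"AUDIT_SYSTEM","duration":"0"}' \
  '{"type":"AUDIT_CHANNEL","duration":"109"}' |
  jq -nc '
    reduce (inputs | select(.type == "AUDIT_CHANNEL")) as $r
      ([0, 0]; [ .[0] + 1, .[1] + ($r.duration | tonumber) ])
  ')

echo "$result"   # prints [2,200]
```

The AUDIT_SYSTEM record never reaches the reduce body, because select drops it from the stream that feeds the reduction.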

I would write your filter like this:

reduce (inputs | select(.type == "AUDIT_CHANNEL")) as $r ({};
    ([
        "Pipeline", $r.m."topic.type",
        "Channel",  $r.channel,
        "Campaign", $r.campaign,
        "Cellcode", $r.cellcode,
        "Tracking", $r.tracking,
        "Template", $r.m."template.id",
        "Event",    $r.name,
        "Reason",   $r.reason
    ] | join(":")) as $key
    | .[$key] |= [ .[0]+1, .[1]+($r.duration|tonumber) ]
)
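As for whether the filtering has to happen outside the reduce: it can also live inside the reduce body, by returning the accumulator unchanged for records that should be skipped. The following is only a sketch of that alternative, not the approach recommended above. The records and the shortened "Event:" key are hypothetical simplifications, and jq is again assumed to be run with -n.

```shell
# Same idea, but the type check sits inside the reduce body.
result=$(printf '%s\n' \
  '{"type":"AUDIT_CHANNEL","name":"DROPPED","duration":"91"}' \
  '{"type":"AUDIT_SYSTEM","name":"ROTATED","duration":"0"}' \
  '{"type":"AUDIT_CHANNEL","name":"DROPPED","duration":"9"}' |
  jq -nc '
    reduce inputs as $r ({};
      if $r.type == "AUDIT_CHANNEL" then
        ("Event:" + $r.name) as $key
        | .[$key] |= [ .[0] + 1, .[1] + ($r.duration | tonumber) ]
      else
        .   # any other type: leave the accumulator unchanged
      end)
  ')

echo "$result"   # prints {"Event:DROPPED":[2,100]}
```

Both forms read the stream one record at a time and keep only the accumulator, so memory should grow with the number of unique keys rather than the number of records. Filtering in the source expression keeps the reduce body free of the extra branch.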