Padding JSON arrays with jq to get a rectangular result

Time: 2019-03-01 11:21:35

Tags: json csv multidimensional-array jq

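I have JSON that looks like this (a jqplay snippet at the link), and I want to end up with a CSV like the one below (a reproducible example is at the bottom).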

"SO302993",items1,item2,item3.1,item3.2,item3.3, item3.4,...
"SO302994",items1,item2,item3.1,item3.2,       ,        ,...
"SO302995",items1,item2,item3.1,item3.2,item3.3,        ,...

The item3 elements are in an array, and my current solution is:

.[] | [.number, .item1, .item2, .item3[]?]

which gives me this:

"SO302993",items1,item2,item3.1,item3.2,item3.3, item3.4,...
"SO302994",items1,item2,item3.1,item3.2,...
"SO302995",items1,item2,item3.1,item3.2,item3.3,...

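This produces an uneven number of columns in the CSV.

I tried adding .item3[:]? in Python style, but that did not work.

Any help would be greatly appreciated! If I'm unclear, please ask for clarification. My snippet and toy data are at the link above.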

{
  "items": [
    {
      "name": "Mr Simon Mackin",
      "country_of_residence": "Scotland",
      "natures_of_control": [
        "voting-rights-25-to-50-percent-limited-liability-partnership",
        "significant-influence-or-control-limited-liability-partnership"
      ],
      "premises": "4"
    }
  ]
}
{
  "items": [
    {
      "name": "Mrs Simonne Mackinni",
      "country_of_residence": "France",
      "natures_of_control": [
        "significant-influence-or-control-limited-liability-partnership"
      ],
      "premises": "4"
    }
  ]
}

With this query:

.items[] | [.name, .country_of_residence, .natures_of_control[]?, .premises] | @csv

I get this result:

"Mr Simon Mackin","Scotland","voting-rights","significant-influence","4"
"Mrs Simonne Mackinni","France","significant-influence","4"

But I would like to get this (the second row has an extra comma after "significant-influence", marking an empty field):

"Mr Simon Mackin","Scotland","voting-rights","significant-influence","4"
"Mrs Simonne Mackinni","France","significant-influence",,"4"

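1 Answer:

Answer 0: (score: 3)

Since you want a rectangular result, you have to "pad" the "natures_of_control" arrays. Given the sample input, which is a stream of separate JSON objects, you will need to "slurp" the input so that the global maximum array length can be computed.

To pad the arrays, you can use a helper function: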

# emit a stream of exactly $n items
def pad($n): range(0;$n) as $i | .[$i];
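
To see what the helper does, here is a small sketch; the input ["a","b"] and the width 4 are made up for illustration. Indexing past the end of an array yields null in jq, and @csv renders null as an empty field, which is what produces the trailing commas.

```shell
# Illustrative input: a 2-element array padded to a hypothetical width of 4.
padded=$(jq -nc 'def pad($n): range(0;$n) as $i | .[$i];
  ["a","b"] | [pad(4)]')
echo "$padded"   # ["a","b",null,null]

# The same padded array rendered as a CSV row: nulls become empty fields.
row=$(jq -nr 'def pad($n): range(0;$n) as $i | .[$i];
  ["a","b"] | [pad(4)] | @csv')
echo "$row"      # "a","b",,
```

Because pad emits exactly $n items, an array longer than $n would be truncated, so $n should be the maximum length across all rows.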

The solution to the problem as posted on jqplay then becomes:

([.[] | .items[] | .natures_of_control | length] | max) as $mx
| .[]
| (.active_count) as $active_count
| (.ceased_count) as $ceased_count
| (.links.self | split("/")[2]) as $companyCode
| .items[]
| [$companyCode, $active_count, $ceased_count, .name, .country_of_residence, .nationality, .notified_on, (.natures_of_control | pad($mx))]
| @csv

Invocation

An appropriate invocation would look like this:

jq -sr -f program.jq input.json
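
As a sketch of the same approach applied to the toy data in the question, the snippet below pipes two JSON documents into jq. The natures_of_control values are shortened the way the question's sample output shortens them, and the field list is reduced to the four columns the question asks for; both are simplifications for illustration. The -s (--slurp) option collects all input documents into one array, and -r (--raw-output) prints the CSV rows without JSON string quoting.

```shell
# Toy data from the question, with shortened values; two separate JSON documents.
input='{"items":[{"name":"Mr Simon Mackin","country_of_residence":"Scotland","natures_of_control":["voting-rights","significant-influence"],"premises":"4"}]}
{"items":[{"name":"Mrs Simonne Mackinni","country_of_residence":"France","natures_of_control":["significant-influence"],"premises":"4"}]}'

csv=$(printf '%s\n' "$input" | jq -sr '
  # emit a stream of exactly $n items
  def pad($n): range(0;$n) as $i | .[$i];
  # longest natures_of_control array across all slurped documents
  ([.[] | .items[] | .natures_of_control | length] | max) as $mx
  | .[]
  | .items[]
  | [.name, .country_of_residence, (.natures_of_control | pad($mx)), .premises]
  | @csv')
echo "$csv"
# "Mr Simon Mackin","Scotland","voting-rights","significant-influence","4"
# "Mrs Simonne Mackinni","France","significant-influence",,"4"
```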

Handling missing data

To ignore objects that have no "items", the above can be tweaked, for example as follows:

([.[] | .items[]? | .natures_of_control | length] | max) as $mx
 | .[]
 | select(.items)
 | (.active_count) as $active_count
 | (.ceased_count) as $ceased_count
 | (.links.self | split("/")[2]) as $companyCode
 | .items[]
 | [$companyCode, $active_count, $ceased_count, .name, .country_of_residence, .nationality, .notified_on, (.natures_of_control | pad($mx))]
 | @csv
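
A minimal sketch of that adjustment on made-up input, where the second object has no "items" key (the names "A" and "B" and the values "x" and "y" are placeholders). The `?` in `.items[]?` suppresses the error jq would otherwise raise when iterating over null, and `select(.items)` drops such objects before any rows are built.

```shell
# Hypothetical input: the second object has no "items" key.
input='{"items":[{"name":"A","natures_of_control":["x","y"]}]}
{"active_count":0}
{"items":[{"name":"B","natures_of_control":["x"]}]}'

csv=$(printf '%s\n' "$input" | jq -sr '
  def pad($n): range(0;$n) as $i | .[$i];
  # .items[]? yields nothing for objects without "items"
  ([.[] | .items[]? | .natures_of_control | length] | max) as $mx
  | .[]
  | select(.items)
  | .items[]
  | [.name, (.natures_of_control | pad($mx))]
  | @csv')
echo "$csv"
# "A","x","y"
# "B","x",
```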