Merge YAML files and regroup them by the values of a list-valued key

Asked: 2019-08-21 07:33:18

Tags: json yaml jq

I have a set of YAML files, each describing one project. Each file has a key sdgs holding a list of numbers that identify Sustainable Development Goals.

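I would like to merge all the files and convert them to another format, with the SDG index as the key and the list of related projects as the value.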

Input:

---
# gnu_health.yaml
description: >
  GNU Health is a Free/Libre project for health practitioners, health
  institutions and governments. It provides the functionality of Electronic
  Medical Record (EMR), Hospital Management (HMIS) and Health Information
  System (HIS).
sdgs: [3]
name: GNU Health 

---
# a11y.yaml
description: >
  This Accessibility Project is a community-driven effort to make web
  accessibility easier by leveraging a worldwide community of developer
  knowledge.
sdgs: [10]
name: A11Y

---
# bahmni.yaml
description: >
  Bahmni is an Open Source hospital Management System focusing
  on poor/underserved and public hospitals in the developing
  world.
  It's aimed to being a generic system which can be used for
  multiple diseases and hospitals in different countries.
sdgs: [1, 3]
name: Bahmni

Expected output:

{
  "1": [
    {
      "name": "Bahmni",
      "description": "..."
    }
  ],
  "3": [
    {
      "name": "GNU Health",
      "description": "..."
    },
    {
      "name": "Bahmni",
      "description": "..."
    }
  ],
  "10": [
    {
      "name": "A11Y",
      "description": "..."
    }
  ]
}

Even after reading the manual and other awesome-jq resources, I still find it hard to solve this with jq's filter system.

Could someone point me in the right direction?

My best attempt so far:

# use as follows: yq -f $binDir/concat_sdgs.jq $srcDir/*.y*ml

# concat_sdgs.jq
{
  (.sdgs[]|tostring): [.]
}

Unfortunately, this does not merge projects that share the same SDG.

Current, incorrect output:

{
  "1": [
    {
      "name": "Bahmni",
      "description": "..."
    }
  ],
  "3": [
    {
      "name": "GNU Health",
      "description": "..."
    }
  ],
  "3": [
    {
      "name": "Bahmni",
      "description": "..."
    }
  ],
  "10": [
    {
      "name": "A11Y",
      "description": "..."
    }
  ]
}

1 answer:

Answer 0 (score: 3)

The good news is that you are close.

For simplicity, I will assume the .yaml files have already been converted to .json. With a small tweak to your filter, it is easy to see that:

jq '{ (.sdgs[]|tostring): del(.sdgs) }' a11y.json gnu_health.json bahmni.json

produces a stream of four single-key objects, one per (project, SDG) pair, which is already very close to what is wanted.
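To try this without creating any files, here is a minimal sketch that feeds the three projects to jq on standard input. The inline JSON, the shortened "..." descriptions, and the variable names projects and stream are stand-ins for illustration, not part of the original setup.

```shell
# Three inline JSON documents stand in for the converted project files
# (descriptions shortened to "..." for illustration only).
projects='{"name":"A11Y","description":"...","sdgs":[10]}
{"name":"GNU Health","description":"...","sdgs":[3]}
{"name":"Bahmni","description":"...","sdgs":[1,3]}'

# -c prints each result on a single line.
# Bahmni lists two SDGs, so it contributes two objects: four lines in total.
stream=$(printf '%s\n' "$projects" | jq -c '{ (.sdgs[]|tostring): del(.sdgs) }')
printf '%s\n' "$stream"
```

Each printed line is an object with exactly one key (the SDG number as a string) whose value is the project with its sdgs field removed.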

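Combining them into a single object is a bit trickier. To keep things simple, let's first define a helper function that takes an array of objects and collects, for each key, all of the values found under that key: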

  def group_by_keys: reduce .[] as $o ({};
     reduce ($o | to_entries[]) as $kv (.; .[$kv.key] += [$kv.value]));
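As a quick check of what the helper does, the sketch below applies it to a small hand-written array. The array uses plain project names as values instead of full project objects, purely to keep the example short; the variable name grouped is likewise illustrative.

```shell
# -n: no input is read; the array literal inside the program is the data.
grouped=$(jq -nc '
  def group_by_keys: reduce .[] as $o ({};
     reduce ($o | to_entries[]) as $kv (.; .[$kv.key] += [$kv.value]));
  # Two of the three objects share the key "3".
  [ {"3": "GNU Health"}, {"1": "Bahmni"}, {"3": "Bahmni"} ] | group_by_keys
')
printf '%s\n' "$grouped"
```

The two values stored under "3" end up in one list. This works because .[$kv.key] is null the first time a key is seen, and null + [x] evaluates to [x] in jq.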

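Next, we use inputs together with the -n command-line option, so that the objects from all the files are collected into a single array: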

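jq -n '
  # Merge an array of objects into one object, collecting the values of each key in a list.
  def group_by_keys: reduce .[] as $o ({};
     reduce ($o | to_entries[]) as $kv (.; .[$kv.key] += [$kv.value]));
  # Emit one single-key object per (project, SDG) pair, then group them by key.
  [inputs | {(.sdgs[]|tostring): del(.sdgs) }] | group_by_keys
' a11y.json gnu_health.json bahmni.json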

(Don't forget the -n.)
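The reason -n matters: without it, jq reads the first document as . before the program starts, so inputs only sees the remaining ones and the first project is silently dropped. The sketch below shows the difference using inline stand-in data and a simplified filter that just collects the names; the variable names are illustrative.

```shell
# Inline stand-ins for the three converted project files.
projects='{"name":"A11Y","sdgs":[10]}
{"name":"GNU Health","sdgs":[3]}
{"name":"Bahmni","sdgs":[1,3]}'

# With -n, inputs yields every document.
with_n=$(printf '%s\n' "$projects" | jq -nc '[inputs | .name]')

# Without -n, the first document is consumed as "." and never reaches inputs.
without_n=$(printf '%s\n' "$projects" | jq -c '[inputs | .name]')

printf 'with -n:    %s\n' "$with_n"
printf 'without -n: %s\n' "$without_n"
```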

If the order of the keys matters, pipe the result through the following filter, which sorts the keys numerically:

def sort_by_keys:
  to_entries
  | sort_by(.key|tonumber)
  | from_entries;
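Putting the pieces together, here is a sketch of the whole pipeline with the numeric key sort appended at the end. As before, the inline JSON with "..." descriptions and the variable names projects and sorted are placeholders for illustration; with real files you would pass the file names to jq instead of piping standard input.

```shell
# Inline stand-ins for the three converted project files.
projects='{"name":"A11Y","description":"...","sdgs":[10]}
{"name":"GNU Health","description":"...","sdgs":[3]}
{"name":"Bahmni","description":"...","sdgs":[1,3]}'

sorted=$(printf '%s\n' "$projects" | jq -n '
  def group_by_keys: reduce .[] as $o ({};
     reduce ($o | to_entries[]) as $kv (.; .[$kv.key] += [$kv.value]));
  # Keys are strings, so convert them to numbers to sort 1, 3, 10 rather than 1, 10, 3.
  def sort_by_keys: to_entries | sort_by(.key|tonumber) | from_entries;
  [inputs | {(.sdgs[]|tostring): del(.sdgs)}] | group_by_keys | sort_by_keys
')
printf '%s\n' "$sorted"
```

The tonumber step is what distinguishes this from a plain string sort, which would place "10" before "3".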