How can I use jq to merge the arrays from two files into a single array?

Time: 2018-02-28 20:25:52

Tags: json join jq

I want to merge two files containing JSON. Each file contains an array of JSON objects.

registration.json

[
    { "name": "User1", "registration": "2009-04-18T21:55:40Z" },
    { "name": "User2", "registration": "2010-11-17T15:09:43Z" }
]

useredits.json

[
    { "name": "User1", "editcount": 164 },
    { "name": "User2", "editcount": 150 },
    { "name": "User3", "editcount": 10 }
]

Ideally, I would like the merge to produce the following result:

[
    { "name": "User1", "editcount": 164, "registration": "2009-04-18T21:55:40Z" },
    { "name": "User2", "editcount": 150, "registration": "2010-11-17T15:09:43Z" }
]

I found https://github.com/stedolan/jq/issues/1247#issuecomment-348817802, but when I try it I get

jq: error: module not found: jq

3 answers:

Answer 0 (score: 2)

A jq solution:

jq -s '[ .[0] + .[1] | group_by(.name)[] 
          | select(length > 1) | add ]' registration.json useredits.json

Output:

[
  {
    "name": "User1",
    "registration": "2009-04-18T21:55:40Z",
    "editcount": 164
  },
  {
    "name": "User2",
    "registration": "2010-11-17T15:09:43Z",
    "editcount": 150
  }
]
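
To see why this yields only the users present in both files, the pipeline can be run in two steps. The following is a minimal sketch, assuming jq 1.5 or later is installed; the scratch directory and the variable names sizes and merged are only illustrative.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# Recreate the two sample files from the question.
cat > registration.json <<'EOF'
[
  { "name": "User1", "registration": "2009-04-18T21:55:40Z" },
  { "name": "User2", "registration": "2010-11-17T15:09:43Z" }
]
EOF
cat > useredits.json <<'EOF'
[
  { "name": "User1", "editcount": 164 },
  { "name": "User2", "editcount": 150 },
  { "name": "User3", "editcount": 10 }
]
EOF

# Step 1: -s slurps both files into one array [file1, file2];
# .[0] + .[1] concatenates them and group_by(.name) buckets the objects by name.
sizes=$(jq -s -c '.[0] + .[1] | group_by(.name) | map(length)' \
  registration.json useredits.json)
echo "$sizes"    # [2,2,1] : User3 appears only once

# Step 2: keep only the groups with more than one member, then merge the
# objects of each remaining group with add.
merged=$(jq -s -c '[ .[0] + .[1] | group_by(.name)[]
                     | select(length > 1) | add ]' \
  registration.json useredits.json)
echo "$merged"
```

Note that select(length > 1) only counts occurrences: a name that appeared twice in the same file would also pass the filter, so this approach relies on names being unique within each file.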

Answer 1 (score: 1)

Although it does not strictly answer the question, the following command

jq -s 'flatten | group_by(.name) | map(reduce .[] as $x ({}; . * $x))' \
      registration.json useredits.json

produces this output (User3, which appears only in useredits.json, is kept):

[
    { "name": "User1", "editcount": 164, "registration": "2009-04-18T21:55:40Z" },
    { "name": "User2", "editcount": 150, "registration": "2010-11-17T15:09:43Z" },
    { "name": "User3", "editcount": 10 }
]

Source: jq - error when merging two JSON files "cannot be multiplied"
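
The command in this answer behaves like a full outer join, which is why User3 is kept. If only users present in both files are wanted, one option is to filter the merged objects afterwards. This is a minimal sketch, assuming jq 1.5 or later; the final select on has("registration") and has("editcount") is an addition tailored to the field names of the sample data and is not part of the original answer.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# Recreate the two sample files from the question.
cat > registration.json <<'EOF'
[
  { "name": "User1", "registration": "2009-04-18T21:55:40Z" },
  { "name": "User2", "registration": "2010-11-17T15:09:43Z" }
]
EOF
cat > useredits.json <<'EOF'
[
  { "name": "User1", "editcount": 164 },
  { "name": "User2", "editcount": 150 },
  { "name": "User3", "editcount": 10 }
]
EOF

# Merge every group as in the answer, then keep only the objects that
# received a field from each file (added filter, specific to this data).
inner=$(jq -s -c 'flatten | group_by(.name)
  | map(reduce .[] as $x ({}; . * $x))
  | map(select(has("registration") and has("editcount")))' \
  registration.json useredits.json)
echo "$inner"
```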

Answer 2 (score: 0)

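The following assumes that you have jq 1.5 or later, and that:

  • joins.jq, as shown below, is in the directory ~/.jq/ or the directory ~/.jq/joins/
  • there is no file named joins.jq in the current working directory (pwd)
  • registration.json has been fixed so that it is valid JSON (incidentally, this can be done by jq itself).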

The invocation would then be:

jq -s 'include "joins"; joins(.name)' registration.json useredits.json
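
If keeping the module under ~/.jq/ is inconvenient, jq's -L option can add another directory to the module search path. The sketch below shows how include resolves a module that way, assuming jq 1.5 or later; greet.jq and its greet function are hypothetical stand-ins for joins.jq, and the scratch directory is only illustrative.

```shell
# Scratch directory standing in for wherever joins.jq is kept.
moddir=$(mktemp -d)

# A tiny hypothetical module; joins.jq would be saved in the same way.
cat > "$moddir/greet.jq" <<'EOF'
def greet: "hello from module";
EOF

# -L prepends the directory to the module search path, so include "greet"
# resolves to $moddir/greet.jq. -n runs without input, -r prints raw strings.
out=$(jq -L "$moddir" -n -r 'include "greet"; greet')
echo "$out"    # hello from module
```

With joins.jq saved in such a directory, the same -L option could be added to the invocation above.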

joins.jq

# joins.jq Version 1 (12-12-2017)

def distinct(s):
  reduce s as $x ({}; .[$x | (type[0:1] + tostring)] = $x)
  |.[];

# Relational Join
# joins/6 provides similar functionality to the SQL INNER JOIN statement:
#   SELECT (Table1|p1), (Table2|p2)
#     FROM Table1
#     INNER JOIN Table2 ON (Table1|filter1) = (Table2|filter2)
# where filter1, filter2, p1 and p2 are filters.

# joins(s1; s2; filter1; filter2; p1; p2)
# s1 and s2 are streams of objects corresponding to rows in Table1 and Table2;
# filter1 and filter2 determine the join criteria;
# p1 and p2 are filters determining the final results.
# Input: ignored
# Output: a stream of distinct pairs [p1, p2]
# Note: items in s1 for which filter1 == null are ignored, otherwise all rows are considered.
#
def joins(s1; s2; filter1; filter2; p1; p2):
  def it: type[0:1] + tostring;
  def ix(s;f):
    reduce s as $x ({};  ($x|f) as $y | if $y == null then . else .[$y|it] += [$x] end);
  # combine two dictionaries using the cartesian product of distinct elements
  def merge:
    .[0] as $d1 | .[1] as $d2
    | ($d1|keys_unsorted[]) as $k
    | if $d2[$k] then distinct($d1[$k][]|p1) as $a | distinct($d2[$k][]|p2) as $b | [$a,$b]
      else empty end;

   [ix(s1; filter1), ix(s2; filter2)] | merge;

def joins(s1; s2; filter1; filter2):
  joins(s1; s2; filter1; filter2; .; .) | add ;

# Input: an array of two arrays of objects
# Output: a stream of the joined objects
def joins(filter1; filter2):
  joins(.[0][]; .[1][]; filter1; filter2);

# Input: an array of arrays of objects.
# Output: a stream of the joined objects where f defines the join criterion.
def joins(f):
  # j/0 is defined so TCO is applicable
  def j:
    if length < 2 then .[][]
    else [[ joins(.[0][]; .[1][]; f; f)]] + .[2:] | j
    end;
   j ;
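
The core idea behind joins.jq, indexing one table by the join key and then looking up the rows of the other table, can also be sketched without the module. The following is a simplified illustration, not a replacement for the module: it assumes jq 1.5 or later, that .name is always a string, and that each name occurs at most once in registration.json. The variable $reg is an illustrative name.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# Recreate the two sample files from the question.
cat > registration.json <<'EOF'
[
  { "name": "User1", "registration": "2009-04-18T21:55:40Z" },
  { "name": "User2", "registration": "2010-11-17T15:09:43Z" }
]
EOF
cat > useredits.json <<'EOF'
[
  { "name": "User1", "editcount": 164 },
  { "name": "User2", "editcount": 150 },
  { "name": "User3", "editcount": 10 }
]
EOF

# Build an index of registration.json keyed by name, then look up each
# row of useredits.json in that index.
joined=$(jq -s -c '
  (reduce .[0][] as $r ({}; .[$r.name] = $r)) as $reg
  | [ .[1][]
      | select($reg[.name] != null)   # drop rows without a match (User3)
      | . + $reg[.name] ]             # merge in the matching registration row
' registration.json useredits.json)
echo "$joined"
```

Unlike this sketch, the ix helper in joins.jq collects all rows that share a key into an array and prefixes each key with a type letter, so it also handles duplicate and non-string keys.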