我有json(实际上以csv开头)形式的元素数组的形式:
{
"field1" : "value1"
"field2.1; Field2.2 Field2.3" : "Field2.1Value0; Field2.2Value0; Field2.3Value0; Field2.1Value1; Field2.2Value1; Field2.3Value1; ..."
}
...
我想迭代字段的字符串" field2.1; Field2.2 Field2.3",three&#34 ;;"一次分隔项目以生成一组键值对
{
"field1" : "value1"
"newfield" : [
{ "Field2.1": "Field2.1Value0",
"Field2.2": "Field2.2Value0",
"Field2.3": "Field2.1Value0" },
{ "Field2.1": "Field2.1Value1",
"Field2.2": "Field2.2Value1",
"Field2.3": "Field2.3Value1"},
...
]
}
...
请注意,实际上有几个键需要像这样扩展。每个都有可变数量的"子键"。
换句话说,原始CSV文件包含一些列,这些列表示以分号分隔的字段值的元组。
我知道如何进入" field2.1; Field2.2 Field2.3"并说它在";"但后来我一直试图迭代那3个(或多个)项目来产生单独的3个元组。
真实世界的示例/上下文是从Google Play商店导出目录的CSV格式。
例如,Field2.1是Locale,Field2.2是Title,Field3.3是Description:
jq '."Locale; Title; Description" |= split(";") '
如果可能的话,如果迭代是基于分号分隔的子场数和#34;那么这将是很好的。在关键值。还有另一列在每个国家/地区都有类似的价格格式。
答案 0 :(得分:1)
以下假设splits/1
可用于根据正则表达式拆分字符串。如果你的jq没有它,如果你不能或不想升级,你可以使用split/1
设计一个解决方法,它只适用于字符串。
首先,让我们从一个简单的问题变体开始,不需要回收标题。如果以下jq程序在文件中(比如program.jq):
# Assuming header is an array of strings,
# create an object from an array of values:
def objectify(headers):
. as $in
| reduce range(0; headers|length) as $i ({}; .[headers[$i]] = ($in[$i]) );
# From an object of the form {key: _, value: _},
# construct an object by splitting each _
def devolve:
if .key|index(";")
then .key as $key
| ( [.value | splits("; *")] ) | objectify([$key | splits("; *")])
else { (.key): .value }
end;
to_entries | map( devolve )
如果以下JSON在input.json中:
{
"field1" : "value1",
"field2.1; Field2.2; Field2.3" : "Field2.1Value0; Field2.2Value0; Field2.3Value0"
}
然后调用:
jq -f program.jq input.json
应该产生:
[
{
"field1": "value1"
},
{
"field2.1": "Field2.1Value0",
"Field2.2": "Field2.2Value0",
"Field2.3": "Field2.3Value0"
}
]
添加一些错误检查或纠错代码可能是有意义的。
现在让我们修改上述内容,以便根据问题陈述回收标题。
def objectifyRows(headers):
(headers|length) as $m
| (length / $m) as $n
| . as $in
| reduce range(0; $n) as $i ( [];
.[$i] = (reduce range(0; $m) as $h ({};
.[headers[$h]] = $in[($i * $m) + $h] ) ) );
def devolveRows:
if .key|index(";")
then .key as $key
| ( [.value | splits("; *")] )
| objectifyRows([$key | splits("; *")])
else { (.key): .value }
end;
to_entries | map( devolveRows )
输入:
{
"field1" : "value1",
"field2.1; Field2.2; Field2.3" :
"Field2.1Value0; Field2.2Value0; Field2.3Value0; Field2.4Value0; Field2.5Value0; Field2.6Value0"
}
输出将是:
[
{
"field1": "value1"
},
[
{
"field2.1": "Field2.1Value0",
"Field2.2": "Field2.2Value0",
"Field2.3": "Field2.3Value0"
},
{
"field2.1": "Field2.4Value0",
"Field2.2": "Field2.5Value0",
"Field2.3": "Field2.6Value0"
}
]
]
现在可以根据OP建议的线路轻松调整此输出,例如:为了引入一个新的密钥,可以将上述内容输入:
.[0] + { newfield: .[1] }
以下是objectify
和objectifyRows
的无减少但有效(假设jq> = 1.5)的实现:
def objectify(headers):
[headers, .] | transpose | map( {(.[0]): .[1]} ) | add;
def objectifyRows(headers):
def gather(n):
def g: if length>0 then .[0:n], (.[n:] | g ) else empty end;
g;
[gather(headers|length) | objectify(headers)] ;
答案 1 :(得分:0)
这是我几乎最终的解决方案,它插入新密钥以及使用“;”的第一个元素列表作为排序数组的关键。
def objectifyRows(headers):
(headers|length) as $m
| (headers[0]) as $firstkey
| (length / $m) as $n
| . as $in
| reduce range(0; $n) as $i ( [];
.[$i] = (reduce range(0; $m) as $h ({};
.[headers[$h]] = $in[($i * $m) + $h] ) ) )
;
def devolveRows:
if .key|index(";")
then .key as $multikey
| ( [.value | splits("; *")] )
# Create a new key with value being an array of the "splits"
| { ($multikey): objectifyRows([$multikey | splits("; *")])}
# here "arbitrarily" sort by the first split key
| .[$multikey] |= sort_by(.[[$multikey | splits("; *")][0]])
else { (.key): .value }
end;
to_entries | map( devolveRows )