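I need to take the values under "Exons" in the following JSON input, split each of them on ";", and convert them into nested JSON, as shown in the expected output section below:
{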
"regions":[
{
"metric":"GENE1",
"value":[
{
"metric":"Exons",
"value":[
"GENE1;chr1;45656;5656667"
],
"type":"set"
},
{
"metric":"Precent_no_call",
"value":4.22623,
"type":"simple"
},
{
"metric":"Total_NoCall_bases",
"value":112533,
"type":"simple"
}
],
"type":"metrics-set"
},
{
"metric":"GENE2",
"value":[
{
"metric":"Exons",
"value":[
"GENE2_Exon5;chr1;45656;5656667",
"GENE2_Exon10;chr1;45656;5656667"
],
"type":"set"
},
{
"metric":"Precent_no_call",
"value":0.746464,
"type":"simple"
},
{
"metric":"Total_NoCall_bases",
"value":16842,
"type":"simple"
}
],
"type":"metrics-set"
}
]
}
Expected output:
{
"regions":[
{
"metric":"GENE1",
"value":[
{
"metric":"Exons",
"value":[
"GENE1",
{
"chromosome":"chr1",
"start":45656,
"end":5656667
}
],
"type":"set"
},
{
"metric":"Precent_no_call",
"value":4.22623,
"type":"simple"
},
{
"metric":"Total_NoCall_bases",
"value":112533,
"type":"simple"
}
],
"type":"metrics-set"
},
{
"metric":"GENE2",
"value":[
{
"metric":"Exons",
"value":[
"GENE2_Exon5",
{
"chromosome":"chr1",
"start":45656,
"end":5656667
},
"GENE2_Exon10",
{
"chromosome":"chr1",
"start":45656,
"end":5656667
}
],
"type":"set"
},
{
"metric":"Precent_no_call",
"value":0.746464,
"type":"simple"
},
{
"metric":"Total_NoCall_bases",
"value":16842,
"type":"simple"
}
],
"type":"metrics-set"
}
]
}
This is also related to the question here: Converting comma separated file to nested objects json in jq
Thanks in advance for your help.
def parse:
  [
    inputs                         # read lines
    | split(",")                   # split into columns
    | select(length>0)             # eliminate blanks
    | .[:1] + [.[1:-3]] + .[-3:]   # normalize columns
  ]
;
def simple(n;v): {metric:n, value:v|tonumber, type:"simple"};
def set(n;v): {metric:n, value:v, type:"set"};
def chr(c;s;e): {chromosome:c, start:s, end:e};
def region:
  set(.[0]; [
    set("Exons"; [
      (.[1] | tostring | split(";") | .[0]),
      # arguments to a jq function are separated by ";", not ","
      chr(.[1] | tostring | split(";") | .[1];
          .[1] | tostring | split(";") | .[2];
          .[1] | tostring | split(";") | .[3])
    ]),
    simple("Fraction of bases"; .[5]),
    simple("Total_bases"; .[6])
  ])
;
{
  "Regions": parse | map(region)
}
I can't get this to loop over the Exons entries and read them recursively.
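Answer 0 (score: 1)
Since the low-level requirements are clear enough, I've put together the following solution, which behaves exactly as in the example. The higher-level requirements, however, are rather sketchy, so you may need to make some adjustments.
The low-level requirement (converting the string) can be implemented as follows: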
# Input: a string
def gene2object:
split(";")
| [.[0], { chromosome: .[1],
start: (.[2]|tonumber),
end: (.[3]|tonumber)} ];
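To see what gene2object does on its own, here is a small sketch that applies it to one of the strings from the question. The definition is the same as above; the sample string is taken from the GENE2 entry, and running it with jq -n is just an assumption about how you would try it out.

```jq
# Input: a string of the form "name;chromosome;start;end"
def gene2object:
  split(";")
  | [.[0], { chromosome: .[1],
             start: (.[2]|tonumber),
             end:   (.[3]|tonumber) } ];

# Apply the filter to a single sample string (run with: jq -n -f thisfile.jq)
"GENE2_Exon5;chr1;45656;5656667" | gene2object
# → ["GENE2_Exon5", {"chromosome":"chr1","start":45656,"end":5656667}]
```

Each string therefore becomes a two-element array, which is why the per-string results need to be concatenated afterwards to get the flat layout shown in the expected output.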
With that in place, a solution can be written simply as follows:
walk( if type == "object" and .metric == "Exons"
then .value |= (map(gene2object)|add)
else .
end )
A standard invocation (along the lines of jq -f program.jq input.json) produces output exactly as shown above, so I won't repeat it here.
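If the "Exons" entries always sit at the same depth as in the sample (that is an assumption on my part, the question doesn't say so), a path-based update might be an alternative that avoids walk altogether. The following is only a sketch; the inline sample object is a trimmed copy of the GENE1 region so that the snippet can be run by itself with jq -n.

```jq
# Same helper as above
def gene2object:
  split(";")
  | [.[0], { chromosome: .[1],
             start: (.[2]|tonumber),
             end:   (.[3]|tonumber) } ];

# Trimmed sample input, inlined only so the sketch is self-contained
{ regions: [
    { metric: "GENE1",
      value: [
        { metric: "Exons",
          value: ["GENE1;chr1;45656;5656667"],
          type: "set" },
        { metric: "Precent_no_call", value: 4.22623, type: "simple" }
      ],
      type: "metrics-set" } ] }
# Update only the .value of entries whose metric is "Exons";
# this assumes they are found at .regions[].value[]
| (.regions[].value[] | select(.metric == "Exons") | .value)
    |= (map(gene2object) | add)
```

With a real file you would drop the inline object and keep just the last statement (without the leading pipe). The walk version is more forgiving if the nesting depth varies.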
If your jq does not have walk/1, you can get its official definition from https://github.com/stedolan/jq/blob/master/src/builtin.jq
That is, search for: def walk
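For orientation, a definition along the following lines can be placed at the top of program.jq. This is a sketch from memory in the spirit of the builtin, not a verbatim copy of the current upstream source, and it assumes your jq provides keys_unsorted. The small example at the end is only there to show the bottom-up traversal.

```jq
# Apply f to every value in the input, bottom-up:
# children are transformed first, then f is applied to the rebuilt parent.
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys_unsorted[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } )
      | f
    elif type == "array" then
      map(walk(f)) | f
    else
      f
    end;

# Example (run with jq -n): increment every number in a nested structure
[1, [2, {a: 3}]] | walk(if type == "number" then . + 1 else . end)
# → [2,[3,{"a":4}]]
```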