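I have a huge JSON file with some very deep paths. I would like to use jq to show the first N levels of keys while hiding everything deeper. Then, once I have found the key I am interested in, I want to keep drilling down, showing only N levels from my new starting point, similar to the way a text editor folds everything below level N. Is this possible?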
Answer 0 (score: 1)
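If you are interested in looking at objects at a particular depth, you can use getpath and paths. paths returns the paths to all values in the input. You can filter those paths down to the ones of a specific length and then use getpath to obtain the corresponding values.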
getpath(paths | select(length == 3))
From there you can filter and narrow things down however you like.
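As a minimal sketch of this approach, the snippet below lists shallow paths, picks out the values at one depth, and then drills down from a chosen key. The sample document and its key names are made up for illustration; only paths, getpath, select and length come from the answer above.

```shell
# Hypothetical sample document, for illustration only.
doc='{"a":{"b":{"c":1,"d":[2,3]}},"e":4}'

# List every path that is at most 2 levels deep, one compact array per line.
shallow=$(printf '%s' "$doc" | jq -c 'paths | select(length <= 2)')
printf '%s\n' "$shallow"

# Show the values that sit exactly 3 levels deep.
values=$(printf '%s' "$doc" | jq -c 'getpath(paths | select(length == 3))')
printf '%s\n' "$values"

# Drill down: restart from .a.b and list only the paths 1 level below it.
below=$(printf '%s' "$doc" | jq -c '.a.b | paths | select(length <= 1)')
printf '%s\n' "$below"
```

Restarting from a sub-path such as .a.b makes the depth limit relative to the new starting point, which is close to the "fold below level N" behaviour asked about.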
Answer 1 (score: 0)
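Appended below is a jq schema-inference program that can be used to understand the structure of a large JSON object or an array of JSON entities, at least when there is some rhyme or reason behind it.
Usage: if the JSON entity of interest is in the file input.json, then, assuming the program below is in schema.jq, run:
jq -f schema.jq input.json
For a very large file, schema inference can be slow, but it is usually faster to do it this way than to use some kind of iterative approach. See, for example, the remarks following the example given below.
Here is an example using JSON=JEOPARDY_QUESTIONS1.json, a 54MB file (55554625 bytes) available from https://raw.githubusercontent.com/alicemaz/super_jeopardy/master/JEOPARDY_QUESTIONS1.json
$ time jq -c -f schema.jq $JSON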
[
{
"air_date": "string",
"answer": "string",
"category": "string",
"question": "string",
"round": "string",
"show_number": "string",
"value": "string"
}
]
real 0m12.868s
user 0m11.713s
sys 0m0.342s
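The u+s time is noteworthy because producing the path synopsis with the streaming parser (see synopsis.jq on this page) takes about two-thirds of that u+s time on the same machine. This may be counter-intuitive, given that the JSON file is an array of length 216,930.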
# Schema inference
# Version 0.1
# Author: pkoppstein at gmail dot com
# Requires: jq 1.4 or higher
# This module defines three filters:
# typeof/0 returns the extended-type of its input;
# typeUnion(a;b) returns the union of the two specified extended-type values;
# schema/0 returns the typeUnion of the extended-type values of the entities
# in the input array, if the input is an array,
# otherwise it simply returns the "typeof" value of its input.
# Each extended type can be thought of as a set of JSON entities,
# e.g. "number" for the set of JSON numbers, and ["number"] for the
# set of JSON number-valued arrays including [].
# The extended-type values are always JSON entities.
# The possible values are:
# "null", "boolean", "string", "number";
# "scalar" for any combination of non-null scalars;
# [T] where T is an extended type;
# an object all of whose values are extended types;
# "JSON" signifying that no other extended-type value is applicable.
# The extended-type values are defined recursively:
# The extended-type of a scalar value is its JSON type.
# The extended-type of a non-empty array of values all of which have the
# same JSON type, t, is [t], and similarly for ["scalar"], and ["JSON"].
# The extended-type of [] is ["null"], since that is the extended type of all arrays
# which have no elements other than null.
# The extended-type of an object is an object with the same keys, but the
# values of which are the extended-types of the corresponding values.
# typeUnion(a;b) returns the least extended-type value that subsumes both a and b.
# For example:
# typeUnion("number"; "string") yields "scalar";
# typeUnion({"a": "number"}; {"b": "string"}) yields {"a": "number", "b": "string"};
# typeUnion("null"; t) yields t for any valid extended type, t.
def typeUnion(a;b):
  def scalarp: . == "boolean" or . == "string" or . == "number" or . == "scalar";
  a as $a | b as $b
  | if $a == $b then $a
    elif ($a | scalarp) and ($b | scalarp) then "scalar"
    elif $a == "JSON" or $b == "JSON" then "JSON"
    elif ($a|type) == "array" and ($b|type) == "array" then [ typeUnion($a[0]; $b[0]) ]
    elif ($a|type) == "object" and ($b|type) == "object" then
      ((($a|keys) + ($b|keys)) | unique) as $keys
      | reduce $keys[] as $key ( {} ; .[$key] = typeUnion( $a[$key]; $b[$key]) )
    elif $a == "null" or $a == null then $b
    elif $b == "null" or $b == null then $a
    else "JSON"
    end ;
def typeof:
  def typeofArray:
    if length == 0 then ["null"]
    else [reduce .[] as $item (null; typeUnion(.; $item|typeof))]
    end ;
  def typeofObject:
    reduce keys[] as $key (. ; .[$key] |= typeof) ;
  . as $in
  | type
  | if . == "string" or . == "number" or . == "null" or . == "boolean" then .
    elif . == "object" then $in | typeofObject
    else $in | typeofArray
    end ;
# Omit the outermost [] for an array
def schema:
  if type == "array" then reduce .[] as $x ("null"; typeUnion(.; $x|typeof))
  else typeof
  end ;
# Example top-level:
schema
Answer 2 (score: 0)
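Here is a filter that emits a stream of synopses of all paths of length <= depth in the input entity. If depth is 0, the depth limit is ignored; if depth is negative, only paths of length exactly -depth are selected.
The synopsis of a path [p1, p2, ...] is constructed by replacing integer components with ".[]" and prefixing string components with ".", so that, for example, if i and j are integers, then [i, "keyname", j] is rendered as .[].keyname.[]
Here is a sample of the output produced with jq -r: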
.[]
.[].data
.[].data.children
.[].data.modhash
.[].kind
# If depth<0 then select paths of length equal to -depth
def paths_synopsis(depth):
  [ paths
    | if depth > 0 then select(length <= depth)
      elif (depth < 0) then select(length == -depth)
      else . end
    | [.[]|if type=="number" then "[]" else . end]]
  | unique
  | .[]
  | "." + join(".")
  ;
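A small run of this filter can be sketched as follows. The input document is hypothetical, chosen only so that its keys match the sample output shown earlier, and the definition is repeated inside the snippet purely so that it runs on its own.

```shell
# Hypothetical sample input, shaped to match the sample output above.
doc='[{"kind":"t1","data":{"modhash":"","children":[1,2]}}]'

synopsis=$(printf '%s' "$doc" | jq -r '
  # Same definition as above, repeated so that this snippet is self-contained.
  def paths_synopsis(depth):
    [ paths
      | if depth > 0 then select(length <= depth)
        elif (depth < 0) then select(length == -depth)
        else . end
      | [.[] | if type == "number" then "[]" else . end] ]
    | unique
    | .[]
    | "." + join(".");
  # Keep only paths of length 3 or less.
  paths_synopsis(3)
')
printf '%s\n' "$synopsis"
```

With a depth of 3 the two array elements under children are folded away, so the listing stops at .[].data.children; passing -3 instead would keep only the paths of length exactly 3.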
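jq has a streaming parser intended for very large JSON entities.
The following filter is meant to be used with the jq streaming parser (jq --stream) in a pipeline whose second component removes duplicate synopses, as in this example:
jq --arg depth 0 -c --stream -f synopsis.jq input.json | sort -u
In the following formulation, the desired DEPTH limit must be specified on the command line. Specifying 0 means there is no limit.
synopsis.jq:
# Usage: jq --arg depth DEPTH -c --stream -f synopsis.jq input.json | sort -u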
# or: jq --arg depth DEPTH -c --stream -f synopsis.jq input.json | jq -s -c unique[]
def synopsis(depth):
  select(length == 2)
  | .[0]
  | if depth > 0 then select(length <= depth)
    elif (depth < 0) then select(length == -depth)
    else . end
  | map( if type=="number" then [] else . end) ;
synopsis( $depth | if . then tonumber else 0 end )
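To see what this filter works on, it can help to look at the raw events that jq --stream emits for a tiny document. The sketch below uses a made-up input and inlines the core of the filter without the depth handling; LC_ALL=C is added here only to make the sort order reproducible and is not part of the original pipeline.

```shell
# Hypothetical sample document, for illustration only.
doc='{"a":[1,{"b":2}]}'

# Raw streaming events: [path, leaf] pairs, plus one-element closing events.
events=$(printf '%s' "$doc" | jq -c --stream '.')
printf '%s\n' "$events"

# Keep the [path, leaf] events, take the path, and replace array indices by [].
synopses=$(printf '%s' "$doc" \
  | jq -c --stream 'select(length == 2) | .[0] | map(if type == "number" then [] else . end)' \
  | LC_ALL=C sort -u)
printf '%s\n' "$synopses"
```

Because each event is handled independently, the filter itself cannot remove duplicates, which is why the pipeline ends with sort -u (or with a second jq using unique).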
curl -Ss 'http://forecast.weather.gov/MapClick.php?FcstType=json&lat=39.56&lon=-104.85' |
jq --arg depth 0 -c --stream -f synopsis.jq |
sort -u | head -n 50
["creationDate"]
["creationDateLocal"]
["credit"]
["currentobservation","Altimeter"]
["currentobservation","Date"]
["currentobservation","Dewp"]
["currentobservation","Gust"]
["currentobservation","Relh"]
["currentobservation","SLP"]
["currentobservation","Temp"]
["currentobservation","Visibility"]
["currentobservation","Weather"]
["currentobservation","Weatherimage"]
["currentobservation","WindChill"]
["currentobservation","Windd"]
["currentobservation","Winds"]
["currentobservation","elev"]
["currentobservation","id"]
["currentobservation","latitude"]
["currentobservation","longitude"]
["currentobservation","name"]
["currentobservation","state"]
["currentobservation","timezone"]
["data","hazard",[]]
["data","hazardUrl",[]]