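I am querying a company's internal API for an anti-corruption investigation, and I get nested JSON results, which can be seen here. I want to convert this dictionary into a simple {key:value, key:value} format; where there are nested objects or lists, their keys should be merged into the flattened key string.
A further problem is that some items returned by the API do not necessarily have all of the key:value pairs, because some of them are optional. Where a key:value pair is missing, I want to insert an NA.
This is the most complete JSON; some query results may not have all of these entries.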
{
"items" : [
{
"address" : {
"address_line_1" : "string",
"address_line_2" : "string",
"care_of" : "string",
"country" : "string",
"locality" : "string",
"po_box" : "string",
"postal_code" : "string",
"premises" : "string",
"region" : "string"
},
"address_snippet" : "string",
"appointment_count" : "integer",
"date_of_birth" : {
"month" : "integer",
"year" : "integer"
},
"description" : "string",
"description_identifiers" : [
"integer"
],
"kind" : "string",
"links" : {
"self" : "string"
},
"matches" : [
{
"address_snippet" : [
"integer"
],
"snippet" : [
"integer"
],
"title" : [
"integer"
]
}
],
"snippet" : "string",
"title" : "string"
}
],
"items_per_page" : "integer",
"kind" : "string",
"start_index" : "integer",
"total_results" : "integer"
}
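Reusing some old jq code, I managed to create two lists, one containing all the keys and one containing all the values (see jqplay here).
Here is an example of just a small part of the dictionary, to give you an idea: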
{
"items_address_address_line_1" : "string",
"items_address_address_line_2" : "string",
"items_address_care_of" : "string",
"items_address_country" : "string",
"items_address_locality" : "string",
"items_address_po_box" : "string",
"items_address_postal_code" : "string",
"items_address_premises" : "string",
"items_address_region" : "string"
}
Answer 0 (score: 0)
Assuming the items array always has exactly one element, use the --stream option:
reduce (inputs|select(length == 2)) as $p
({}; .[$p[0]|map(strings)|join("_")] = $p[1])
Because inputs is used, the -n option is also required.
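The jq filter above joins only the string parts of each path, so array indices never appear in the key. For readers who prefer Python, the same idea can be sketched roughly as below, together with the NA fill the question asks for. The names flatten and flatten_with_defaults, and the sample values, are hypothetical and used only for illustration; the sketch assumes the most complete JSON is available as a template.

```python
from typing import Any


def flatten(obj: Any, prefix: tuple = ()) -> dict:
    """Flatten nested dicts/lists into {"a_b_c": leaf}, leaving list indices out of the key."""
    if isinstance(obj, dict):
        pairs = obj.items()
    elif isinstance(obj, list):
        # List elements reuse the parent's prefix, so indices are dropped
        # (similar in spirit to map(strings) in the jq filter).
        pairs = ((None, value) for value in obj)
    else:
        return {"_".join(prefix): obj}
    out = {}
    for key, value in pairs:
        child = prefix if key is None else prefix + (key,)
        out.update(flatten(value, child))
    return out


def flatten_with_defaults(template: dict, result: dict, missing: str = "NA") -> dict:
    """Flatten result, emitting every key found in template; absent keys get `missing`."""
    flat = flatten(result)
    return {key: flat.get(key, missing) for key in flatten(template)}


# Hypothetical, shortened template and API result (not real data).
template = {
    "items": [{"address": {"country": "string", "locality": "string"}, "title": "string"}],
    "kind": "string",
}
result = {"items": [{"address": {"country": "GB"}, "title": "Jane Doe"}]}

row = flatten_with_defaults(template, result)
print(row)
```

As with the jq filter, this only makes sense when each list holds a single element: with several elements, later values overwrite earlier ones under the same key.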
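Answer 1 (score: 0)
You can use pandas, specifically json_normalize:
from pandas import json_normalize  # pandas >= 1.0; on older versions use: from pandas.io.json import json_normalize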
d = {
"items" : [
{
"address" : {
"address_line_1" : "string",
"address_line_2" : "string",
"care_of" : "string",
"country" : "string",
"locality" : "string",
"po_box" : "string",
"postal_code" : "string",
"premises" : "string",
"region" : "string"
},
"address_snippet" : "string",
"appointment_count" : "integer",
"date_of_birth" : {
"month" : "integer",
"year" : "integer"
},
"description" : "string",
"description_identifiers" : [
"integer"
],
"kind" : "string",
"links" : {
"self" : "string"
},
"matches" : [
{
"address_snippet" : [
"integer"
],
"snippet" : [
"integer"
],
"title" : [
"integer"
]
}
],
"snippet" : "string",
"title" : "string"
}
],
"items_per_page" : "integer",
"kind" : "string",
"start_index" : "integer",
"total_results" : "integer"
}
x = json_normalize(d['items'], sep="_")
print(x.to_string())
# print(x.keys()) # handy, as you may get "lost" with many keys
# x.to_dict()  # convert the flattened frame back to a dictionary keyed by column
address_address_line_1 address_address_line_2 address_care_of address_country address_locality address_po_box address_postal_code address_premises address_region address_snippet appointment_count date_of_birth_month date_of_birth_year description description_identifiers kind links_self matches snippet title
0 string string string string string string string string string string integer integer integer string [integer] string string [{'address_snippet': ['integer'], 'snippet': [... string string
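Note:
Nested list elements (such as matches in the output above) are not flattened by this single call. Use json_normalize again to flatten the nested elements (lists), combine the result into a master_df, and flatten all the keys that way.
Hope that makes sense to you; otherwise, please leave a comment.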
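One way the second pass described in the note might look is sketched below. It assumes each item has exactly one element in matches, so the two frames line up row by row; the shortened items data and the expected column list are made up for illustration. The final reindex adds any expected column the API left out and fills it with "NA", which covers the optional keys mentioned in the question.

```python
import pandas as pd

# Hypothetical, shortened API result: address_locality is absent on purpose.
items = [{
    "title": "string",
    "address": {"country": "string"},
    "matches": [{"snippet": [1, 2], "title": [3, 4]}],
}]

# First pass: one row per item; nested dicts become address_country, etc.
master_df = pd.json_normalize(items, sep="_")

# Second pass: one row per element of each item's "matches" list.
matches_df = pd.json_normalize(items, record_path="matches", record_prefix="matches_")

# Replace the unflattened "matches" column with the flattened columns.
# The index-based join is only valid under the one-match-per-item assumption.
master_df = master_df.drop(columns="matches").join(matches_df)

# Add every expected column that is missing and fill it with "NA".
expected = ["title", "address_country", "address_locality",
            "matches_snippet", "matches_title"]
master_df = master_df.reindex(columns=expected, fill_value="NA")

print(master_df.to_string())
```

If items can carry several matches, the meta argument of json_normalize is one way to carry the parent fields along instead of joining on the index.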