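First post! I'm converting JSON data (dictionaries) from a server into a CSV file. The keys and values come through fine, except for the nested "Astronauts" entry (an array). Basically, each JSON string is one record that may contain anywhere from 0 to an unlimited number of astronauts, and I'd like to extract each of them as independent values. For example, like this:
and so on. The problem is that the nested entry is an array rather than a dictionary, so I don't know how to handle it. I've tried the dpath library as well as flattening the nest, but nothing changed. Any ideas?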
import json
import os
import csv
import datetime
import dpath.util #Dpath library needs to be installed first
datum = {"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}
#Parsing process
parsed = json.loads(datum) #datum is the JSON string retrieved from the server
def flattenjson(parsed, delim):
    val = {}
    for i in parsed.keys():
        if isinstance(parsed[i], dict):
            get = flattenjson(parsed[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = parsed[i]
    return val
flattened = flattenjson(parsed,"__")
#process of creating csv file
keys=['Astronaut1_Spaceship_First','Astronaut2_Spaceship_Second', 'Astronaut1_Name'] #reduced to 3 keys for this example
writer = csv.DictWriter(OD, keys ,restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect= "excel")
writer.writerow(flattened)
#JSON DATA FROM SERVER
{
    "Mission": "Make Earth Greater Again",
    "Objective": "Prove Earth is flat",
    "Astronauts": [
        {
            "Spaceships": {
                "First": "Katabom",
                "Second": "The Kraken"
            },
            "Name": "Jebeddiah",
            "Gender": "Hopefully male",
            "Age": 35,
            "Prefered colleages": [],
            "Following missions": [
                {
                    "Payment_status": "TO BE CONFIRMED"
                }
            ]
        },
        {
            "Spaceships": {
                "First": "The Kraken",
                "Second": "Minnus I"
            },
            "Name": "Bob",
            "Gender": "Hopefully female",
            "Age": 23,
            "Prefered colleages": [],
            "Following missions": [
                {
                    "Payment_status": "TO BE CONFIRMED"
                }
            ]
        }
    ]
}
Answer 0 (score: 0)
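First, the data you've defined here is not what you would get from the server. Data from the server would be a string, whereas the datum in your program is already a parsed dictionary, so json.loads(datum) would raise a TypeError. Now, assuming the data is: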
datum = '{"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}'
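As a quick illustration of that difference, here is a minimal sketch using only the standard library. The short `raw` string is a made-up stand-in for the server response, not the real payload: `json.loads` turns a string into a dictionary, and handing it an already-parsed dictionary fails.

```python
import json

# Hypothetical, shortened stand-in for the string a server would return.
raw = '{"Mission": "Make Earth Greater Again", "Astronauts": [{"Name": "Bob"}]}'

parsed = json.loads(raw)                 # str -> dict
print(type(parsed).__name__)             # dict
print(parsed["Astronauts"][0]["Name"])   # Bob

# Passing the already-parsed dict back to json.loads raises TypeError.
already_parsed_error = None
try:
    json.loads(parsed)
except TypeError as exc:
    already_parsed_error = exc
    print("TypeError:", exc)
```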
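You don't need the dpath library. The problem is that your JSON flattener doesn't handle embedded lists. Try the one below. Assuming you want a single row in the CSV file: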
import csv
import json

def flattenjson(data, delim, topname=''):
    """JSON flattener that can handle embedded lists and dictionaries"""
    flattened = {}

    def internalflat(int_data, name=topname):
        if isinstance(int_data, dict):
            for key in int_data:
                internalflat(int_data[key], name + key + delim)
        elif isinstance(int_data, list):
            # Number list elements from 1: Astronaut1, Astronaut2, ...
            for i, elem in enumerate(int_data, start=1):
                internalflat(elem, name + str(i) + delim)
        else:
            # Leaf value: strip the trailing delimiter from the accumulated name
            flattened[name[:-len(delim)]] = int_data

    internalflat(data)
    return flattened

#If you don't want mission or objective in csv file
flattened_astronauts = flattenjson(json.loads(datum)["Astronauts"], "__", "Astronaut")
# list.sort() returns None (and dict.keys() has no sort() in Python 3), so use sorted()
keys = sorted(flattened_astronauts)
# OD is the output file object from the question's code (not defined here)
writer = csv.DictWriter(OD, keys, restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect="excel")
writer.writerow(flattened_astronauts)
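To see what the resulting column names look like, here is a small self-contained sketch. It rests on several assumptions that are not in the code above: a shortened sample payload, a compact recursive helper named `flatten` (a hypothetical restatement of the same idea, not the function above verbatim), and an in-memory `io.StringIO` buffer standing in for the real output file `OD`. The `writeheader()` call is added only so the column names are visible.

```python
import csv
import io
import json

def flatten(data, delim, prefix=""):
    """Hypothetical compact flattener: dict keys and 1-based list indexes are joined by delim."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = ((str(i), value) for i, value in enumerate(data, start=1))
    else:
        # Leaf value: drop the trailing delimiter from the accumulated name.
        return {prefix[:-len(delim)]: data}
    flat = {}
    for key, value in items:
        flat.update(flatten(value, delim, prefix + key + delim))
    return flat

# Shortened, made-up sample in the same shape as the server data.
raw = ('{"Astronauts": ['
       '{"Name": "Jebeddiah", "Spaceships": {"First": "Katabom"}, "Prefered colleages": []}, '
       '{"Name": "Bob", "Spaceships": {"First": "The Kraken"}, "Prefered colleages": []}]}')

row = flatten(json.loads(raw)["Astronauts"], "__", "Astronaut")
keys = sorted(row)

buffer = io.StringIO()  # stands in for the real output file
writer = csv.DictWriter(buffer, keys, restval="Null", quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

Note that empty lists such as "Prefered colleages" produce no column at all, because there is no leaf value to record. If different records have different numbers of astronauts, the set of keys will differ per record, so you would probably want to collect the union of all keys before creating the writer and let `restval` fill the gaps.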