I am trying to convert a file of nested JSON objects to CSV. Here is a sample of the JSON:
{
    "total_hosts" : [
        {
            "TYPE" : "AGENT",
            "COUNT" : 6
        }
    ],
    "installed" : [
        {
            "ID" : "admin-4.0",
            "VERSION" : 4,
            "ADDON_NAME" : "Administration"
        },
        {
            "ID" : "admin-2.0",
            "VERSION" : 2,
            "ADDON_NAME" : "Administration"
        },
        {
            "ID" : "ch-5.0",
            "VERSION" : "5",
            "ADDON_NAME" : "Control Host"
        }
    ],
    "virtual_machine" : [
        {
            "COUNT" : 4,
            "TYPE" : "VM"
        }
    ]
}
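TYPE, COUNT, ID and VERSION are all columns. The problem is that the lists do not all have the same number of objects: some have one object with these values and some have more. I am writing rows, so I want to write a blank wherever a column has no value.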
The code that writes it to CSV:
import csv
import json

json_input = open('all.json')
try:
    decoded = json.load(json_input)
    # tell computer where to put CSV
    outfile_path = 'Path to CSV'
    # open it up, the w means we will write to it
    writer = csv.writer(open(outfile_path, 'w'))
    for index in range(len(decoded['installed'])):
        row = []
        if decoded['total_hosts'][index]['TYPE'] is None:
            row.append(str(''))
        else:
            row.append(str(decoded['total_hosts'][index]['TYPE']))
        if decoded['total_hosts'][index]['COUNT'] is None:
            row.append(str(''))
        else:
            row.append(str(decoded['total_hosts'][index]['COUNT']))
        writer.writerow(row)
finally:
    json_input.close()
I am getting an Index out of range error. I even tried True/False in the if condition.
Can anyone help me?
Update: expected output:
TYPE,COUNT,ID,VERSION,ADDON_NAME,COUNT,TYPE
AGENT,6,admin-4.0,4,Administration,4,VM
, ,admin-2.0,2,Administration, ,
, ,ch-5.0,5,Control Host, ,
So basically I need blanks wherever a column has no value.
Question edited. Current output:
AGENT,6,,,
, ,admin-4.0,4,Administration
, ,admin-2.0,2,Administration
, ,ch-5.0,5,Control Host
Expected output:
AGENT,6,admin-4.0,4,Administration
, ,admin-2.0,2,Administration
, ,ch-5.0,5,Control Host
Update: I even tried
row.append(str(entry.get('TYPE', '')))
row.append(str(entry.get('COUNT', '')))
row.append(str(entry.get('ID', '')))
row.append(str(entry.get('VERSION', '')))
row.append(str(entry.get('ADDON_NAME', '')))
writer.writerow(row)
I still get the same output as above. :(
Answer 0 (score: 2)
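There are two errors here:
First, you generate an index from the length of decoded['installed'] and then use it on the decoded['total_hosts'] list. This raises an IndexError, because decoded['total_hosts'] does not have that many entries.
Second, accessing a key that does not exist raises a KeyError; use the dict.get() method instead to retrieve the value or a default.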
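Both failure modes can be reproduced on a trimmed copy of the question's data. This is only a minimal sketch, assuming Python 3; the decoded dict below is a hand-written stand-in for the parsed all.json, not the real file.

```python
# Hand-trimmed stand-in for the parsed JSON (an assumption for illustration).
decoded = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}],
    "installed": [
        {"ID": "admin-4.0", "VERSION": 4, "ADDON_NAME": "Administration"},
        {"ID": "admin-2.0", "VERSION": 2, "ADDON_NAME": "Administration"},
    ],
}

errors = []

# Error 1: index 1 exists in 'installed' but not in 'total_hosts'.
try:
    decoded["total_hosts"][1]
except IndexError as exc:
    errors.append(type(exc).__name__)

# Error 2: entries of 'installed' have no 'TYPE' key at all.
try:
    decoded["installed"][0]["TYPE"]
except KeyError as exc:
    errors.append(type(exc).__name__)

# dict.get() returns the supplied default instead of raising.
fallback = decoded["installed"][0].get("TYPE", "")

print(errors)
print(repr(fallback))
```

Note that the `is None` checks in the question never get a chance to run: the lookup itself raises before any comparison happens.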
It is much simpler to loop directly over the list; there is no need to generate an index:
for host in decoded['total_hosts']:
    row = [host.get('TYPE', ''), host.get('COUNT', '')]
    writer.writerow(row)
You can extend this to handle multiple keys:
for key in ('total_hosts', 'installed', 'virtual_machine'):
    for entry in decoded[key]:
        row = [entry.get('TYPE', ''), entry.get('COUNT', '')]
        writer.writerow(row)
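It may help to see what this loop actually writes: one row per entry, with the lists stacked one after another rather than placed side by side, which is the same shape as the edited output in the question. A minimal sketch, assuming Python 3 and writing into an in-memory io.StringIO buffer instead of a file so the result can be inspected; the data is copied from the question's sample.

```python
import csv
import io

# Sample data copied from the question.
decoded = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}],
    "installed": [
        {"ID": "admin-4.0", "VERSION": 4, "ADDON_NAME": "Administration"},
        {"ID": "admin-2.0", "VERSION": 2, "ADDON_NAME": "Administration"},
        {"ID": "ch-5.0", "VERSION": "5", "ADDON_NAME": "Control Host"},
    ],
    "virtual_machine": [{"COUNT": 4, "TYPE": "VM"}],
}

# In-memory buffer stands in for the output file (an assumption for the demo).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

for key in ("total_hosts", "installed", "virtual_machine"):
    for entry in decoded[key]:
        # Missing keys fall back to an empty string instead of raising KeyError.
        row = [entry.get("TYPE", ""), entry.get("COUNT", "")]
        writer.writerow(row)

stacked = buffer.getvalue()
print(stacked)
```

The three installed entries each produce a row of two empty cells, because they have neither TYPE nor COUNT.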
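If you need to combine the entries of the lists side by side in one row, use itertools.izip_longest() to pair up the lists, with a default value filling in once the shorter lists run out: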
from itertools import izip_longest  # Python 2; on Python 3 import zip_longest instead
for t, i, v in izip_longest(decoded['total_hosts'], decoded['installed'],
                            decoded['virtual_machine'], fillvalue={}):
    # each of t, i, v is either a real entry or the empty-dict fill value
    row = [t.get('TYPE', ''), t.get('COUNT', ''),
           i.get('ID', ''), i.get('VERSION', ''), i.get('ADDON_NAME', ''),
           v.get('COUNT', ''), v.get('TYPE', '')]
    writer.writerow(row)
This allows any of the three lists to be shorter than the others.
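For reference, here is a self-contained sketch of the same pairing idea on Python 3, where the function is named itertools.zip_longest. It assumes the sample data from the question and writes to an in-memory buffer rather than a real file; the header row is taken from the question's expected output.

```python
import csv
import io
from itertools import zip_longest  # Python 3 name of izip_longest

# Sample data copied from the question.
decoded = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}],
    "installed": [
        {"ID": "admin-4.0", "VERSION": 4, "ADDON_NAME": "Administration"},
        {"ID": "admin-2.0", "VERSION": 2, "ADDON_NAME": "Administration"},
        {"ID": "ch-5.0", "VERSION": "5", "ADDON_NAME": "Control Host"},
    ],
    "virtual_machine": [{"COUNT": 4, "TYPE": "VM"}],
}

# In-memory buffer stands in for the output file (an assumption for the demo).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["TYPE", "COUNT", "ID", "VERSION", "ADDON_NAME", "COUNT", "TYPE"])

# fillvalue={} means an exhausted list yields empty dicts, so .get() returns ''.
for t, i, v in zip_longest(decoded["total_hosts"], decoded["installed"],
                           decoded["virtual_machine"], fillvalue={}):
    writer.writerow([
        t.get("TYPE", ""), t.get("COUNT", ""),
        i.get("ID", ""), i.get("VERSION", ""), i.get("ADDON_NAME", ""),
        v.get("COUNT", ""), v.get("TYPE", ""),
    ])

merged = buffer.getvalue()
print(merged)
```

The missing cells come out as empty fields (`,,`) rather than the single space shown in the question's expected output; pass `' '` as the default to get() if a literal space is really wanted.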
For Python versions before 2.6 (when itertools.izip_longest was added), you have to assume that installed is always the longest list, and then use:
for i, installed in enumerate(decoded['installed']):
    # fall back to an empty dict once the shorter lists run out
    t = decoded['total_hosts'][i] if i < len(decoded['total_hosts']) else {}
    v = decoded['virtual_machine'][i] if i < len(decoded['virtual_machine']) else {}
    row = [t.get('TYPE', ''), t.get('COUNT', ''),
           installed['ID'], installed['VERSION'], installed['ADDON_NAME'],
           v.get('COUNT', ''), v.get('TYPE', '')]
    writer.writerow(row)
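If installed cannot be assumed to be the longest list, the same index-based padding can be applied to all three lists by looping up to the length of whichever is longest. This is a sketch of that variation rather than part of the original answer; entry_at is a hypothetical helper name, and the data is the sample from the question.

```python
def entry_at(entries, index):
    """Return entries[index], or an empty dict when index is past the end."""
    return entries[index] if index < len(entries) else {}


# Sample data copied from the question.
decoded = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}],
    "installed": [
        {"ID": "admin-4.0", "VERSION": 4, "ADDON_NAME": "Administration"},
        {"ID": "admin-2.0", "VERSION": 2, "ADDON_NAME": "Administration"},
        {"ID": "ch-5.0", "VERSION": "5", "ADDON_NAME": "Control Host"},
    ],
    "virtual_machine": [{"COUNT": 4, "TYPE": "VM"}],
}

lists = [decoded["total_hosts"], decoded["installed"], decoded["virtual_machine"]]
longest = max(len(entries) for entries in lists)

rows = []
for index in range(longest):
    # Each lookup yields a real entry or {} if that list is already exhausted.
    t, i, v = (entry_at(entries, index) for entries in lists)
    rows.append([
        t.get("TYPE", ""), t.get("COUNT", ""),
        i.get("ID", ""), i.get("VERSION", ""), i.get("ADDON_NAME", ""),
        v.get("COUNT", ""), v.get("TYPE", ""),
    ])

for row in rows:
    print(row)
```

Each row can then be passed to writer.writerow() exactly as in the loops above.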