JSON to CSV conversion problem in Python

Asked: 2014-04-14 14:31:02

Tags: python arrays json object csv

I am trying to convert a file of nested JSON objects to CSV. Here is a sample of the JSON:

{
   "total_hosts" : [
      {
         "TYPE" : "AGENT",
         "COUNT" : 6
      }
   ],
   "installed" : [
      {
         "ID" : "admin-4.0",
         "VERSION" : 4,
         "ADDON_NAME" : "Administration"
      },
      {
         "ID" : "admin-2.0",
         "VERSION" : 2,
         "ADDON_NAME" : "Administration"
      },
      {
         "ID" : "ch-5.0",
         "VERSION" : "5",
         "ADDON_NAME" : "Control Host"
      }
   ],
   "virtual_machine" : [
      {
         "COUNT" : 4,
         "TYPE" : "VM"
      }

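TYPE, COUNT, ID and VERSION are all columns. The problem is that not every object has every value: some lists hold a single object with these values and others hold more. I am writing rows, so I want to write a blank whenever a column has no value.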

The code that writes it to CSV:

import csv
import json

json_input = open('all.json')
try:
    decoded = json.load(json_input)
    # tell computer where to put CSV
    outfile_path = 'Path to CSV'
    # open it up, the w means we will write to it
    writer = csv.writer(open(outfile_path, 'w'))

    for index in range(len(decoded['installed'])):
        row = []

        if decoded['total_hosts'][index]['TYPE'] is None:
            row.append(str(''))
        else:
            row.append(str(decoded['total_hosts'][index]['TYPE']))
        if decoded['total_hosts'][index]['COUNT'] is None:
            row.append(str(''))
        else:
            row.append(str(decoded['total_hosts'][index]['COUNT']))

        writer.writerow(row)
finally:
    json_input.close()

I get an "Index out of range" error. I even tried True/False if-conditions.

Can anyone help me?

Update: expected output:

TYPE,COUNT,ID,VERSION,ADDON_NAME,COUNT,TYPE
AGENT,6,admin-4.0,4,Administration,4,VM
 , ,admin-2.0,2,Administration, , 
 , ,ch-5.0,5,Control Host, , 

So basically I need a blank when a column has no value.

Question edited. Current output:

AGENT,6,,,
 , ,admin-4.0,4,Administration
 , ,admin-2.0,2,Administration
 , ,ch-5.0,5,Control Host

Expected output:

AGENT,6,admin-4.0,4,Administration
 , ,admin-2.0,2,Administration
 , ,ch-5.0,5,Control Host

Update: I even tried

            row.append(str(entry.get('TYPE', '')))
            row.append(str(entry.get('COUNT', '')))
            row.append(str(entry.get('ID', '')))
            row.append(str(entry.get('VERSION', '')))
            row.append(str(entry.get('ADDON_NAME', '')))
            writer.writerow(row)

I still get the same output as above. :(

1 answer:

Answer 0 (score: 2)

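There are two errors here, plus one simplification:

  1. You generate an index from the length of decoded['installed'] and then use it on the decoded['total_hosts'] list. This raises an IndexError because decoded['total_hosts'] does not have that many entries.

  2. Accessing a key that does not exist raises a KeyError; use the dict.get() method instead to retrieve the value or a default.

  3. It is much simpler to loop over the list directly, with no need to generate an index: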

    for host in decoded['total_hosts']:
        row = [host.get('TYPE', ''), host.get('COUNT', '')]
        writer.writerow(row)
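
As a small illustration of both errors, the sketch below uses a hand-typed subset of the sample data (not loaded from all.json). The helper name `safe_row` is made up for this example and is not part of the original code.

```python
# Hand-typed subset of the sample JSON; in the question this comes from json.load().
decoded = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}],
    "installed": [
        {"ID": "admin-4.0", "VERSION": 4, "ADDON_NAME": "Administration"},
        {"ID": "admin-2.0", "VERSION": 2, "ADDON_NAME": "Administration"},
    ],
}

# Error 1: an index derived from 'installed' (2 entries) is too large
# for 'total_hosts' (1 entry), so the lookup raises IndexError.
try:
    decoded["total_hosts"][1]
    index_error = False
except IndexError:
    index_error = True

# Error 2: subscripting a missing key raises KeyError rather than returning
# None, so an "is None" check never gets a chance to run.
try:
    decoded["installed"][0]["TYPE"]
    key_error = False
except KeyError:
    key_error = True


def safe_row(entry, columns):
    """Hypothetical helper: build one row, using '' for any missing column."""
    return [entry.get(column, "") for column in columns]


print(index_error, key_error)
print(safe_row(decoded["installed"][0], ("TYPE", "COUNT", "ID")))
```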
    

    You can extend this to handle multiple keys:

    for key in ('total_hosts', 'installed', 'virtual_machine'):
        for entry in decoded[key]:
            row = [entry.get('TYPE', ''), entry.get('COUNT', '')]
            writer.writerow(row)
    

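    If you need to combine the output of several entries into one row, use itertools.izip_longest() to pair up the lists; it supplies a default value once the shorter lists run out: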

    from itertools import izip_longest
    
    # fillvalue={} stands in for a missing entry, so .get() returns '' for it
    for t, i, v in izip_longest(decoded['total_hosts'], decoded['installed'],
                                decoded['virtual_machine'], fillvalue={}):
        row = [t.get('TYPE', ''), t.get('COUNT', ''),
               i.get('ID', ''), i.get('VERSION', ''), i.get('ADDON_NAME', ''),
               v.get('COUNT', ''), v.get('TYPE', '')]
        writer.writerow(row)
    

    This allows any of the three lists to be shorter than the others.
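
On Python 3 the same function is spelled itertools.zip_longest. The following is a minimal sketch of the combined loop under that assumption. It inlines the sample data and writes to an in-memory buffer instead of a file so that it can be run as-is; the names `SAMPLE`, `combine_rows` and `to_csv_text` are invented for illustration.

```python
import csv
import io
from itertools import zip_longest  # Python 3 name for izip_longest

# Sample data typed in from the question's JSON.
SAMPLE = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}],
    "installed": [
        {"ID": "admin-4.0", "VERSION": 4, "ADDON_NAME": "Administration"},
        {"ID": "admin-2.0", "VERSION": 2, "ADDON_NAME": "Administration"},
        {"ID": "ch-5.0", "VERSION": "5", "ADDON_NAME": "Control Host"},
    ],
    "virtual_machine": [{"COUNT": 4, "TYPE": "VM"}],
}


def combine_rows(decoded):
    """Pair the three lists up; shorter lists are padded with empty dicts."""
    rows = []
    for t, i, v in zip_longest(decoded["total_hosts"],
                               decoded["installed"],
                               decoded["virtual_machine"],
                               fillvalue={}):
        rows.append([t.get("TYPE", ""), t.get("COUNT", ""),
                     i.get("ID", ""), i.get("VERSION", ""), i.get("ADDON_NAME", ""),
                     v.get("COUNT", ""), v.get("TYPE", "")])
    return rows


def to_csv_text(decoded):
    """Write the header and combined rows to a string instead of a file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["TYPE", "COUNT", "ID", "VERSION", "ADDON_NAME", "COUNT", "TYPE"])
    writer.writerows(combine_rows(decoded))
    return buffer.getvalue()


print(to_csv_text(SAMPLE))
```

Missing cells come out as empty fields here rather than as a single space, which is how most CSV readers expect a missing value to look.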

    For Python versions before 2.6 (which added itertools.izip_longest), you have to assume that installed is always the longest list, and then use:

    # enumerate() supplies the index used to look up the other two lists
    for i, installed in enumerate(decoded['installed']):
        t = decoded['total_hosts'][i] if i < len(decoded['total_hosts']) else {}
        v = decoded['virtual_machine'][i] if i < len(decoded['virtual_machine']) else {}
        row = [t.get('TYPE', ''), t.get('COUNT', ''),
               installed['ID'], installed['VERSION'], installed['ADDON_NAME'],
               v.get('COUNT', ''), v.get('TYPE', '')]
        writer.writerow(row)
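
If it is not safe to assume which list is the longest, one option is to loop up to the length of the longest list and fall back to an empty dict by index. This is only a sketch, not part of the original answer: `entry_at` and `padded_rows` are hypothetical names, the key names are taken from the sample JSON, and the "PROXY" entry in the usage example is made-up data.

```python
def entry_at(entries, index):
    """Return entries[index], or an empty dict when the list is too short."""
    return entries[index] if index < len(entries) else {}


def padded_rows(decoded):
    """Build combined rows without assuming which list is the longest."""
    hosts = decoded.get("total_hosts", [])
    installed = decoded.get("installed", [])
    machines = decoded.get("virtual_machine", [])
    longest = max(len(hosts), len(installed), len(machines))

    rows = []
    for index in range(longest):
        t = entry_at(hosts, index)
        i = entry_at(installed, index)
        v = entry_at(machines, index)
        rows.append([t.get("TYPE", ""), t.get("COUNT", ""),
                     i.get("ID", ""), i.get("VERSION", ""), i.get("ADDON_NAME", ""),
                     v.get("COUNT", ""), v.get("TYPE", "")])
    return rows


# Usage: here total_hosts is longer than installed, and virtual_machine is absent.
example = {
    "total_hosts": [{"TYPE": "AGENT", "COUNT": 6}, {"TYPE": "PROXY", "COUNT": 1}],
    "installed": [{"ID": "ch-5.0", "VERSION": "5", "ADDON_NAME": "Control Host"}],
}
for row in padded_rows(example):
    print(row)
```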