How to convert a JSON string to table format in CSV using Python code or a library?

Time: 2017-06-19 08:30:31

Tags: python json csv

The JSON I am working with:

    {
      "lastMonth":{
        "aotalCost":4120.6241,
        "cpu_avg_partition":{
          "10":{
            "count":1,
            "cost":0.0,
            "value":10
          },
          "100":{
            "count":0,
            "cost":0.0,
            "value":100
          },
          "5":{
            "count":16,
            "cost":0.0,
            "value":5
          },
          "50":{
            "count":0,
            "cost":0.0,
            "value":50
          },
          "90":{
            "count":0,
            "cost":0.0,
            "value":90
          }
        },
        "providers":{
          "azure":{
            "cost":1844.134
          },
          "aws":{
            "cost":2276.4901
          }
        },
        "period":"lastMonth",
        "resourceTypes":{
          "db":{
            "cost":842.8003
          },
          "bucket":{
            "cost":362.2997
          },
          "server":{
            "cost":976.9376
          },
          "volume":{
            "cost":349.5705
          },
          "instance":{
            "cost":868.4003
          },
          "others":{
            "cost":199.1993
          },
          "null":{
            "cost":521.4163
          }
        },
        "accounts":{
          "188f6226-59d8-4a7c-9fdb-f32f4f0885ff":{
            "cost":19.3469
          },
          "19d2b0d2-d947-4e81-96aa-3b7f9583178a":{
            "cost":543.6707
          },
          "c658a818-b96e-48b4-8a2e-c77404c58af6":{
            "cost":1281.1164
          },
          "628455167342":{
            "cost":2276.4901
          }
        },
        "running_hours_partition":{
          "60":{
            "count":0,
            "cost":0.0,
            "value":60
          },
          "80":{
            "count":0,
            "cost":0.0,
            "value":80
          },
          "20":{
            "count":0,
            "cost":0.0,
            "value":20
          },
          "100":{
            "count":5,
            "cost":1008.16,
            "value":100
          },
          "40":{
            "count":0,
            "cost":0.0,
            "value":40
          }
        },
        "projects":{
          "813":{
            "cost":4120.6241
          }
        }
      },
      "currency_code":"GBP",
      "idmap":{
        "accounts":{
          "188f6226-59d8-4a7c-9fdb-f32f4f0885ff":"AzureForPM",
          "19d2b0d2-d947-4e81-96aa-3b7f9583178a":"AzureForPM",
          "c658a818-b96e-48b4-8a2e-c77404c58af6":"AzureForPM",
          "628455167342":"628455167342"
        },
        "projects":{
          "813":"ACP PM Work"
        }
      },
      "currentMonth":{
        "totalCost":1769.9801,
        "cpu_avg_partition":{
          "10":{
            "count":0,
            "cost":0.0,
            "value":10
          },
          "100":{
            "count":0,
            "cost":0.0,
            "value":100
          },
          "5":{
            "count":10,
            "cost":0.0,
            "value":5
          },
          "50":{
            "count":0,
            "cost":0.0,
            "value":50
          },
          "90":{
            "count":0,
            "cost":0.0,
            "value":90
          }
        },
        "providers":{
          "azure":{
            "cost":756.614
          },
          "aws":{
            "cost":1013.3661
          }
        },
        "period":"currentMonth",
        "resourceTypes":{
          "db":{
            "cost":376.3396
          },
          "bucket":{
            "cost":151.4477
          },
          "server":{
            "cost":389.8187
          },
          "volume":{
            "cost":152.1401
          },
          "instance":{
            "cost":369.6178
          },
          "others":{
            "cost":91.6273
          },
          "null":{
            "cost":238.9888
          }
        },
        "accounts":{
          "188f6226-59d8-4a7c-9fdb-f32f4f0885ff":{
            "cost":11.8057
          },
          "19d2b0d2-d947-4e81-96aa-3b7f9583178a":{
            "cost":224.0437
          },
          "c658a818-b96e-48b4-8a2e-c77404c58af6":{
            "cost":520.7645
          },
          "628455167342":{
            "cost":1013.3661
          }
        },
        "running_hours_partition":{
          "60":{
            "count":0,
            "cost":0.0,
            "value":60
          },
          "80":{
            "count":0,
            "cost":0.0,
            "value":80
          },
          "20":{
            "count":1,
            "cost":3.08,
            "value":20
          },
          "100":{
            "count":5,
            "cost":407.314,
            "value":100
          },
          "40":{
            "count":0,
            "cost":0.0,
            "value":40
          }
        },
        "projects":{
          "813":{
            "cost":1769.98
          }
        }
      }
    }

My Python code is shown below. I am trying to produce a CSV file:

    import json
    from pandas.io.json import json_normalize


    def loading_file():
        # File path
        file_path = 'C:/Python27/robotscripts/Analytics_new/MonthlycostCSV/monthly_costformatter_data.json'

        # Loading json file
        json_data = open(file_path)
        data = json.load(json_data)
        return data

    # Storing available keys
    def data_keys(data):
        values = {}
        for i in data["lastMonth"]:
            for k in i.str():
                values[k] = 1

        values = values.values()

        # Excluding nested arrays from keys - hard coded -> IMPROVE
        new_keys = [x for x in values if
                    x != 'lastMonth' and
                    x != 'cpu_avg_partition']

        return new_keys

    # Excluding nested arrays from json dictionary
    def new_data(data, values):
        new_data = []
        for i in range(0, len(data)):
            x = {k: v for (k, v) in data[i].items() if k in values}
            new_data.append(x)
        return new_data

    def csv_out(data):
        data.to_csv('out.csv', encoding='utf-8')

    def main():
        data_file = loading_file()
        values = data_keys(data_file)
        table = new_data(data_file, values)
        csv_out(json_normalize(table))

    main()

I get the following error:

  

    Traceback (most recent call last):
      File "C:\Python27\robotscripts\Analytics_new\MonthlycostC
        main()
      File "C:\Python27\robotscripts\Analytics_new\MonthlycostC
        values = data_keys(data_file)
      File "C:\Python27\robotscripts\Analytics_new\MonthlycostC
        for k in i.values():
    AttributeError: 'unicode' object has no attribute 'values'

I also tried another program:

    import json
    import csv

    f = open('monthly_costformatter_data.json')

    data = json.load(f)
    s = csv.writer(open('m1.csv', 'w'))

    s.writerow(["aotalCost", "cpu_avg_partition", "providers"])
    for i in data["lastMonth"]:
        s.writerow([i["aotalCost"], i["cpu_avg_partition"], i["providers"]])

Error:

  

    C:\Python27\robotscripts\Analytics_new\MonthlycostCSV>monthlycostcsv4.py
    Traceback (most recent call last):
      File "C:\Python27\robotscripts\Analytics_new\MonthlycostCSV\monthlycostcsv4.py", line 13, in <module>
        s.writerow([i["aotalCost"],i["cpu_avg_partition"],i["providers"]])
    TypeError: string indices must be integers

1 Answer:

Answer 0 (score: 0)

For the second solution, data["lastMonth"] is a dictionary, so you can look up its values directly by key. There is no need for a for loop:

import json, ast
import csv

# Load the whole JSON document into a Python dict
with open('monthly_costformatter_data.json') as f:
    data = json.load(f)

with open('m1.csv', 'w') as opfile:
    s = csv.writer(opfile)
    s.writerow(["aotalCost", "cpu_avg_partition", "providers"])
    lastmonth = data["lastMonth"]
    # Round-trip through json.dumps / ast.literal_eval so the nested dicts
    # are written with plain str keys (on Python 2 this drops the u'' prefix)
    cpu_avg_partition = ast.literal_eval(json.dumps(lastmonth["cpu_avg_partition"]))
    providers = ast.literal_eval(json.dumps(lastmonth["providers"]))
    write_list = [lastmonth["aotalCost"], cpu_avg_partition, providers]
    s.writerow(write_list)

Note that lastmonth["cpu_avg_partition"] and lastmonth["providers"] are also dictionaries.
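Both errors in the question appear to share one cause: iterating over a dictionary yields its keys, which are strings, so `i["aotalCost"]` ends up indexing into a string. A minimal sketch, assuming Python 3 and a trimmed-down copy of the data (the `sample` literal is illustrative, not the full file):

```python
import json

# Trimmed-down stand-in for the question's JSON (illustrative only)
sample = json.loads("""
{
  "lastMonth": {
    "aotalCost": 4120.6241,
    "providers": {"azure": {"cost": 1844.134}, "aws": {"cost": 2276.4901}}
  }
}
""")

last_month = sample["lastMonth"]

# Iterating a dict yields its keys (strings), not the nested values
keys = sorted(last_month)

# Index the dict directly, or use .items() to get key/value pairs
total = last_month["aotalCost"]
provider_costs = {name: info["cost"]
                  for name, info in last_month["providers"].items()}

print(keys)
print(total)
print(provider_costs)
```

Using `.items()` gives access to each nested dictionary without a second lookup.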

If you want to load the other dictionaries, for example currentMonth and idmap, you can read them straight from the loaded JSON data, just as you did for lastMonth:

currentMonth= data["currentMonth"]
idmap= data["idmap"]
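To get an actual table rather than dictionaries written into single cells, one option is to emit one row per (period, category, key). The sketch below assumes Python 3 and the structure shown in the question. The helper names `cost_rows` and `write_table` and the column names are hypothetical, and the small `data` literal stands in for the result of `json.load`.

```python
import csv
import io

def cost_rows(data, periods=("lastMonth", "currentMonth")):
    """Yield one flat row per nested entry under each period."""
    for period in periods:
        period_data = data.get(period, {})
        for category in sorted(period_data):
            entries = period_data[category]
            if not isinstance(entries, dict):
                continue  # skip scalars such as the total cost and "period"
            for key in sorted(entries):
                info = entries[key]
                if not isinstance(info, dict):
                    continue
                yield {
                    "period": period,
                    "category": category,
                    "key": key,
                    "cost": info.get("cost"),
                    "count": info.get("count", ""),
                }

def write_table(data, handle):
    """Write the flattened rows as CSV to an open text handle."""
    fields = ["period", "category", "key", "cost", "count"]
    writer = csv.DictWriter(handle, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in cost_rows(data):
        writer.writerow(row)

# Small stand-in for json.load(f) (values copied from the question)
data = {
    "lastMonth": {
        "aotalCost": 4120.6241,
        "period": "lastMonth",
        "providers": {"azure": {"cost": 1844.134}, "aws": {"cost": 2276.4901}},
    },
    "currency_code": "GBP",
}

buffer = io.StringIO()
write_table(data, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass a handle opened with `open('out.csv', 'w', newline='')`.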

Update:

With pandas, you can load the JSON file into a DataFrame df and then save df to a CSV file:

import pandas as pd

df = pd.read_json("monthly_costformatter_data.json")
new_df = df.transpose()
column_name = new_df.columns.values
print(column_name)  # print the column names
# Select the specific columns you want to write to the CSV file,
# e.g. 'totalCost' and 'cpu_avg_partition'
new_df[['totalCost', 'cpu_avg_partition']].to_csv('test.csv', index=False)
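If the nested values should become separate columns instead of staying as dictionaries inside one cell, each period can be flattened into dotted column names, similar in spirit to what `json_normalize` produces. This is a standard-library sketch assuming Python 3. The `flatten` helper and its `sep` parameter are hypothetical names, not part of pandas, and the `data` literal is an illustrative subset of the question's JSON.

```python
import csv
import io

def flatten(obj, prefix="", sep="."):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = prefix + sep + key if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat

# Stand-in for the loaded JSON (illustrative subset of the question's data)
data = {
    "lastMonth": {
        "period": "lastMonth",
        "providers": {"azure": {"cost": 1844.134}, "aws": {"cost": 2276.4901}},
    },
    "currentMonth": {
        "period": "currentMonth",
        "providers": {"azure": {"cost": 756.614}, "aws": {"cost": 1013.3661}},
    },
}

# One wide row per period
rows = [flatten(data[p]) for p in ("lastMonth", "currentMonth")]
# Union of all column names, so periods with different keys still line up
columns = sorted({name for row in rows for name in row})

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Taking the union of column names matters here because lastMonth uses "aotalCost" while currentMonth uses "totalCost"; `restval=""` leaves the missing cell empty.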