Converting JSON to CSV in Python

Time: 2015-05-28 07:09:04

Tags: python json csv

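I have a large amount of JSON data that I want to load into a data frame or a CSV file so that I can analyze it in detail. I tried the rjson package in R, but it could not handle this volume of data. I am now trying in Python with the following code: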

import csv
import json

with open('sample.json') as f:
    for line in f:
        x = json.loads(line)

        f = csv.writer(open("sample.csv", "wb+"))

# Write CSV Header, If you dont need that, remove this line
        f.writerow(["asin", "title", "price", "imUrl", "also_bought", "also_viewed", "bought_together", "sales_rank", "brand", "categories"])

        for x in x.iteritems():
                print x
                f.writerow([x["asin"],
                x["title"],
                x["price"],
                x["imUrl"],
                x["related"]["also_bought"],
                x["related"]["also_viewed"],
                x["related"]["bought_together"],
                x["sales_rank"],
                x["brand"],
                x["categories"]])

But I get the following error:

Traceback (most recent call last):
  File "Sample_CSV.py", line 14, in <module>
    csv_file.writerow([item['pk'], item['model']] + item['fields'].values())
TypeError: string indices must be integers

Please help. I have gigabytes of data and I am new to Python. Any help is much appreciated. Thanks!

My sample data looks like this (it is a collection of Amazon product metadata):

{
  "asin": "0000031852",
  "title": "Girls Ballet Tutu Zebra Hot Pink",
  "price": 3.17,
  "imUrl": "http://ecx.images-amazon.com/images/I/51fAmVkTbyL._SY300_.jpg",
  "related":
  {
    "also_bought": ["B00JHONN1S", "B002BZX8Z6", "B00D2K1M3O", "0000031909", "B00613WDTQ", "B00D0WDS9A", "B00D0GCI8S", "0000031895", "B003AVKOP2", "B003AVEU6G", "B003IEDM9Q", "B002R0FA24", "B00D23MC6W", "B00D2K0PA0", "B00538F5OK", "B00CEV86I6", "B002R0FABA", "B00D10CLVW", "B003AVNY6I", "B002GZGI4E", "B001T9NUFS", "B002R0F7FE", "B00E1YRI4C", "B008UBQZKU", "B00D103F8U", "B007R2RM8W"],
    "also_viewed": ["B002BZX8Z6", "B00JHONN1S", "B008F0SU0Y", "B00D23MC6W", "B00AFDOPDA", "B00E1YRI4C", "B002GZGI4E", "B003AVKOP2", "B00D9C1WBM", "B00CEV8366", "B00CEUX0D8", "B0079ME3KU", "B00CEUWY8K", "B004FOEEHC", "0000031895", "B00BC4GY9Y", "B003XRKA7A", "B00K18LKX2", "B00EM7KAG6", "B00AMQ17JA", "B00D9C32NI", "B002C3Y6WG", "B00JLL4L5Y", "B003AVNY6I", "B008UBQZKU", "B00D0WDS9A", "B00613WDTQ", "B00538F5OK", "B005C4Y4F6", "B004LHZ1NY", "B00CPHX76U", "B00CEUWUZC", "B00IJVASUE", "B00GOR07RE", "B00J2GTM0W", "B00JHNSNSM", "B003IEDM9Q", "B00CYBU84G", "B008VV8NSQ", "B00CYBULSO", "B00I2UHSZA", "B005F50FXC", "B007LCQI3S", "B00DP68AVW", "B009RXWNSI", "B003AVEU6G", "B00HSOJB9M", "B00EHAGZNA", "B0046W9T8C", "B00E79VW6Q", "B00D10CLVW", "B00B0AVO54", "B00E95LC8Q", "B00GOR92SO", "B007ZN5Y56", "B00AL2569W", "B00B608000", "B008F0SMUC", "B00BFXLZ8M"],
    "bought_together": ["B002BZX8Z6"]
  },
  "salesRank": {"Toys & Games": 211836},
  "brand": "Coxlures",
  "categories": [["Sports & Outdoors", "Other Sports", "Dance"]]
}

where

asin - ID of the product, e.g. 0000031852
title - name of the product
price - price in US dollars (at time of crawl)
imUrl - url of the product image
related - related products (also bought, also viewed, bought together, buy after viewing)
salesRank - sales rank information
brand - brand name
categories - list of categories the product belongs to
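
For reference, here is a minimal Python 3 sketch of one way records shaped like the sample above could be written to CSV. It rests on several assumptions that the question does not confirm: the input file holds one complete JSON object per line, the sales rank key is "salesRank" as in the sample data (not "sales_rank" as in the code), and nested lists or dicts may be stored as JSON strings inside a single CSV cell. The names FIELDS, flatten and json_lines_to_csv are hypothetical helpers, not part of any library.

```python
import csv
import json

# Assumed column order; the "related" sub-fields get their own columns.
FIELDS = ["asin", "title", "price", "imUrl", "also_bought", "also_viewed",
          "bought_together", "salesRank", "brand", "categories"]


def flatten(record):
    """Turn one product dict into a flat list of CSV cell values."""
    merged = dict(record)
    # Lift also_bought / also_viewed / bought_together to the top level.
    merged.update(record.get("related") or {})
    row = []
    for name in FIELDS:
        value = merged.get(name, "")  # missing fields become empty cells
        if isinstance(value, (list, dict)):
            value = json.dumps(value)  # keep nested data as a JSON string
        row.append(value)
    return row


def json_lines_to_csv(src_path, dst_path):
    """Stream a one-object-per-line JSON file into a CSV file."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(FIELDS)  # header is written once, before the loop
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            writer.writerow(flatten(json.loads(line)))


if __name__ == "__main__":
    json_lines_to_csv("sample.json", "sample.csv")
```

The sketch differs from the code in the question in two ways. It opens the output file once, outside the loop, so each input line appends a row instead of recreating the file. It also treats each parsed line as one record, where the question's code iterates over the record's own key/value pairs and then tries to index those pairs by field name. Because it reads one line at a time, memory use should stay small even for multi-gigabyte input.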

0 answers:

No answers