Nested JSON to CSV conversion

Time: 2016-01-15 10:37:53

Tags: python json api csv

I receive a few JSON responses from a REST API, and they have the following format:

{
  "headings": [
    "ACCOUNT_ID",
    "date",
    "FB Likes"
  ],
  "rows": [
    [
      "My Account",
      "1435708800000",
      117
    ],
    [
      "My Account",
      "1435795200000",
      99
    ],
    [
      "My Account",
      "1435708800000",
      7
    ]
  ]
}

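The columns should be AccountID, Date and FB_Likes. I am trying to convert this to CSV and have tried many different iterations without success.

Please help me solve this.

One script I used is: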

import csv
import json

with open('Account_Insights_12Jan.json') as fi:
    data = json.load(fi)

json_array = data

columns = set()
for item in json_array:
    columns.update(set(item))

# writing the data on csv
with open('Test_14Jan.csv', 'w', newline='') as fo:
    writer = csv.writer(fo)

    writer.writerow(list(columns))
    for item in json_array:
        row = []
        for c in columns:
            if c in item:
                row.append(str(item[c]))
            else:
                row.append('')
        writer.writerow(row)

I get an error. I copied this script from somewhere; please explain how to do the conversion.

Hi again,

{
 "headings": [
"POST_ ID",
"POST_COMMENT_COUNT"
 ],
 "rows": [
 [
  {
    "postId": 188365573,
    "messageId": 198365562,
    "accountId": 214,
    "messageType": 2,
    "channelType": "TWITTER",
    "accountType": "TWITTER",
    "taxonomy": {
      "campaignId": "2521_4",
      "clientCustomProperties": {
        "PromotionChannelAbbreviation": [
          "3tw"
        ],
        "PromotionChannels": [
          "Twitter"
        ],
        "ContentOwner": [
          "Audit"
        ],
        "Location": [
          "us"
        ],
        "Sub_Category": [
          "dbriefs"
        ],
        "ContentOwnerAbbreviation": [
          "aud"
        ],
        "PrimaryPurpose_Outcome": [
          "Engagement"
        ],
        "PrimaryPurposeOutcomeAbbv": [
          "eng"
        ]
      },
      "partnerCustomProperties": {},
      "tags": [],
      "urlShortnerDomain": "2721_spr.ly"
    },
    "approval": {
      "approvalOption": "NONE",
      "comment": ""
    },
    "status": "SENT",
    "createdDate": 1433331585000,
    "scheduleDate": 1435783440000,
    "version": 4,
    "deleted": false,
    "publishedDate": 1435783441000,
    "statusID": "6163465412728176",
    "permalink": "https://twitter.com/Acctg/status/916346541272498176",
    "additional": {
      "links": []
    }
  },
  0
],
[
  {
    "postId": 999145171,
    "messageId": 109145169,
    "accountId": 21388,
    "messageType": 2,
    "channelType": "TWITTER",
    "accountType": "TWITTER",
    "taxonomy": {
      "campaignId": "2521_4",
      "clientCustomProperties": {
        "PromotionChannelAbbreviation": [
          "3tw"
        ],
        "Eminence_Registry_Number": [
          "1000159"
        ],
        "PromotionChannels": [
          "Twitter"
        ],
        "ContentOwner": [
          "Ctr. Health Solutions"
        ],
        "Location": [
          "us"
        ],
        "Sub_Category": [
          "fraud"
        ],
        "ContentOwnerAbbreviation": [
          "chs"
        ],
        "PrimaryPurpose_Outcome": [
          "Awareness"
        ],
        "PrimaryPurposeOutcomeAbbv": [
          "awa"
        ]
      },
      "partnerCustomProperties": {},
      "tags": [],
      "urlShortnerDomain": "2521_spr.ly"
    },
    "approval": {
      "approvalOption": "NONE",
      "comment": ""
    },
    "status": "SENT",
    "createdDate": 1434983660000,
    "scheduleDate": 1435753800000,
    "version": 4,
    "deleted": false,
    "publishedDate": 1435753801000,
    "statusID": "616222222198407168",
    "permalink": "https://twitter.com/Health/status/6162222221984070968",
    "additional": {
      "links": []
    }
  },
  0
]
 ]
}

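Please consider this JSON response as well. Thanks again for all the help, you are a lifesaver!

The expected output would look like the following. It is only a sample: there are many columns, so I have included just a few. My bad, I don't know how to share Excel output.

Post ID,MessageID,AccountID,messageType,accountType,Channel Type
188365573,198365562,214,2,TWITTER,TWITTER
999145171,109145169,21388,2,TWITTER,TWITTER

The code I am working with is: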

import csv

# header and data1 are defined earlier in the script (not shown here)
with open('Data_table2.csv', 'w', newline='') as csvdata:
    csvwriter = csv.writer(csvdata, delimiter=',')
    csvwriter.writerow(header)

    for i in range(0, 70):
        csvwriter.writerow(data1["rows"][i][0].values())

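But it does not run successfully, because there are many levels of nesting, and some responses contain headings that need to be checked: if a heading does not exist yet, a new heading should be created for it.

Thanks again for all the help! Manu

2 Answers:

Answer 0: (score: 1)

First, install pandas: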

pip install pandas

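Then use pandas to create a DataFrame object from the data in the response. Once the object is created, you can convert it to a csv or xls file. Set index=False to keep the index out of the output file.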

import pandas as pd
import json

with open('data_new.json') as fi:
    data = json.load(fi)
    df = pd.DataFrame(data=data['rows'],columns=data['headings'])
    df.to_csv('data_table.csv', index=False)

Sample output:

ACCOUNT_ID,date,FB Likes
My Account,1435708800000,117
My Account,1435795200000,99
My Account,1435708800000,7
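
If installing pandas is not an option, the same flat conversion can probably be done with the standard library alone. The following is a minimal sketch, assuming the response has the headings/rows shape shown in the question; the function name table_json_to_csv and the inline sample are made up for illustration.

```python
import csv
import io
import json


def table_json_to_csv(data, out):
    """Write a {"headings": [...], "rows": [[...], ...]} dict as CSV to `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(data["headings"])  # header line
    writer.writerows(data["rows"])     # one CSV line per inner list


# Small inline sample in the same shape as the question's first response.
sample = json.loads("""
{"headings": ["ACCOUNT_ID", "date", "FB Likes"],
 "rows": [["My Account", "1435708800000", 117],
          ["My Account", "1435795200000", 99]]}
""")

buffer = io.StringIO()
table_json_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to a file instead of an in-memory buffer, pass a file opened with open('data_table.csv', 'w', newline='') as the second argument.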

Answer 1: (score: 0)

I missed the Python requirement, but if you are willing to call an external program, this still works. Note that it requires jq >= 1.5.

cat YourJsonFile | jq -r ' [ .rows[][0] | to_entries  | map(.key), map(.value | tostring) ] |  .[0,range(1;length;2)]|@csv'

# Let's break it down
jq -r  # raw output: print the final CSV strings without JSON quoting or escaping
   ' [ # start building an array
      .rows[][0] # for every row in .rows ([] iterates the array),
                 # take the first element ([0]) of that row
      | to_entries # convert the object into a list of {key, value} entries
      | map(.key), map(.value | tostring) # emit the list of keys, then the list of values
                            # (tostring is needed because @csv cannot
                            # format nested objects or arrays)
      ] # close array. We now have alternating "header" and "data" rows
      |  .[0,range(1;length;2)] # select from the current array (.) the first
                                # element (0), then every second element
                                # starting from index 1 (the data rows)
      |@csv # format each selected row as a CSV line; this string output
            # is what the -r option prints unquoted
      '      #  Done

https://stedolan.github.io/jq/
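
For readers who want to stay in Python, a rough equivalent of the jq pipeline above might look like the sketch below. It is not part of either answer. It assumes each row's first element is the nested post object, as in the second response in the question; it flattens nested objects into dotted column names and builds the header as the union of all keys, so a key that first appears in a later row still gets its own column. The names flatten and rows_to_csv, the "|" list separator and the trimmed sample are illustrative assumptions.

```python
import csv
import io


def flatten(value, prefix=""):
    """Flatten nested dicts into one level, joining key names with dots."""
    flat = {}
    for key, item in value.items():
        name = f"{prefix}{key}"
        if isinstance(item, dict):
            # Recurse; an empty dict contributes no columns.
            flat.update(flatten(item, prefix=name + "."))
        elif isinstance(item, list):
            # Assumption: lists are joined into a single cell with "|".
            flat[name] = "|".join(str(x) for x in item)
        else:
            flat[name] = item
    return flat


def rows_to_csv(data, out):
    """Flatten the first element of every row; write CSV with the union of keys."""
    records = [flatten(row[0]) for row in data["rows"]]
    header = []
    for record in records:
        for key in record:
            if key not in header:  # heading seen for the first time
                header.append(key)
    writer = csv.DictWriter(out, fieldnames=header, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are filled with restval
    return header


# Trimmed-down sample based on the second response in the question.
sample = {
    "headings": ["POST_ ID", "POST_COMMENT_COUNT"],
    "rows": [
        [{"postId": 188365573, "channelType": "TWITTER",
          "taxonomy": {"campaignId": "2521_4", "tags": []}}, 0],
        [{"postId": 999145171, "channelType": "TWITTER",
          "taxonomy": {"campaignId": "2521_4",
                       "clientCustomProperties": {"Location": ["us"]}}}, 0],
    ],
}

buffer = io.StringIO()
header = rows_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Unlike the jq one-liner, which takes the header from the first row only, this sketch scans every row before writing, at the cost of holding all flattened records in memory.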