JSON to CSV Python script produces badly formatted output

Date: 2019-03-14 20:34:54

Tags: python json csv

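I am trying to convert a JSON file to a CSV file, and the CSV file comes out incorrectly formatted. The columns are created correctly, but the text placed under the column headers is spread out absurdly.

For example, the timestamp from a single row of the JSON file spans 5 columns when it should occupy only one. When I edit the script so that it writes only the text field to the CSV file, the problem does not occur (only one column is used). Here is a screenshot: output of code

What causes the script to work fine when it processes only one item, but then go wrong when it is given more?

The code is as follows: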

__author__ = 'seandolinar'

import json
import csv
import io

data_json = io.open('2018_to_2019-03-11.json', mode='r', encoding='utf-8').read()
data_python = json.loads(data_json)

csv_out = io.open('2018_to_2019-03-11.csv', mode='w', encoding='utf-8')


fields = u'timestamp, text, retweets, favorites' 
csv_out.write(fields)
csv_out.write(u'\n')

for line in data_python:

    row = [line.get('timestamp'),
        '"' + line.get('text').replace('"','""') + '"',
          line.get('retweets'),
          line.get('favorites')]

    row_joined = u','.join(row)
    csv_out.write(row_joined)
    csv_out.write(u'\n')

csv_out.close()

Here is one item from my JSON file:

{
"id": "1104890307706060802",
"timestamp": "4:42 PM - 10 Mar 2019",
"text": "“There’s not one shred of evidence that President Trump has done anything wrong.” @GrahamLedger One America News.  So true, a total Witch Hunt - All started illegally by Crooked Hillary Clinton, the DNC and others!",
"link": "https://twitter.com/realDonaldTrump/status/1104890307706060802",
"is_retweet": false,
"retweets": "19K",
"favorites": "76K",
"replies": "17K"
 },

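1 answer:

Answer 0 (score: 0)

The code you provided takes a roundabout route that does not make sense to me, and it never uses the csv module it imports. Here is a speculative approach. It may surface errors that the author can see and we cannot, but we can try to work from there.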

import csv
import json


with open('example_input.json') as infile:
    input_data = json.load(infile)

# These 3 lines could be consolidated but I'm being explicit about building a
# nested list
output_headers = ['timestamp', 'text', 'retweets', 'favorites']

output_to_write = []

output_to_write.append(output_headers)

# Now iterate the JSON data and append rows as lists
for row in input_data:
    output_to_write.append([row.get('timestamp'),
                            row.get('text'),
                            row.get('retweets'),
                            row.get('favorites')])

with open('example_output.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerows(output_to_write)
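
As a follow-up to the code above, here is a minimal sketch of why it helps to route every row through the csv module: fields that contain commas, double quotes, or line breaks are quoted automatically, so each value stays in a single column. It uses csv.DictWriter with extrasaction='ignore' so that the unused JSON keys (id, link, and so on) are dropped. The helper name json_items_to_csv and the sample records are hypothetical and exist only for illustration. Whether embedded commas or line breaks are what actually breaks the asker's output is an assumption, since we cannot see the full data.

```python
import csv
import io

FIELDS = ['timestamp', 'text', 'retweets', 'favorites']


def json_items_to_csv(items, outfile):
    """Write the selected fields of each item (a dict) as one CSV row.

    Hypothetical helper for illustration, not part of the original post.
    """
    # extrasaction='ignore' skips keys that are not listed in FIELDS.
    writer = csv.DictWriter(outfile, fieldnames=FIELDS, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(items)


# Hypothetical sample data: the first text contains a comma, double quotes,
# and a line break, all of which would break a naive ','.join().
sample = [
    {'id': '1', 'timestamp': '4:42 PM - 10 Mar 2019',
     'text': 'So true, a "total" Witch Hunt\nsecond line',
     'retweets': '19K', 'favorites': '76K'},
    {'id': '2', 'timestamp': '5:00 PM - 10 Mar 2019',
     'text': 'plain text', 'retweets': '1K', 'favorites': '2K'},
]

# An in-memory buffer stands in for the output file.
buffer = io.StringIO(newline='')
json_items_to_csv(sample, buffer)

# Reading the output back gives one header row plus one row per item,
# each with exactly four columns.
rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline='')))
for row in rows:
    print(len(row), row)
```

To write to disk instead, pass a file opened as in the answer, for example open('example_output.csv', 'w', newline=''), in place of the in-memory buffer.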