How to convert a CSV file to multiline JSON?

Time: 2013-10-31 03:15:16

Tags: python json csv

Here's my code, really simple stuff...

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
out = json.dumps( [ row for row in reader ] )
jsonfile.write(out)

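It declares some field names, reads the file with the CSV reader, and uses the field names to dump the file in JSON format. Here's the problem...

Each record in the CSV file is on a different row. I want the JSON output to be the same way. The problem is that it dumps everything on one giant, long line.

I've tried using something like for line in csvfile: and then running my code below that with reader = csv.DictReader( line, fieldnames), which loops through each line, but it does the entire file on one line, then loops through the entire file on another line... and continues until it runs out of lines.

Any suggestions for correcting this?

Edit: To clarify, currently I have: (every record on line 1)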

[{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}]

I'm looking for: (2 records on 2 lines)

{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

Not each individual field indented or on a separate line, but each record on its own line.

Some sample input:

"John","Doe","001","Message1"
"George","Washington","002","Message2"

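12 answers:

Answer 0: (score: 113)

The problem with your desired output is that it is not a valid JSON document; it's a stream of JSON documents.

That's okay, if it's what you need, but it means that for each document you want in your output, you'll have to call json.dumps.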
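To make the distinction concrete, here is a small sketch. The sample text is the desired output from the question; the variable names are only illustrative. It shows that a standard parser rejects the whole text as one document but accepts each line on its own.

```python
import json

# Two JSON documents separated by a newline, as in the desired output.
stream = (
    '{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}\n'
    '{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}\n'
)

# Parsing the whole text as a single document fails ("Extra data").
try:
    json.loads(stream)
    parsed_as_one = True
except json.JSONDecodeError:
    parsed_as_one = False

# Parsing line by line works, because every line is a complete document.
records = [json.loads(line) for line in stream.splitlines() if line.strip()]
```

Anything that later reads the output file has to take the same line-by-line approach.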

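Since the newline you want separating your documents is not contained in those documents, you have to supply it yourself. So we just need to pull the loop out of the call to json.dump and write a newline after each document.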

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')
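The loop above never closes its two files explicitly. A possible Python 3 variant, sketched here under the assumption that the column layout is the one from the question, wraps the same loop in context managers. The function name csv_to_json_lines and the returned row count are illustrative additions, not part of the original answer.

```python
import csv
import json

FIELDNAMES = ("FirstName", "LastName", "IDNumber", "Message")

def csv_to_json_lines(csv_path, json_path, fieldnames=FIELDNAMES):
    """Write one JSON document per CSV row; return the number of rows written."""
    count = 0
    # newline="" is what the csv module documentation recommends for input files.
    with open(csv_path, "r", newline="") as csvfile, open(json_path, "w") as jsonfile:
        for row in csv.DictReader(csvfile, fieldnames):
            json.dump(row, jsonfile)
            jsonfile.write("\n")  # the separator is not part of the document
            count += 1
    return count
```

Calling csv_to_json_lines('file.csv', 'file.json') should produce the same file as the loop above, with both files closed when the function returns.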

Answer 1: (score: 10)

You can use a Pandas DataFrame to achieve this, as in the following example:

import pandas as pd
csv_file = pd.DataFrame(pd.read_csv("path/to/file.csv", sep = ",", header = 0, index_col = False))
csv_file.to_json("/path/to/new/file.json", orient = "records", date_format = "epoch", double_precision = 10, force_ascii = True, date_unit = "ms", default_handler = None)

Answer 2: (score: 8)

I took @SingleNegationElimination's response and simplified it into a three-liner that can be used in a pipeline:

import csv
import json
import sys

for row in csv.DictReader(sys.stdin):
    json.dump(row, sys.stdout)
    sys.stdout.write('\n')
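One difference from the earlier answer is worth noting: because no fieldnames argument is passed, csv.DictReader takes the keys from the first row of the input. The sketch below shows this with in-memory streams standing in for stdin and stdout. The convert helper and the sample header row are assumptions made for illustration.

```python
import csv
import io
import json

def convert(infile, outfile):
    """Copy CSV rows from infile to outfile as one JSON document per line."""
    # No fieldnames argument: DictReader reads the keys from the first row.
    for row in csv.DictReader(infile):
        json.dump(row, outfile)
        outfile.write("\n")

# Hypothetical input with a header row, held in memory instead of stdin.
source = io.StringIO("FirstName,LastName\nJohn,Doe\nGeorge,Washington\n")
target = io.StringIO()
convert(source, target)
```

If the input has no header row, as in the question's sample, the first record would be consumed as field names, so the names have to be passed explicitly in that case.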

Answer 3: (score: 6)

You can try this:

import csvmapper

# how does the object look
mapper = csvmapper.DictMapper([ 
  [ 
     { 'name' : 'FirstName'},
     { 'name' : 'LastName' },
     { 'name' : 'IDNumber', 'type':'int' },
     { 'name' : 'Messages' }
  ]
 ])

# parser instance
parser = csvmapper.CSVParser('sample.csv', mapper)
# conversion service
converter = csvmapper.JSONConverter(parser)

print(converter.doConvert(pretty=True))

Edit:

A simpler approach:

import csvmapper

fields = ('FirstName', 'LastName', 'IDNumber', 'Messages')
parser = csvmapper.CSVParser('sample.csv', csvmapper.FieldMapper(fields))

converter = csvmapper.JSONConverter(parser)

print(converter.doConvert(pretty=True))

Answer 4: (score: 2)

Add the indent parameter to json.dumps:

 data = {'this': ['has', 'some', 'things'],
         'in': {'it': 'with', 'some': 'more'}}
 print(json.dumps(data, indent=4))

Also note that you can simply use json.dump with the open jsonfile:

json.dump(data, jsonfile)
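For comparison with the one-record-per-line output requested in the question, the following sketch contrasts the two layouts. The sample rows are made up for illustration.

```python
import json

rows = [
    {"FirstName": "John", "LastName": "Doe"},
    {"FirstName": "George", "LastName": "Washington"},
]

# indent=4 pretty-prints: brackets, braces and every field get their own line.
indented = json.dumps(rows, indent=4)

# One json.dumps call per record keeps each record on a single line.
one_per_line = "\n".join(json.dumps(row) for row in rows)

print(indented)
print(one_per_line)
```

The indented form is still a single JSON array spread over many lines, whereas the second form is a sequence of separate documents, one per line.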

Answer 5: (score: 2)

import csv
import json

file = 'csv_file_name.csv'
json_file = 'output_file_name.json'

#Read CSV File
def read_CSV(file, json_file):
    csv_rows = []
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        field = reader.fieldnames
        for row in reader:
            csv_rows.extend([{field[i]:row[field[i]] for i in range(len(field))}])
        convert_write_json(csv_rows, json_file)

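#Convert csv data into json
def convert_write_json(data, json_file):
    with open(json_file, "w") as f:
        # Write exactly one of the two forms; writing both back to back makes the file invalid JSON.
        f.write(json.dumps(data, sort_keys=False, indent=4, separators=(',', ': '))) #for pretty
        # f.write(json.dumps(data))  # compact alternative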


read_CSV(file,json_file)

Documentation of json.dumps()

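Answer 6: (score: 1)

How about using Pandas to read the csv file into a DataFrame (pd.read_csv), then manipulating the columns as needed (dropping them or updating values), and finally converting the DataFrame back to JSON?

Note: I haven't checked how efficient this is, but it is definitely one of the easiest ways to manipulate a large csv and convert it to json.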

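Answer 7: (score: 1)

I see this is old, but I needed the code from SingleNegationElimination and ran into a problem with data containing non-utf-8 characters. These appeared in fields I was not overly concerned with, so I chose to ignore them. However, that took some effort. I am new to python, so it took some trial and error to get it working. The code is a copy of SingleNegationElimination's with the extra handling of utf-8. I tried to do it with https://docs.python.org/2.7/library/csv.html but in the end gave up. The code below worked.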

import csv, json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("Scope","Comment","OOS Code","In RMF","Code","Status","Name","Sub Code","CAT","LOB","Description","Owner","Manager","Platform Owner")
reader = csv.DictReader(csvfile , fieldnames)

code = ''
for row in reader:
    try:
        print('+' + row['Code'])
        for key in row:
            row[key] = row[key].decode('utf-8', 'ignore').encode('utf-8')      
        json.dump(row, jsonfile)
        jsonfile.write('\n')
    except:
        print('-' + row['Code'])
        raise
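The code above depends on Python 2, where str values have a decode method. On Python 3, a roughly equivalent effect can probably be had by letting open() discard undecodable bytes. The following is a sketch under that assumption; the function name csv_to_json_lines_lenient is hypothetical and not from the original answer.

```python
import csv
import json

def csv_to_json_lines_lenient(csv_path, json_path, fieldnames):
    """Convert CSV to one JSON document per line, dropping undecodable bytes."""
    # errors="ignore" silently discards bytes that are not valid UTF-8.
    with open(csv_path, "r", encoding="utf-8", errors="ignore", newline="") as csvfile, \
         open(json_path, "w", encoding="utf-8") as jsonfile:
        for row in csv.DictReader(csvfile, fieldnames):
            json.dump(row, jsonfile)
            jsonfile.write("\n")
```

Dropping bytes silently loses data, so this only makes sense when, as described above, the affected fields do not matter.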

Answer 8: (score: 0)

A slight improvement on @MONTYHS's answer, iterating through a tuple of field names:

import csv
import json

csvfilename = 'filename.csv'
jsonfilename = csvfilename.split('.')[0] + '.json'
csvfile = open(csvfilename, 'r')
jsonfile = open(jsonfilename, 'w')
reader = csv.DictReader(csvfile)

fieldnames = ('FirstName', 'LastName', 'IDNumber', 'Message')

output = []

for each in reader:
  row = {}
  for field in fieldnames:
    row[field] = each[field]
  # append inside the loop so every row is kept, not just the last one
  output.append(row)

json.dump(output, jsonfile, indent=2, sort_keys=True)

Answer 9: (score: 0)

import csv
import json
from itertools import islice

def read():
    noOfElem = 200  # maximum number of rows you want to import
    csv_file_name = "hashtag_donaldtrump.csv"  # csv file name
    json_file_name = "hashtag_donaldtrump.json"  # json file name

    with open(csv_file_name, mode='r', newline='') as csv_file:
        csv_reader = csv.DictReader(csv_file)
        with open(json_file_name, 'w') as json_file:
            json_file.write("[")

            # islice stops after noOfElem rows, or earlier if the file is shorter
            for i, row in enumerate(islice(csv_reader, noOfElem)):
                if i > 0:
                    json_file.write(",")  # separator before every row except the first
                json_file.write(json.dumps(row))

            json_file.write("]")  # always close the array, even for short files

Change the three parameters above and you're all set.

Answer 10: (score: 0)

Using Pandas and the json library:

import pandas as pd
import json
filepath = "inputfile.csv"
output_path = "outputfile.json"

df = pd.read_csv(filepath)

# Create a multiline json
json_list = json.loads(df.to_json(orient = "records"))

with open(output_path, 'w') as f:
    for item in json_list:
        # json.dumps keeps each line valid JSON; "%s" would write a Python dict repr
        f.write(json.dumps(item) + "\n")

Answer 11: (score: -1)

import csv
import json
csvfile = csv.DictReader(open('filename.csv', 'r'))  # DictReader needs a file object, not a file name
output =[]
for each in csvfile:
    row ={}
    row['FirstName'] = each['FirstName']
    row['LastName']  = each['LastName']
    row['IDNumber']  = each ['IDNumber']
    row['Message']   = each['Message']
    output.append(row)
json.dump(output,open('filename.json','w'),indent=4,sort_keys=False)