I am reading a bunch of CSV files whose rows contain both strings and numbers, like this:
"T55BSU","@","IT-196","IT","NO","@",1,385.82,1.4011825391667,"LFA","Economy", ...
"343OA3","A:1893BC6","FR-7139","FR","NO","@",1,805.01,1.4011825391667,"LFA","Economy", ...
...
I have a small Python script that loops over the files and dumps their contents as JSON:
#!/usr/bin/python
import csv
import json
from os import listdir
from os.path import isfile, join
csvpath = "path to csv dir"
jsonpath = "path to json dir"
onlyfiles = [ f for f in listdir(csvpath) if isfile(join(csvpath,f)) ]
fieldnames = ("names of columns")
for files in onlyfiles:
    name = files.split('.')
    # join() inserts the path separator if the directory string lacks one
    csvname = join(csvpath, files)
    jsoname = join(jsonpath, name[0] + '.json')
    print("Opening " + csvname + "\n")
    csvfile = open(csvname, 'r')
    print("Writing " + jsoname + "\n")
    jsonfile = open(jsoname, 'w')
    reader = csv.DictReader(csvfile, fieldnames)
    # Write one JSON object per line
    for row in reader:
        json.dump(row, jsonfile)
        jsonfile.write('\n')
My problem is that every value in the JSON file is converted to a string:
{"REFUND_SW": "N", "DEST_COUNTRY": "IT", "LOWCOST_CAR": "NO", "CURRATE": "1.4011825391667", "DEFAULT_CLIENT_GROUP_CD": "IT-196", "MAIN_SUPPLIER_CODE": "BV", "DEST_CITY": "ROME", "TRAVEL_PURPOSE": "C", "FARE_TYPE": "C", "CONNECTION_TIME": "0", "BOOKING_DATE": "2014-04-14", "FLIGHT_DURATION": "70"}
However, I want:
{"REFUND_SW": "N", "DEST_COUNTRY": "IT", "LOWCOST_CAR": "NO", "CURRATE": 1.4011825391667, "DEFAULT_CLIENT_GROUP_CD": "IT-196", "MAIN_SUPPLIER_CODE": "BV", "DEST_CITY": "ROME", "TRAVEL_PURPOSE": "C", "FARE_TYPE": "C", "CONNECTION_TIME": 0, "BOOKING_DATE": "2014-04-14", "FLIGHT_DURATION": 70}
How can I force json.dump not to convert everything to a string? In the original CSV files the values are written as numbers...
Thanks
Answer 0: (score: 1)
The problem is not json.dump but the csv reader. By default it returns every field as a string (see: Read data from csv-file and transform to correct data-type).
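To see where the strings come from, here is a minimal sketch, assuming a single sample line in the style of the question and made-up column names (`GROUP`, `COUNT` and so on are placeholders, not the real headers). It also shows the standard `csv.QUOTE_NONNUMERIC` option, which makes the reader convert unquoted fields to float. That may or may not be what you want, since integers such as `1` then come back as `1.0`.

```python
import csv
import json

# One sample line in the same style as the question: quoted text, unquoted numbers
lines = ['"IT-196","IT",1,385.82']
fieldnames = ["GROUP", "COUNTRY", "COUNT", "PRICE"]  # hypothetical column names

# Default behaviour: every field comes back as a str
row_default = next(csv.DictReader(lines, fieldnames))

# QUOTE_NONNUMERIC: the reader converts every unquoted field to float
row_numeric = next(
    csv.DictReader(lines, fieldnames, quoting=csv.QUOTE_NONNUMERIC)
)

# json.dumps writes whatever types it is given
print(json.dumps(row_default))
print(json.dumps(row_numeric))
```

Note that `QUOTE_NONNUMERIC` only works when every text field in the file is quoted; an unquoted non-numeric field raises a `ValueError`.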
If you know the data types of the columns, you can convert them after reading:
#!/usr/bin/python
import csv
import json
# In-memory stand-in for a CSV file: DictReader accepts any iterable of lines
csvfile = [
    '"name","age","grade"',
    '"ann",42,1.3',
    '"hans",23,1.7'
]
# Map each column name to the type its values should be converted to
row_types = {'name': str, 'grade': float, 'age': int}
reader = csv.DictReader(csvfile)
with open('test.json', 'w') as jsonfile:
    for row in reader:
        print('reader produces strings only:')
        print(row)
        print('convert to known types')
        row_converted = {k: row_types[k](v) for k, v in row.items()}
        print(row_converted)
        json.dump(row_converted, jsonfile)
        jsonfile.write('\n')
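Applied to the data in the question, the same idea might look like the sketch below. The column names are taken from the JSON output shown in the question; `column_types` and `convert_row` are hypothetical names introduced here, and the sample line is a shortened stand-in for a real row. Columns that are not listed keep the string the csv module produced, so you only have to spell out the numeric ones.

```python
import csv
import json

# Columns that should become numbers (names taken from the question's output).
# Any column not listed here is left untouched.
column_types = {"CURRATE": float, "CONNECTION_TIME": int, "FLIGHT_DURATION": int}


def convert_row(row, types):
    """Return a copy of row with the listed columns converted."""
    return {
        key: types[key](value) if key in types else value
        for key, value in row.items()
    }


# Shortened stand-in for one CSV file; the real files have more columns
lines = ['"IT","ROME",1.4011825391667,0,70']
fieldnames = ["DEST_COUNTRY", "DEST_CITY", "CURRATE",
              "CONNECTION_TIME", "FLIGHT_DURATION"]

for row in csv.DictReader(lines, fieldnames):
    converted = convert_row(row, column_types)
    print(json.dumps(converted))
```

A value that cannot be converted (for example an empty field in an `int` column) raises a `ValueError` here, which is probably what you want while checking the data, but you may prefer to catch it.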
Answer 1: (score: 0)
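This is not a fault of json.dump: it will happily write integer objects if that is what the source dictionary contains.
The issue is on the CSV reader's side. It has no special intelligence for deciding which columns should be converted to int automatically, so it treats every column as a string.
You need to post-process the specific columns to convert them to int, as described in this post: CSV reader and DictReader turn numeric fields into strings.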
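If you would rather not list the columns by hand, one possible way to do the post-processing is to try the numeric conversions on every value and keep the string when they fail. This is only a sketch: `to_number` is a hypothetical helper, and the column names below are placeholders rather than the real headers.

```python
import csv
import json


def to_number(value):
    """Try int first, then float; fall back to the original value."""
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            continue
    return value


# Shortened sample line from the question, with made-up column names
lines = ['"T55BSU","@","IT-196",1,385.82,1.4011825391667']
fieldnames = ["ID", "FLAG", "GROUP", "COUNT", "PRICE", "CURRATE"]

for row in csv.DictReader(lines, fieldnames):
    converted = {key: to_number(value) for key, value in row.items()}
    print(json.dumps(converted))
```

Be aware that this converts anything that looks like a number, so a code column containing only digits (say `"007"`) would turn into the integer 7. For columns like that, an explicit per-column mapping is the safer choice.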