Preserving variable types in json.dump

Time: 2015-03-12 19:57:22

Tags: python json csv

I am reading a bunch of CSV files whose rows contain both strings and numbers, like this:

"T55BSU","@","IT-196","IT","NO","@",1,385.82,1.4011825391667,"LFA","Economy", ...
"343OA3","A:1893BC6","FR-7139","FR","NO","@",1,805.01,1.4011825391667,"LFA","Economy", ...
...

I have a small Python script that loops over the files and dumps their contents as JSON:

#!/usr/bin/python

import csv
import json

from os import listdir
from os.path import isfile, join

csvpath = "path to csv dir"
jsonpath = "path to json dir"

onlyfiles = [ f for f in listdir(csvpath) if isfile(join(csvpath,f)) ]

fieldnames = ("names of columns")

for files in onlyfiles:
    name = files.split('.')
    csvname = str(csvpath) + str(files)
    jsoname = str(jsonpath) + str(name[0]) + '.json'

    print "Opening " + str(csvname) + "\n"
    csvfile = open(csvname, 'r')

    print "Writing " + str(jsoname) + "\n"
    jsonfile = open(jsoname, 'w')

    reader = csv.DictReader(csvfile, fieldnames)

    for row in reader:
        json.dump(row, jsonfile)
        jsonfile.write('\n')

My problem is that every value in the JSON files ends up as a string:

{"REFUND_SW": "N", "DEST_COUNTRY": "IT", "LOWCOST_CAR": "NO", "CURRATE": "1.4011825391667", "DEFAULT_CLIENT_GROUP_CD": "IT-196", "MAIN_SUPPLIER_CODE": "BV", "DEST_CITY": "ROME", "TRAVEL_PURPOSE": "C", "FARE_TYPE": "C", "CONNECTION_TIME": "0", "BOOKING_DATE": "2014-04-14", "FLIGHT_DURATION": "70"}

However, I would like:

{"REFUND_SW": "N", "DEST_COUNTRY": "IT", "LOWCOST_CAR": "NO", "CURRATE": 1.4011825391667, "DEFAULT_CLIENT_GROUP_CD": "IT-196", "MAIN_SUPPLIER_CODE": "BV", "DEST_CITY": "ROME", "TRAVEL_PURPOSE": "C", "FARE_TYPE": "C", "CONNECTION_TIME": 0, "BOOKING_DATE": "2014-04-14", "FLIGHT_DURATION": 70}

How can I force json.dump not to convert everything to strings? In the original CSV files the values are written as numbers...

Thanks

2 answers:

Answer 0: (score: 1)

The problem is not json.dump but the csv reader: it interprets every value as a string (see "Read data from csv-file and transform to correct data-type").

If you know the data types of the columns, you can convert them after reading:

#!/usr/bin/python

import csv
import json

csvfile = [
    '"name","age","grade"',
    '"ann",42,1.3',
    '"hans",23,1.7'
]
row_types = {'name': str, 'grade': float, 'age': int}

reader = csv.DictReader(csvfile)

# The with block closes the file even if a conversion raises.
with open('test.json', 'w') as jsonfile:
    for row in reader:
        print('reader produces strings only:')
        print(row)
        print('convert to known types')
        # Apply each column's converter (str, int or float) to its value.
        row_converted = {k: row_types[k](v) for k, v in row.items()}
        print(row_converted)
        json.dump(row_converted, jsonfile)
        jsonfile.write('\n')
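
If the column types are not known in advance, one possible fallback is to guess them value by value. The following is only a rough sketch, not part of the original answer: guess_type is a hypothetical helper, and the sample data is read from an in-memory string instead of a real file.

```python
import csv
import io
import json


def guess_type(value):
    """Try int, then float; fall back to the original value."""
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            pass
    return value


# In-memory stand-in for a CSV file without a header row.
sample = io.StringIO('"ann",42,1.3\n"hans",23,1.7\n')
reader = csv.DictReader(sample, fieldnames=("name", "age", "grade"))

rows = [{k: guess_type(v) for k, v in row.items()} for row in reader]
lines = [json.dumps(row, sort_keys=True) for row in rows]
print("\n".join(lines))
```

Be aware that guessing converts anything that merely looks numeric, so an identifier such as "007" would become 7, and a value like "nan" would become a float. An explicit type mapping is safer when the columns are known.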

Answer 1: (score: 0)

json.dump is not at fault here: it will happily write integer objects if that is what the source dictionary contains.
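
A quick way to check this, as a minimal sketch using a few keys borrowed from the question:

```python
import json

# Numbers in the dictionary stay numbers; only real strings get quoted.
row = {"CONNECTION_TIME": 0, "CURRATE": 1.4011825391667, "DEST_COUNTRY": "IT"}
encoded = json.dumps(row, sort_keys=True)
print(encoded)
```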

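The issue lies on the CSV reader's side: it has no special logic for working out which columns should be converted to int automatically, so it treats every column as a string. You need to post-process the specific columns to convert them to int, as described in the following post: CSV reader and DictReader turn numeric fields into strings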
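
Post-processing only the numeric columns could look roughly like the sketch below. It assumes a hypothetical four-column subset of the question's fields and reads from an in-memory string; the real files have more columns, and their actual order is not shown in the question.

```python
import csv
import io
import json

# Hypothetical subset of the question's columns, for illustration only.
fieldnames = ("DEST_COUNTRY", "CURRATE", "CONNECTION_TIME", "FLIGHT_DURATION")
# Only the columns listed here are converted; the rest stay strings.
numeric_columns = {"CURRATE": float, "CONNECTION_TIME": int, "FLIGHT_DURATION": int}

sample = io.StringIO('"IT",1.4011825391667,0,70\n')
reader = csv.DictReader(sample, fieldnames=fieldnames)

converted = []
for row in reader:
    for column, cast in numeric_columns.items():
        row[column] = cast(row[column])
    converted.append(row)

# Write one JSON object per line, as in the question's script.
output = io.StringIO()
for row in converted:
    json.dump(row, output, sort_keys=True)
    output.write("\n")
print(output.getvalue(), end="")
```

Listing only the numeric columns keeps the mapping short when most fields are plain strings, which seems to be the case for the data in the question.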