Read a CSV and output JSON: how to set data types

Time: 2019-04-07 18:59:33

Tags: python json csv

I have a project in which I read a CSV file and output its contents as JSON.

Here is some sample CSV:

firstName,lastName,email,age,gender
John,Doe,jdoe@emaildomain.com,50,male
Jane,Doe,jdoe@emaildomain.com,28,female
Bill,Smith,bsmith@emaildomain.com,49,male
Dick,Tracy,dtracy@emaildomain.com,18,male
Peter,Parker,pparker@emaildomain.com,26,male
Clark,Kent,ckent@emaildomain.com,17,male
Wonder,Woman,wwoman@emaildomain.com,44,female
John,James,jjames@emaildomain.com,17,male
Kat,Whoaman,kwhoamans@emaildomain.com,23,female

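Everything works the way I want, except that I need certain values (age, for example) to be integers in the output, and they currently come out as strings. Is there a way to keep most of my existing code intact but output certain values as integers instead of strings?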

import json
import csv
import itertools


primary_field = ['email']
result = []
with open('SampleCSV.csv') as csv_file:
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    for row in itertools.islice(reader, 5):
        d = {k: v for k, v in row.items() if k in primary_field}
        d['dataFields'] = [{k: v,} for k, v in row.items() if k not in primary_field]
        result.append(d)

root = {}
root["users"] = result
print(json.dumps(root, indent=4))

Sample output:

{
    "users": [
        {
            "email": "jdoe@emaildomain.com",
            "dataFields": [
                {
                    "firstName": "John"
                },
                {
                    "lastName": "Doe"
                },
                {
                    "age": "50"
                },
                {
                    "gender": "male"
                }
            ]
        }
    ]
}

Desired output:

{
    "users": [
        {
            "email": "jdoe@emaildomain.com",
            "dataFields": [
                {
                    "firstName": "John"
                },
                {
                    "lastName": "Doe"
                },
                {
                    "age": 50
                },
                {
                    "gender": "male"
                }
            ]
        }
    ]
}

1 answer:

Answer 0: (score: 0)

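Here is what I mentioned earlier. The commented-out line is your original code. Each non-primary value is passed to int(); if that raises ValueError, the original string is kept.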

import json
import csv
import itertools

primary_field = ['email']
result = []
with open('SampleCSV.csv') as csv_file:
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    for row in itertools.islice(reader, 5):
        d = {k: v for k, v in row.items() if k in primary_field}
        # d['dataFields'] = [{k: v,} for k, v in row.items() if k not in primary_field]
        tmp_list = []
        for k, v in row.items():
            if k not in primary_field:
                # Try to convert the value to an integer; keep the string if that fails.
                try:
                    vint = int(v)
                except ValueError:
                    vint = v
                tmp_list.append({k: vint})
        d['dataFields'] = tmp_list
        result.append(d)

root = {}
root["users"] = result
print(json.dumps(root, indent=4))

This gives the result:

{
    "users": [
        {
            "email": "jdoe@emaildomain.com",
            "dataFields": [
                {
                    "firstName": "John"
                },
                {
                    "lastName": "Doe"
                },
                {
                    "age": 50
                },
                {
                    "gender": "male"
                }
            ]
        }, ...
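
One caveat with the try/except approach above is that it converts every value that happens to look like an integer, which may not be wanted for columns such as ID numbers or postal codes. A possible variation, sketched here as an assumption rather than as part of the original answer, is to list the columns to convert explicitly. The names FIELD_TYPES and convert_row are hypothetical, and the CSV is read from an in-memory string instead of SampleCSV.csv so that the sketch runs on its own.

```python
import csv
import io
import json

# Hypothetical mapping from column name to converter; extend as needed.
FIELD_TYPES = {"age": int}


def convert_row(row, primary_field=("email",), field_types=FIELD_TYPES):
    """Build one user record, converting only the columns named in field_types."""
    record = {k: v for k, v in row.items() if k in primary_field}
    record["dataFields"] = [
        # Columns without an entry in field_types stay as strings.
        {k: field_types.get(k, str)(v)}
        for k, v in row.items()
        if k not in primary_field
    ]
    return record


# In-memory stand-in for SampleCSV.csv (first data row of the question's sample).
sample = io.StringIO(
    "firstName,lastName,email,age,gender\n"
    "John,Doe,jdoe@emaildomain.com,50,male\n"
)
users = [convert_row(row) for row in csv.DictReader(sample, skipinitialspace=True)]
print(json.dumps({"users": users}, indent=4))
```

Unlike the try/except version, this sketch raises ValueError if a listed column contains something that is not an integer (an empty age, for instance), so whether it fits depends on how clean the input is expected to be.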