How to convert a file containing multiple dictionaries to a CSV file

Time: 2019-10-16 08:51:38

Tags: python-3.x

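I have a file containing several dictionaries, one per line. Below is just part of the file. I need to convert it to a CSV file and eventually load it into a database. I am having trouble converting it to CSV.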

{"transaction_type": "new", "policynum": 4994949}
{"transaction_type": "renewal", "policynum": 3848848}
{"transaction_type": "cancel", "policynum": 49494949, "cancel_table": [{"cancel_cd": "AU", "cancel_type": "online"}, {"cancel_cd": "AA", "cancel_type": "online"}]}

I tried the following code, but the cancel_table key is not parsed correctly into the CSV file.

import ast

import csv

with open('***\\Python\\test', 'r') as in_f, open('***\\Python\\test.csv', 'w') as out_f:

    data = in_f.readlines()


    writer = csv.DictWriter(out_f, fieldnames=['transaction_type', 'policynum', 'cancel_table'], extrasaction='ignore')
    writer.writeheader()  # For writing header

    for row in data:
        dict_row = ast.literal_eval(row)  # because row has a dict string
        writer.writerow(dict_row)

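Below is the result I get, where the cancel_table key is not parsed correctly into the CSV file. I need help getting cancel_type and the different cancel_cd values as separate columns, or alternatively joining the cancel_cd values with a comma separator in a single column (just an idea). Sorry if this is a loaded question.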

transaction_type,policynum,cancel_table
new,4994949,
old,3848848,
cancel,49494949,"[{'cancel_type': 'online','cancel_cd': 'OL'}, 'cancel_type': 'Online','cancel_cd': 'BR'},{'cancel_type': 'online','cancel_cd': 'AU', }]"

1 Answer:

Answer 0: (score: 0)

Assuming every row in cancel_table always contains both cancel_cd and cancel_type, you can get cancel_cds and cancel_types as separate columns with the following code:

import ast
import csv

with open('Python/test', 'r') as in_f, open('Python/test.csv', 'w', newline='') as out_f:
    data = in_f.readlines()
    writer = csv.DictWriter(
        out_f,
        fieldnames=[
            'transaction_type', 'policynum', 'cancel_cds', 'cancel_types'
        ],
        extrasaction='ignore')
    writer.writeheader()

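    for row in data:
        # Each line holds one dict literal; parse it safely.
        dict_row = ast.literal_eval(row)
        if 'cancel_table' in dict_row:
            cancel_table = dict_row['cancel_table']
            cancel_cds, cancel_types = [], []
            # Collect the code and type from every cancel entry.
            for cancel_row in cancel_table:
                cancel_cds.append(cancel_row['cancel_cd'])
                cancel_types.append(cancel_row['cancel_type'])
            # Flatten each list into a single comma-joined cell.
            dict_row['cancel_cds'] = ','.join(cancel_cds)
            dict_row['cancel_types'] = ','.join(cancel_types)
        # extrasaction='ignore' drops the original cancel_table key.
        writer.writerow(dict_row)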
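
If one value per cell is preferred over comma-joined cells, a possible variant writes one CSV row per cancel entry and repeats the top-level fields on each row. This is only a sketch: the helper name flatten, the in-memory sample lines, and the use of io.StringIO in place of a real file are assumptions made for illustration.

```python
import ast
import csv
import io


def flatten(record):
    """Yield one flat dict per cancel entry, or one dict if there are none."""
    base = {k: v for k, v in record.items() if k != 'cancel_table'}
    # Records without a cancel_table still produce a single row.
    entries = record.get('cancel_table') or [{}]
    for entry in entries:
        yield {**base, **entry}


# Hypothetical sample input: one dict literal per line, as in the question.
lines = [
    '{"transaction_type": "new", "policynum": 4994949}',
    '{"transaction_type": "cancel", "policynum": 49494949, "cancel_table": '
    '[{"cancel_cd": "AU", "cancel_type": "online"}, '
    '{"cancel_cd": "AA", "cancel_type": "online"}]}',
]

out = io.StringIO()  # stands in for the output file
writer = csv.DictWriter(
    out,
    fieldnames=['transaction_type', 'policynum', 'cancel_cd', 'cancel_type'],
    restval='')  # missing cancel fields become empty cells
writer.writeheader()
for line in lines:
    for flat in flatten(ast.literal_eval(line)):
        writer.writerow(flat)

csv_text = out.getvalue()
```

A layout like this tends to map more directly onto a database table, since each cancel entry becomes its own row keyed by policynum.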

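Note that the joined cancel_cds and cancel_types values contain commas. Python's csv writer quotes such fields by default, so a CSV-aware reader keeps each of them in a single column. If whatever loads the file splits on commas without honoring quotes, every cancel_cd and cancel_type value will land in a different column; in that case, use a different column delimiter or a different join separator.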
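
To check how comma-joined values behave when the column delimiter is also a comma, a small round trip through Python's csv module may help. This is a minimal sketch that writes to an in-memory io.StringIO buffer instead of a file; the sample row is taken from the question's data.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)  # default delimiter ',' with minimal quoting
writer.writerow(['transaction_type', 'policynum', 'cancel_cds', 'cancel_types'])
writer.writerow(['cancel', 49494949, 'AU,AA', 'online,online'])

raw = buffer.getvalue()

# Reading the text back with a CSV-aware parser restores four columns.
rows = list(csv.reader(io.StringIO(raw)))
```

Here the writer wraps the fields containing commas in double quotes, which is why the reader returns them as single cells. A loader that splits on commas without handling quotes would not.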