Importing a CSV row into a dictionary?

Time: 2018-07-26 20:16:31

Tags: python python-2.7

This question has been asked before, but none of the existing answers apply in my case.

I am reading a CSV file with the following code:

import codecs
import csv
with codecs.open('./products.csv', 'r', encoding="utf-8") as _filehandler:
    csv_file_reader = csv.DictReader(_filehandler)
    for row in csv_file_reader:
        pass  # each attempt shown further down was placed here

In my CSV file, each row has a column that contains something like this:

'#custom_shrink_wrapping': '700', '#custom_green_paper': '338'

My goal is to add this to a dictionary. Everything shown above is a single column.

Here is a sample of the data:

item,parse_dropdowns,fixed_dropdowns_values,links
postcards,"#quantity, #paper, #size, #color, #turnaround,  #coating","‘#custom_finishing': '497', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'",http://www.example.com/products/postcards
flyers,"#quantity, #paper, #size, #color, #turnaround, #coating, #folding","‘#custom_green_paper': '338', '#custom_hole_punch': '204', '#custom_shrink_wrapping': '700'",http://www.example.com/products/brochures
brochures,"#quantity, #paper, #size, #color, #turnaround, #coating, #folding","‘#custom_green_paper': '338', '#custom_hole_punch': '204', '#custom_shrink_wrapping': '700'",http://www.example.com/products/brochures
business cards,"#quantity, #paper, #size, #color, #turnaround,  #coating","‘#custom_green_paper': '338', '#custom_shrink_wrapping': '700', '#versionCustomerPulldown': '1'",http://www.example.com/products/businesscards
bookmarks,"#quantity, #paper, #size, #color, #turnaround,  #coating","‘#custom_finishing': '497', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'",http://www.example.com/products/bookmarks
calendars,"#quantity, #paper, #size, #color, #turnaround, #page, #coating","‘#custom_green_paper': '338', '#custom_finishing': '13356', '#custom_hole_punch': '205', '#custom_shrink_wrapping': '700'",http://www.example.com/products/calendars

The end goal is to get this:

{'#custom_shrink_wrapping': '700', '#custom_green_paper': '338'}

I thought this would be easy:

dropdownValuesCsv = dict()
dropdownValuesCsv.append( row['fixed_dropdowns_values'] )

That failed. Then I tried this:

dropdownValuesCsv = dict()
dropdownValuesCsv.update( row['fixed_dropdowns_values'] )

This produced the following error:

ValueError: dictionary update sequence element #0 has length 1; 2 is required

Then I tried this:

dropdownValuesCsv = { row['fixed_dropdowns_values'] }

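But this creates a set (holding the whole string as its only element), not the dictionary I am looking for.

2 Answers:

Answer 0: (score: 0)

I think there are just a few points of confusion in your question that I can clear up for you.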

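First, it is not clear to me what you mean when you say the CSV contains a column with the value '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'. Since that value contains a comma (and you did not specify another delimiter), shouldn't it be two columns? I will assume that, for the purposes of your question, you replaced the real delimiter with another one (for example, a semicolon). Edit: This has since been clarified by the screenshot that was added.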

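I believe your main issue is that you are not taking into account the type of output csv.DictReader produces. When it parses your CSV, it maps each column name to a string, so when you try to update a dictionary with that output you do not get the multiple key-value pairs you want, only a single string that represents them. We can fix this by parsing the string.

Here is my working example: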
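A minimal sketch of what goes wrong, written for Python 3 and using an in-memory CSV in place of the real file (the `sample` text and the variable names are made up for illustration). Each cell that `csv.DictReader` yields is one plain string, and `dict.update()` given a string walks it character by character, which is why the error complains that element #0 has length 1:

```python
import csv
import io

# Hypothetical two-column sample, held in memory instead of a file.
sample = '''item,fixed_dropdowns_values
postcards,"'#custom_shrink_wrapping': '700', '#custom_green_paper': '338'"
'''

row = next(csv.DictReader(io.StringIO(sample)))
value = row['fixed_dropdowns_values']
print(type(value).__name__)  # the cell is a single string, not a dict

result = {}
try:
    # update() expects an iterable of (key, value) pairs; iterating a
    # string yields single characters, so the first "pair" is just "'".
    result.update(value)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The dictionary stays empty after the failed call, so the string has to be parsed into pairs before it can be stored.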

tmp.csv

fixed_dropdown_values, other
"'#custom_shrink_wrapping': '700', '#custom_green_paper': '338'", 'other': 'other'

test.py

import csv

dropdownValuesCSV = {}
with open('./tmp.csv') as file_handler:
    reader = csv.DictReader(file_handler)
    for row in reader:
        # We split multiple key-value pairs by comma
        for mapping in row['fixed_dropdown_values'].split(','):
            # We do an additional split on a colon to differentiate keys and values
            # ... and do some extra cleanup to remove extra spaces and quotation marks
            key, val = [s.strip(' ').replace("'", '') for s in mapping.split(':')]
            dropdownValuesCSV[key] = val
print(dropdownValuesCSV)  # parenthesized form runs on Python 2.7 and 3
# dropdownValuesCSV is:
# {'#custom_green_paper': '338', '#custom_shrink_wrapping': '700'}
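Note that the loop above collects every row into one shared dictionary, so with several rows a later row's value for a key replaces an earlier one. If one dictionary per row is wanted, the same splitting could be wrapped in a helper, roughly as follows. `parse_pairs` and the `cells` list are hypothetical names for this sketch, and it assumes that no key or value contains a comma or a colon:

```python
def parse_pairs(cell):
    """Turn "'k1': 'v1', 'k2': 'v2'" into {'k1': 'v1', 'k2': 'v2'}.

    Hypothetical helper; assumes no key or value contains ',' or ':'.
    """
    pairs = {}
    for mapping in cell.split(','):
        # Split key from value, then drop padding and quotation marks.
        key, val = [s.strip().replace("'", '') for s in mapping.split(':')]
        pairs[key] = val
    return pairs

# Stand-ins for row['fixed_dropdown_values'] from two different rows.
cells = [
    "'#custom_finishing': '497', '#custom_shrink_wrapping': '700'",
    "'#custom_green_paper': '338', '#custom_hole_punch': '204'",
]
per_row = [parse_pairs(cell) for cell in cells]
print(per_row)
```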

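Hope that helps.

Answer 1: (score: 0)

Edit: I modified the code to use ast.literal_eval() instead of the more powerful (and potentially more dangerous) built-in eval(), because it is safer and is all that is needed here.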
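To illustrate that difference, here is a small sketch. The `hostile` string is a made-up example of a cell that contains a function call rather than a literal; `ast.literal_eval()` accepts only Python literals, so it parses the normal cell and refuses the other one, whereas `eval()` would execute the call:

```python
import ast

cell = "'#custom_shrink_wrapping': '700', '#custom_green_paper': '338'"
parsed = ast.literal_eval('{' + cell + '}')
print(parsed)

# Made-up hostile cell: the value is a function call, not a literal.
hostile = "'#custom_shrink_wrapping': __import__('os').getcwd()"
try:
    ast.literal_eval('{' + hostile + '}')
    rejected = False
except ValueError:
    # literal_eval refuses anything that is not a plain literal.
    rejected = True
print(rejected)
```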

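Note that for testing, I created a products.csv file from the data added to the question, although I had to change the ‘ characters in it to ' characters (otherwise reading the file caused a codec error), which left it containing this: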

item,parse_dropdowns,fixed_dropdowns_values,links
postcards,"#quantity, #paper, #size, #color, #turnaround,  #coating","'#custom_finishing': '497', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'",http://www.example.com/products/postcards
flyers,"#quantity, #paper, #size, #color, #turnaround, #coating, #folding","'#custom_green_paper': '338', '#custom_hole_punch': '204', '#custom_shrink_wrapping': '700'",http://www.example.com/products/brochures
brochures,"#quantity, #paper, #size, #color, #turnaround, #coating, #folding","'#custom_green_paper': '338', '#custom_hole_punch': '204', '#custom_shrink_wrapping': '700'",http://www.example.com/products/brochures
business cards,"#quantity, #paper, #size, #color, #turnaround,  #coating","'#custom_green_paper': '338', '#custom_shrink_wrapping': '700', '#versionCustomerPulldown': '1'",http://www.example.com/products/businesscards
bookmarks,"#quantity, #paper, #size, #color, #turnaround,  #coating","'#custom_finishing': '497', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'",http://www.example.com/products/bookmarks
calendars,"#quantity, #paper, #size, #color, #turnaround, #page, #coating","'#custom_green_paper': '338', '#custom_finishing': '13356', '#custom_hole_punch': '205', '#custom_shrink_wrapping': '700'",http://www.example.com/products/calendars

The code below reads it and does what you want:

import ast
import codecs
import csv

COL_NAME = 'fixed_dropdowns_values'  # Column of interest.

with codecs.open('./products.csv', 'rb',  encoding="utf-8") as _filehandler:
    csv_file_reader = csv.DictReader(_filehandler)
    for row in csv_file_reader:
        dropdownValuesCsv = ast.literal_eval('{' + row[COL_NAME] + '}')
        print(dropdownValuesCsv)

These are the dictionaries it creates and prints when reading the (modified) input:

{'#custom_finishing': '497', '#custom_green_paper': '338', '#custom_shrink_wrapping': '700'}
{'#custom_hole_punch': '204', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'}
{'#custom_hole_punch': '204', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'}
{'#versionCustomerPulldown': '1', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'}
{'#custom_finishing': '497', '#custom_green_paper': '338', '#custom_shrink_wrapping': '700'}
{'#custom_hole_punch': '205', '#custom_finishing': '13356', '#custom_shrink_wrapping': '700', '#custom_green_paper': '338'}
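If editing the file by hand is not practical, the curly quotes could instead be normalized in code before parsing. This is only a sketch under an assumption: that the file is decoded with an encoding under which the character arrives as U+2018 (the codec error mentioned above suggests the real file may need a different encoding than UTF-8). The `cell` value is copied from the question's data:

```python
import ast

# Cell text as it appears in the question's data, starting with U+2018.
cell = u"\u2018#custom_finishing': '497', '#custom_shrink_wrapping': '700'"

# Replace left and right curly single quotes with the ASCII apostrophe.
normalized = cell.replace(u'\u2018', u"'").replace(u'\u2019', u"'")

values = ast.literal_eval(u'{' + normalized + u'}')
print(values)
```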