Question

Suppose I have several lists of dictionaries, like this:

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

My goal is to write them to a file in the following format:

           Action       Comedy       Horror      
list_one  meanScore   meanScore    
           amount       amount       
list_two              meanScore     meanScore
                        amount       amount

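I'm not very familiar with dicts or the best way to store them, but CSV files seem to be popular. I tried to solve my problem using this answer here, but I had trouble understanding what @MarkLongair's function does and how to extend it to my problem. One of my main concerns is that not every genre appears in every list, so I don't know how to check an existing CSV file for whether a key exists, where that key is located, and how to write the values to the right column.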

Since I couldn't really understand the linked answer, I tried:

from pandas import DataFrame

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, 
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

DataFrame(list_one).to_csv('test.csv')
DataFrame(list_two).to_csv('test.csv')

This doesn't really work, because the second call overwrites the data from the first, and what I wanted as columns ends up as rows...

I'm not sure how to build the table from here or what the right direction is... Can anyone help?

Answer 1

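One way to solve this without Pandas [Edit: I see you have mentioned Pandas since your edit] is to write a function that looks at one of your lists of dictionaries and composes the appropriate line of CSV text.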

def generate_row(separator, headers, data_type, data_list, list_name):
    data_by_genre = {k: '' for k in headers}
    for data in data_list:
        data_by_genre[data['genre']] = str(data[data_type])

    output_text = separator.join([data_by_genre[genre] for genre in headers]) + '\n'
    # If it's 'amount', then the row starts with the name. Otherwise that space is blank.
    if data_type == 'amount':
        output_text = list_name + output_text

    return output_text


list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]
list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

headers = ['', 'Action', 'Comedy', 'Horror']
separator = ','

# The with-block closes the file even if one of the writes raises.
with open('new.csv', 'w') as f:
    f.write(separator.join(headers))
    f.write('\n')
    f.write(generate_row(separator, headers, 'amount', list_one, 'list_one'))
    f.write(generate_row(separator, headers, 'meanScore', list_one, 'list_one'))
    f.write(generate_row(separator, headers, 'amount', list_two, 'list_two'))
    f.write(generate_row(separator, headers, 'meanScore', list_two, 'list_two'))

I made "separator" a variable in case you want to use, for example, tabs instead of commas.

However, if you do want to use Pandas, you could write something to reformat the data so that it looks like this, and then it will write out "correctly".

from pandas import DataFrame

# One row per (list, field) pair; None marks a genre missing from that list.
data1 = [{'Action': 141, 'Comedy': 191, 'Horror': None},
         {'Action': 82, 'Comedy': 82, 'Horror': None},
         {'Action': None, 'Comedy': 191, 'Horror': 11},
         {'Action': None, 'Comedy': 82, 'Horror': 62}]

DataFrame(data1).to_csv('test.csv')
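The "write something to reformat the data" step could be sketched roughly as below, in plain Python so it runs without Pandas. The helper name `to_wide_rows`, its `fields` parameter, and the row labels are hypothetical, introduced only for illustration; it builds rows in the same shape as `data1` and fills in None for genres a list does not contain.

```python
def to_wide_rows(named_lists, fields=('amount', 'meanScore')):
    """Reshape {name: [genre dicts]} into one row per (name, field).

    Hypothetical helper: returns (labels, rows). Each row maps every genre
    seen in any list to its value, or to None if that list lacks the genre.
    """
    # Collect every genre across all lists; sort for a stable column order.
    genres = sorted({d['genre'] for data in named_lists.values() for d in data})
    labels, rows = [], []
    for name, data in named_lists.items():
        by_genre = {d['genre']: d for d in data}
        for field in fields:
            labels.append(f'{name} {field}')
            rows.append({g: by_genre[g][field] if g in by_genre else None
                         for g in genres})
    return labels, rows


list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]
list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

labels, rows = to_wide_rows({'list_one': list_one, 'list_two': list_two})
```

With Pandas available, something like `DataFrame(rows, index=labels).to_csv('test.csv')` should then write one labelled row per list and field.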

Answer 2

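In the first version of your question you didn't mention that you were working in pandas, which is important information because it really is different from the Python standard library. You don't actually need pandas for this, but I assume you are using it for other reasons.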

DataFrame(list_one + list_two).to_csv('test.csv')

See also

How to add pandas data to an existing csv file?

if you want to append while writing, rather than merging the lists before converting them to a DataFrame.
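A minimal sketch of the append approach, assuming pandas is installed: `to_csv` accepts `mode='a'` to append, and `header=False` keeps the column names from being written a second time. The temporary directory is only there so the example does not touch your working directory.

```python
import os
import tempfile

import pandas as pd

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]
list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

# Illustrative path inside a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'test.csv')

# First write creates the file with a header row.
pd.DataFrame(list_one).to_csv(path, index=False)
# Second write appends rows only, without repeating the header.
pd.DataFrame(list_two).to_csv(path, mode='a', header=False, index=False)

with open(path) as f:
    lines = f.read().splitlines()
```

Note that this keeps the long layout (one row per genre entry), and appending only lines up correctly when both frames have the same columns in the same order.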

Solutions that don't involve pandas would be csv.DictWriter from the csv library, or JSON serialization (if you don't need CSV).
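As a rough sketch of the csv.DictWriter route, using only the standard library: the column names 'list' and 'field' are assumptions made for this example, and an in-memory buffer stands in for a real file. The `restval` argument fills in the cells for genres that a given list does not contain.

```python
import csv
import io

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]
list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

named_lists = {'list_one': list_one, 'list_two': list_two}
# Every genre that appears in any list becomes a column.
genres = sorted({d['genre'] for data in named_lists.values() for d in data})

# Stand-in for: open('new.csv', 'w', newline='')
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['list', 'field'] + genres, restval='')
writer.writeheader()
for name, data in named_lists.items():
    for field in ('meanScore', 'amount'):
        row = {'list': name, 'field': field}
        # Genres missing from this list are left out and filled by restval.
        row.update({d['genre']: d[field] for d in data})
        writer.writerow(row)

csv_text = buffer.getvalue()
```

Because DictWriter places each value by column name, there is no need to track the position of a genre in the file by hand.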
