My code produces the following output:
[{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},
{'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},
{'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},
{'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715},
...
]
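What I want to do is take all the values of "Total Population:" and store them in one list, then take all the "Total Water Ice Cover" values and store them in another list, and so on. Given a data structure like this, how can I extract these values and store them in separate lists?
Thanks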
Answer 0: (score: 2)
If your goal is to calculate Pearson's correlation, you should use pandas.
Suppose the original list of dictionaries is stored in a variable named output. You can easily convert it to a pandas DataFrame with:
import pandas as pd
df = pd.DataFrame(output)
print(df)
#   Total Barren Land  Total Developed  Total Forest  Total Population:  Total Water Ice Cover
#0           0.224399        17.205368     34.406421               4585               2.848142
#1           0.115141        37.271157     19.113414               4751               1.047784
#2           0.259720        23.504698     20.418608               3214               0.091666
#3           0.000000        66.375457     10.686713               5005               0.000000
Now you can easily generate the correlation matrix:
# this is just to make the output print nicer
pd.set_option("display.precision", 4) # only show 4 digits
# remove 'Total ' from column names to make printing smaller
df.rename(columns=lambda x: x.replace("Total ", ""), inplace=True)
corr = df.corr(method="pearson")
print(corr)
#                 Barren Land  Developed  Forest  Population:  Water Ice Cover
#Barren Land           1.0000    -0.9579  0.7361      -0.7772           0.4001
#Developed            -0.9579     1.0000 -0.8693       0.5736          -0.6194
#Forest                0.7361    -0.8693  1.0000      -0.1575           0.9114
#Population:          -0.7772     0.5736 -0.1575       1.0000           0.2612
#Water Ice Cover       0.4001    -0.6194  0.9114       0.2612           1.0000
Now you can access individual correlations by key:
print(corr.loc["Forest", "Water Ice Cover"])
#0.91135717479534217
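As a cross-check that does not need pandas, the same coefficient can be computed straight from the definition. This is only a minimal sketch: `pearson` is a hypothetical helper name (not a pandas function), and the two lists are the "Total Forest" and "Total Water Ice Cover" values from the question. The result should agree with the value printed above to about four decimal places.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

forest = [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]
water = [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]

r = pearson(forest, water)
print(round(r, 4))
```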
Answer 1: (score: 1)
I guess you could use something like this:
d = [{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},
{'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},
{'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},
{'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715}]
f = {}
for row in d:
    for k, v in row.items():
        # start a new list the first time a key is seen
        if k not in f:
            f[k] = []
        f[k].append(v)
print(f)
{'Total Population:': [4585, 4751, 3214, 5005], 'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0], 'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746], 'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0], 'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]}
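Once the values are grouped by key, each list can be pulled out by name. A short sketch using a two-row, two-column subset of the data above; the names `populations` and `forests` are only illustrative, and `setdefault` is used here as a compact alternative to the explicit membership test.

```python
d = [
    {'Total Population:': 4585, 'Total Forest': 34.40642126612868},
    {'Total Population:': 4751, 'Total Forest': 19.11341393206678},
]

f = {}
for row in d:
    for k, v in row.items():
        # setdefault creates the empty list the first time a key is seen
        f.setdefault(k, []).append(v)

# Each column is now available as its own list
populations = f['Total Population:']
forests = f['Total Forest']
print(populations)  # [4585, 4751]
print(forests)
```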
Answer 2: (score: 1)
You can use pandas:
pd.DataFrame(my_dict).to_dict(orient='list')
Returns:
{'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715],
'Total Population:': [4585, 4751, 3214, 5005],
'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]}
Answer 3: (score: 0)
Call your list of dictionaries dictionary_list. Then:
keys = {k for d in dictionary_list for k in d.keys()}
list_of_values = [[v for d in dictionary_list for k, v in d.items() if k == key] for key in keys]
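With your example, this outputs (the order of the inner lists follows set iteration order, so it may differ from run to run):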
[[17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
[0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
[2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0],
[4585, 4751, 3214, 5005],
[34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]]
If you want to build a new dictionary that maps each key to its list of values, replace the second line with:
new_dict = {key: [v for d in dictionary_list for k, v in d.items() if k == key] for key in keys}
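Because `keys` is a set, the order of the resulting lists is arbitrary. If a predictable order matters, one option (a sketch, not part of the answer above) is to collect the keys with `dict.fromkeys`, which keeps first-seen order on Python 3.7+. The sample data is a shortened, made-up version of the question's list.

```python
# Hypothetical, shortened sample data
dictionary_list = [
    {'Total Population:': 4585, 'Total Developed': 17.2},
    {'Total Population:': 4751, 'Total Developed': 37.3},
]

# dict.fromkeys de-duplicates while preserving insertion order (Python 3.7+)
keys = list(dict.fromkeys(k for d in dictionary_list for k in d))

# Direct lookup per key; dictionaries lacking the key are skipped
new_dict = {key: [d[key] for d in dictionary_list if key in d] for key in keys}
print(new_dict)
```

Looking values up with `d[key]` also avoids scanning every item of every dictionary for each key.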
Answer 4: (score: 0)
If all the dictionaries have the same keys, you can use the keys of the first dictionary:
result = {k:[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()}
If the dictionaries can have different sets of keys, but you are fine with lists of different lengths, I would use a defaultdict to simplify things:
from collections import defaultdict
result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)
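To see what "lists of different lengths" means in practice, here is a small sketch with made-up records in which one dictionary lacks a key.

```python
from collections import defaultdict

# Hypothetical records: the second one has no 'Total Forest' entry
dictionary_list = [
    {'Total Population:': 4585, 'Total Forest': 34.4},
    {'Total Population:': 4751},
    {'Total Population:': 3214, 'Total Forest': 20.4},
]

result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)

print(dict(result))
# 'Total Population:' ends up with 3 values and 'Total Forest' with 2,
# so index i no longer refers to the same record in every list.
```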
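If the dictionaries may have different sets of keys and you want all the lists to be the same length, you need to iterate twice. You also need some kind of placeholder value to use when a key is missing. If we use None, we can do this: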
placeholder = None
keys = set()
for d in dictionary_list:
    keys |= set(d.keys())  # collect every key seen in any dictionary
result = {k: [] for k in keys}
for d in dictionary_list:
    for k in keys:
        result[k].append(d.get(k, placeholder))
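A small worked example of the placeholder approach, using made-up records in which the second dictionary has no 'Total Forest' entry. This is a sketch under that assumption; `keys.update(d)` is used to gather the keys, since iterating a dictionary yields its keys.

```python
# Hypothetical records: the second one has no 'Total Forest' entry
dictionary_list = [
    {'Total Population:': 4585, 'Total Forest': 34.4},
    {'Total Population:': 4751},
    {'Total Population:': 3214, 'Total Forest': 20.4},
]

placeholder = None

# First pass: gather the union of all keys
keys = set()
for d in dictionary_list:
    keys.update(d)

# Second pass: append a value (or the placeholder) for every key
result = {k: [] for k in keys}
for d in dictionary_list:
    for k in keys:
        result[k].append(d.get(k, placeholder))

print(result['Total Forest'])  # [34.4, None, 20.4]
```

Every list now has one entry per record, so index i refers to the same record in each list.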
In each case, result is a dict of lists. If you want a list of lists instead, it is actually even simpler:
result = [[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()]
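If you want all the lists to be the same length and to contain placeholders, you still need to use a dict of lists as an intermediate step. But converting from a dict of lists to a list of lists of values is easy: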
list_of_lists_of_values = list(dict_of_lists_of_values.values())
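That said, before Python 3.7 dictionaries had no well-defined iteration order, so you are probably better off sticking with a dictionary anyway, since otherwise it is hard to be sure you are getting the right values (for example, there is no guarantee that "Total Population:" is the first list of values).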
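If a list of lists really is needed, one way to sidestep the ordering concern is to spell out the column order explicitly, so each position is known by construction. A minimal sketch; `columns` is an illustrative name and the data is a shortened version of the question's.

```python
dictionary_list = [
    {'Total Population:': 4585, 'Total Forest': 34.40642126612868},
    {'Total Population:': 4751, 'Total Forest': 19.11341393206678},
]

# Explicit column order: index 0 is always population, index 1 always forest
columns = ['Total Population:', 'Total Forest']
list_of_lists = [[d[k] for d in dictionary_list] for k in columns]

populations, forests = list_of_lists
print(populations)  # [4585, 4751]
```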