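How to convert a nested dictionary into a 2D table
data[0] is a collection of table rows.
data[0][0] is a single table row.
The 'year' key holds the column name.
The 'values' key holds the values in that column.
I want to restore data[0] to tabular form in a Pandas DataFrame.
I found that json_normalize might help, but I don't know how to use it.
Any suggestions?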
(Pdb++) data[0]
[{u'values': [{u'Actual': u'(0.2)'}, {u'Upper End of Range': u'-'}, {u'Upper End of Central Tendency': u'-'}, {u'Lower End of Central Tendency': u'-'}, {u'Lower End of Range': u'-'}], u'year': u'2009'}, {u'values': [{u'Actual': u'2.8'}, {u'Upper End of Range': u'-'}, {u'Upper End of Central Tendency': u'-'}, {u'Lower End of Central Tendency': u'-'}, {u'Lower End of Range': u'-'}], u'year': u'2010'}, {u'values': [{u'Actual': u'2.0'}, {u'Upper End of Range': u'-'}, {u'Upper End of Central Tendency': u'-'}, {u'Lower End of Central Tendency': u'-'}, {u'Lower End of Range': u'-'}], u'year': u'2011'}, {u'values': [{u'Actual': u'2.0'}, {u'Upper End of Range': u'-'}, {u'Upper End of Central Tendency': u'-'}, {u'Lower End of Central Tendency': u'-'}, {u'Lower End of Range': u'-'}], u'year': u'2012'}, {u'values': [{u'Actual': u'2.5'}, {u'Upper End of Range': u'-'}, {u'Upper End of Central Tendency': u'-'}, {u'Lower End of Central Tendency': u'-'}, {u'Lower End of Range': u'-'}], u'year': u'2013'}, {u'values': [{u'Actual': u'-'}, {u'Upper End of Range': u'3.0'}, {u'Upper End of Central Tendency': u'3.0'}, {u'Lower End of Central Tendency': u'2.8'}, {u'Lower End of Range': u'2.1'}], u'year': u'2014'}, {u'values': [{u'Actual': u'-'}, {u'Upper End of Range': u'3.5'}, {u'Upper End of Central Tendency': u'3.2'}, {u'Lower End of Central Tendency': u'3.0'}, {u'Lower End of Range': u'2.2'}], u'year': u'2015'}, {u'values': [{u'Actual': u'-'}, {u'Upper End of Range': u'3.4'}, {u'Upper End of Central Tendency': u'3.0'}, {u'Lower End of Central Tendency': u'2.5'}, {u'Lower End of Range': u'2.2'}], u'year': u'2016'}, {u'values': [{u'Actual': u'-'}, {u'Upper End of Range': u'2.4'}, {u'Upper End of Central Tendency': u'2.3'}, {u'Lower End of Central Tendency': u'2.2'}, {u'Lower End of Range': u'1.8'}], u'year': u'Longer Run'}]
(Pdb++) data[0][0]
{u'values': [{u'Actual': u'(0.2)'}, {u'Upper End of Range': u'-'}, {u'Upper End of Central Tendency': u'-'}, {u'Lower End of Central Tendency': u'-'}, {u'Lower End of Range': u'-'}], u'year': u'2009'}
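Would changing the JSON schema perhaps be a better solution?
If so, what kind of JSON schema design is better suited to this type of tabular data? Thanks.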
Answer 0 (score: 1)
import pandas
# set up data structures
columns = [
"year",
"actual",
"upper",
"upper_central",
"lower_central",
"lower"
]
value_getter = {
"year" : lambda item: item['year'],
"actual" : lambda item: item['values'][0]['Actual'],
"upper" : lambda item: item['values'][1]['Upper End of Range'],
"upper_central": lambda item: item['values'][2]['Upper End of Central Tendency'],
"lower_central": lambda item: item['values'][3]['Lower End of Central Tendency'],
"lower" : lambda item: item['values'][4]['Lower End of Range']
}
mydata = {
"year" : [],
"actual" : [],
"upper" : [],
"upper_central": [],
"lower_central": [],
"lower" : []
}
# repackage data
for item in data[0]:
    for column in columns:
        mydata[column].append(value_getter[column](item))
# and stuff it into pandas
df = pandas.DataFrame(mydata, columns=columns)
Then df.T gives:
0 1 2 3 4 5 6 7 8
year 2009 2010 2011 2012 2013 2014 2015 2016 Longer Run
actual (0.2) 2.8 2.0 2.0 2.5 - - - -
upper - - - - - 3.0 3.5 3.4 2.4
upper_central - - - - - 3.0 3.2 3.0 2.3
lower_central - - - - - 2.8 3.0 2.5 2.2
lower - - - - - 2.1 2.2 2.2 1.8
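Since the question mentions json_normalize, here is a minimal sketch of how it might be applied to the same structure. It is not part of the answer above. The two-item `data` list is a shortened stand-in for the real data[0], and the name `table` is made up for illustration. The sketch assumes pandas 1.0 or later, where `pd.json_normalize` is available at the top level.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumption: same shape as data[0])
data = [[
    {"values": [{"Actual": "(0.2)"}, {"Upper End of Range": "-"}], "year": "2009"},
    {"values": [{"Actual": "-"}, {"Upper End of Range": "3.0"}], "year": "2014"},
]]

# One row per single-key dict under 'values', with the parent's 'year' attached.
# Each row has exactly one non-null value column at this point.
long_form = pd.json_normalize(data[0], record_path="values", meta=["year"])

# Collapse the rows of each year: first() keeps the first non-null entry per column.
# sort=False preserves the order in which the years appear in the input.
table = long_form.groupby("year", sort=False).first()

print(table)
```

On the schema question: if the producer of the JSON can be changed, one flat object per year (for example a 'year' key plus one key per series) would probably be easier to work with, because a list of flat dicts can be passed to pd.DataFrame directly.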
Answer 1 (score: 0)
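For efficiency you should pre-initialize the DataFrame, but if your dataset is small and you don't know every string that can appear as a key in the innermost dictionaries, you don't need to.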
import pandas as pd

# start from an empty DataFrame instance; .loc enlarges it as new labels appear
df = pd.DataFrame()
for dict1 in data[0]:
    for dict2 in dict1['values']:
        for key, val in dict2.items():
            df.loc[key, dict1['year']] = val
df
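The pre-initialized variant mentioned at the start of this answer could look roughly like the sketch below. It assumes the row labels are known in advance. The `row_labels` list and the shortened `data` are illustrative stand-ins, not taken from the answer.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumption: same shape as data[0])
data = [[
    {"values": [{"Actual": "(0.2)"}, {"Upper End of Range": "-"}], "year": "2009"},
    {"values": [{"Actual": "-"}, {"Upper End of Range": "3.0"}], "year": "2014"},
]]

# Assumption: every key of the innermost dicts is known up front
row_labels = ["Actual", "Upper End of Range"]
years = [item["year"] for item in data[0]]

# Allocate the full frame once, so the loop only fills cells and never grows it
df = pd.DataFrame(index=row_labels, columns=years, dtype=object)
for item in data[0]:
    for single in item["values"]:
        for key, val in single.items():
            df.at[key, item["year"]] = val

print(df)
```

Allocating once avoids growing the frame label by label. For a table this small the difference is unlikely to matter.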