I have a dictionary like this:
{'136454': [{'city': 'Kabul', 'country': 'AF'}],
 '137824': [{'city': 'Kabul', 'country': 'AF'}],
 '134134': [{'city': 'Kabul', 'country': 'AF'}],
 '138322': [{'city': 'Fujairah', 'country': 'AE'},
            {'city': 'Kabul', 'country': 'AF'}],
 '137246': [{'city': 'Fujairah', 'country': 'AE'},
            {'city': 'Kabul', 'country': 'AF'},
            {'city': 'New Delhi', 'country': 'IN'}],
 '133141': [{'city': 'Kabul', 'country': 'AF'}]}
What I want is a DataFrame that looks like this:
'136454' | 'Kabul'     | 'AF'
'137824' | 'Kabul'     | 'AF'
'134134' | 'Kabul'     | 'AF'
'138322' | 'Fujairah'  | 'AE'
'138322' | 'Kabul'     | 'AF'
'137246' | 'Fujairah'  | 'AE'
'137246' | 'Kabul'     | 'AF'
'137246' | 'New Delhi' | 'IN'
'133141' | 'Kabul'     | 'AF'
At the moment I only get the first value for each key. I'm not very good with pandas, so I'm a bit confused.
Answer 0 (score: 5)
Let's use explode. Note that this method is available in pandas 0.25 and later.
import pandas as pd
df = pd.Series(d).explode().apply(pd.Series)
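For reference, here is a minimal end-to-end sketch of the explode approach, assuming pandas 0.25 or later and a shortened copy of the question's dictionary. The column name key is my own choice, and building the frame from s.tolist() is a swapped-in alternative to .apply(pd.Series) that should give the same city and country columns.

```python
import pandas as pd

# Shortened copy of the dictionary from the question.
d = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# One row per inner dict; the outer key is repeated in the index.
s = pd.Series(d).explode()

# Expand each dict into columns, then move the index into a column.
# 'key' is an illustrative name, not something the question specifies.
df = pd.DataFrame(s.tolist(), index=s.index).rename_axis('key').reset_index()
print(df)
```

The reset_index() step is what turns the outer key into an ordinary column, which matches the three-column layout asked for in the question.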
Answer 1 (score: 4)
Iterate over the dictionary, add the outer key to each inner dictionary, and finally create your DataFrame:
import pandas as pd

d = []
for k, v in data.items():
    for ent in v:
        # this is where you add the outer key to the inner dictionary
        # (note: update() modifies the dictionaries inside data in place)
        ent.update({"key": k})
        d.append(ent)
# get your DataFrame
pd.DataFrame(d)
        city country     key
0      Kabul      AF  136454
1      Kabul      AF  137824
2      Kabul      AF  134134
3   Fujairah      AE  138322
4      Kabul      AF  138322
5   Fujairah      AE  137246
6      Kabul      AF  137246
7  New Delhi      IN  137246
8      Kabul      AF  133141
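One thing to be aware of: ent.update(...) modifies the inner dictionaries of data in place. If that matters to you, here is a sketch of a variant of the same loop that builds a new dict for each row instead. It uses a shortened copy of the sample data, and the name rows is only illustrative.

```python
import pandas as pd

# Shortened copy of the dictionary from the question.
data = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

rows = []
for k, v in data.items():
    for ent in v:
        # Build a fresh dict so the dictionaries inside data stay untouched.
        rows.append({'key': k, **ent})

df = pd.DataFrame(rows)
print(df)
```

Because 'key' is listed first in each new dict, it also ends up as the first column.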
Answer 2 (score: 1)
Another possible solution: you can "flatten" your dict
import pandas as pd

data = {'136454': [{'city': 'Kabul', 'country': 'AF'}],
        '137824': [{'city': 'Kabul', 'country': 'AF'}],
        '134134': [{'city': 'Kabul', 'country': 'AF'}],
        '138322': [{'city': 'Fujairah', 'country': 'AE'},
                   {'city': 'Kabul', 'country': 'AF'}],
        '137246': [{'city': 'Fujairah', 'country': 'AE'},
                   {'city': 'Kabul', 'country': 'AF'},
                   {'city': 'New Delhi', 'country': 'IN'}],
        '133141': [{'city': 'Kabul', 'country': 'AF'}]}

new_data = []
for key, value in data.items():
    for arr_value in value:
        # store the outer key on each inner dict (modifies it in place)
        arr_value['id'] = key
        new_data.append(arr_value)
print(new_data)

# a list of dicts can be passed straight to the DataFrame constructor
df = pd.DataFrame(new_data)
print(df.head())
Answer 3 (score: 0)
You can use a list comprehension and then pass the result to pd.DataFrame:
import pandas as pd
d = {'136454': [{'city': 'Kabul', 'country': 'AF'}], '137824': [{'city': 'Kabul', 'country': 'AF'}], '134134': [{'city': 'Kabul', 'country': 'AF'}], '138322': [{'city': 'Fujairah', 'country': 'AE'}, {'city': 'Kabul', 'country': 'AF'}], '137246': [{'city': 'Fujairah', 'country': 'AE'}, {'city': 'Kabul', 'country': 'AF'}, {'city': 'New Delhi', 'country': 'IN'}], '133141': [{'city': 'Kabul', 'country': 'AF'}]}
data = [[a, i['city'], i['country']] for a, b in d.items() for i in b]
pd.DataFrame(data)
Output:
        0          1   2
0  136454      Kabul  AF
1  137824      Kabul  AF
2  134134      Kabul  AF
3  138322   Fujairah  AE
4  138322      Kabul  AF
5  137246   Fujairah  AE
6  137246      Kabul  AF
7  137246  New Delhi  IN
8  133141      Kabul  AF