Turning a nested dictionary into a pandas DataFrame

Time: 2020-05-20 14:28:20

Tags: python pandas dictionary

I have a dictionary like this:

{'136454': [{'city': 'Kabul', 'country': 'AF'}],
 '137824': [{'city': 'Kabul', 'country': 'AF'}],
 '134134': [{'city': 'Kabul', 'country': 'AF'}],
 '138322': [{'city': 'Fujairah', 'country': 'AE'},
  {'city': 'Kabul', 'country': 'AF'}],
 '137246': [{'city': 'Fujairah', 'country': 'AE'},
  {'city': 'Kabul', 'country': 'AF'}, {'city': 'New Delhi', 'country': 'IN'}],
 '133141': [{'city': 'Kabul', 'country': 'AF'}]}

What I want is a dataframe that looks like this:

'136454' | 'Kabul'     | 'AF'
'137824' | 'Kabul'     | 'AF'
'134134' | 'Kabul'     | 'AF'
'138322' | 'Fujairah'  | 'AE'
'138322' | 'Kabul'     | 'AF'
'137246' | 'Fujairah'  | 'AE'
'137246' | 'Kabul'     | 'AF'
'137246' | 'New Delhi' | 'IN'
'133141' | 'Kabul'     | 'AF'

At the moment I only get the first value for each key. I'm not very good with pandas, so I'm a bit confused.

4 answers:

Answer 0 (score: 5)

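Let's use explode. Note that this function is only available from pandas 0.25 onward. Here d is the dictionary from the question: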

df = pd.Series(d).explode().apply(pd.Series)
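
The one-liner above leaves the dictionary keys in the index. Below is a minimal sketch of one way to get them into a regular column, assuming pandas 0.25 or later. It uses a shortened copy of the question's data, and it swaps apply(pd.Series) for the plain DataFrame constructor, which is a substitution and not part of the original answer. The column name 'key' is an arbitrary choice for illustration.

```python
import pandas as pd

# Shortened sample of the dictionary from the question.
d = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# One row per inner dict; the outer key is repeated in the index.
exploded = pd.Series(d).explode()

# Expand the inner dicts into columns, then turn the index into a column.
# 'key' is an assumed column name, not something the question specifies.
df = (
    pd.DataFrame(exploded.tolist(), index=exploded.index)
    .rename_axis('key')
    .reset_index()
)
print(df)
```

Building the frame from exploded.tolist() avoids calling pd.Series once per row, which tends to be slow on large inputs.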

Answer 1 (score: 4)

Iterate over the dictionary, add the main key to each inner dictionary, and finally create your dataframe:

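import pandas as pd

# `data` is the dictionary from the question
d = []
for k, v in data.items():
    for ent in v:
        # add the main key to the inner dictionary (this modifies `data` in place)
        ent.update({"key": k})
        d.append(ent)

# build the dataframe from the list of flat dictionaries
pd.DataFrame(d)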

    city      country   key
0   Kabul       AF     136454
1   Kabul       AF     137824
2   Kabul       AF     134134
3   Fujairah    AE     138322
4   Kabul       AF     138322
5   Fujairah    AE     137246
6   Kabul       AF     137246
7   New Delhi   IN     137246
8   Kabul       AF     133141
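
Because update() changes the inner dictionaries, the original data ends up with an extra "key" entry in every record. If that side effect is unwanted, a variant that builds new dictionaries could look like the sketch below. It uses a shortened copy of the question's data, and the names rows and df are chosen here for illustration.

```python
import pandas as pd

# Shortened sample of the dictionary from the question.
data = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# Create a new dict per row instead of updating the inner dicts,
# so `data` itself is left untouched.
rows = [{"key": k, **ent} for k, v in data.items() for ent in v]

df = pd.DataFrame(rows)
print(df)
```

Putting "key" first in each new dictionary also makes it the first column of the resulting frame.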

Answer 2 (score: 1)

Another possible solution: you can "flatten" your dict.

import pandas as pd

data = {'136454': [{'city': 'Kabul', 'country': 'AF'}],
        '137824': [{'city': 'Kabul', 'country': 'AF'}],
        '134134': [{'city': 'Kabul', 'country': 'AF'}],
        '138322': [{'city': 'Fujairah', 'country': 'AE'},
                   {'city': 'Kabul', 'country': 'AF'}],
        '137246': [{'city': 'Fujairah', 'country': 'AE'},
                   {'city': 'Kabul', 'country': 'AF'},
                   {'city': 'New Delhi', 'country': 'IN'}],
        '133141': [{'city': 'Kabul', 'country': 'AF'}]}


new_data = []
for key, value in data.items():
    for arr_value in value:
        arr_value['id'] = key
        new_data.append(arr_value)

print(new_data)

df = pd.DataFrame(new_data)

print(df.head())

Answer 3 (score: 0)

You can use a list comprehension and then pass the result to pd.DataFrame:

import pandas as pd
d = {'136454': [{'city': 'Kabul', 'country': 'AF'}], '137824': [{'city': 'Kabul', 'country': 'AF'}], '134134': [{'city': 'Kabul', 'country': 'AF'}], '138322': [{'city': 'Fujairah', 'country': 'AE'}, {'city': 'Kabul', 'country': 'AF'}], '137246': [{'city': 'Fujairah', 'country': 'AE'}, {'city': 'Kabul', 'country': 'AF'}, {'city': 'New Delhi', 'country': 'IN'}], '133141': [{'city': 'Kabul', 'country': 'AF'}]}
data = [[a, i['city'], i['country']] for a, b in d.items() for i in b]

>>> pd.DataFrame(data)

Output:

       0          1   2
0  136454      Kabul  AF
1  137824      Kabul  AF
2  134134      Kabul  AF
3  138322   Fujairah  AE
4  138322      Kabul  AF
5  137246   Fujairah  AE
6  137246      Kabul  AF
7  137246  New Delhi  IN
8  133141      Kabul  AF
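
The columns in this output are labelled 0, 1 and 2 because the rows are plain lists. As a small sketch, readable labels can be supplied through the columns argument of pd.DataFrame. The names 'key', 'city' and 'country' are chosen here for illustration, and the data is a shortened copy of the question's dictionary.

```python
import pandas as pd

# Shortened sample of the dictionary from the question.
d = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# Same list comprehension as in the answer: one [key, city, country] row per inner dict.
data = [[a, i['city'], i['country']] for a, b in d.items() for i in b]

# Name the columns explicitly instead of relying on the default 0, 1, 2.
df = pd.DataFrame(data, columns=['key', 'city', 'country'])
print(df)
```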