How do I convert a dataframe column of dictionaries into separate columns?

Time: 2017-03-01 15:04:28

Tags: python pandas dataframe

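I have a pandas dataframe whose first column holds values in dictionary format, although its dtype is object. I want to convert this field into 3 separate fields appended to the original dataframe, one for each of the 3 keys in the dictionary. The dataframe has the columns card, player and turn, and looks like this: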

                                                 card    player  turn
0   {'name': 'Tap', 'cost': 2, 'id': '056'}        me     2
1   {'name': 'Coin', 'cost': None, 'id': '051'}  opponent     2
2   {'name': 'Pawnbroker', 'cost': 3,'id': '055'}     2
3   {'name': 'fire', 'cost': 2, 'id': 'E1_596'}        me     3
4   {'name': 'Coil', 'cost': 1, 'id': 'E1_56'}        me     3
5   {'name': 'Pawnbroker', 'cost': 3, 'id': 'E6'}     3

2 answers:

Answer 0: (score: 3)

Assuming your dictionary column is called 'foo':

df = pd.concat([df, df['foo'].apply(pd.Series)], axis=1)
#   card                                          foo  player turn  cost    id  name
#0        me  {'cost': 2, 'id': '056', 'name': 'Tap'}       2        2.0   056   Tap 
#1  opponent  {'cost': None, 'id': '051', 'name': 'Coin'}   2        NaN   051  Coin

You can now delete the column you no longer need:

del df['foo']; print(df)
#       card  player turn  cost   id  name
#0        me       2        2.0  056   Tap
#1  opponent       2        NaN  051  Coin
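
For reference, the same approach can be sketched end to end against the question's own column name (`card` rather than `foo`). This is a minimal sketch, assuming a recent pandas version; the two rows are copied from the question, and the names `expanded` and `result` are made up for illustration.

```python
import pandas as pd

# Two sample rows taken from the question's table.
df = pd.DataFrame({
    "card": [
        {"name": "Tap", "cost": 2, "id": "056"},
        {"name": "Coin", "cost": None, "id": "051"},
    ],
    "player": ["me", "opponent"],
    "turn": [2, 2],
})

# Turn each dictionary into a row whose columns are the dictionary keys.
expanded = df["card"].apply(pd.Series)

# Append the new columns and leave out the original dictionary column.
result = pd.concat([df.drop(columns="card"), expanded], axis=1)
print(result)
```

Using `df.drop(columns="card")` inside the `concat` call returns a new frame instead of modifying `df` in place, which may be preferable to `del` when the original dataframe is still needed.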

Answer 1: (score: 1)

You can use pop to remove the card column, build a new frame from it with the DataFrame constructor, and then join the two with concat:

print (pd.concat([df, pd.DataFrame(df.pop('card').values.tolist())],axis=1))
     player  turn  cost      id        name
0        me   2.0   2.0     056         Tap
1  opponent   2.0   NaN     051        Coin
2         2   NaN   3.0     055  Pawnbroker
3        me   3.0   2.0  E1_596        fire
4        me   3.0   1.0   E1_56        Coil
5         3   NaN   3.0      E6  Pawnbroker
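
One detail worth keeping in mind with this approach: `concat` with `axis=1` aligns rows by index label, while the DataFrame constructor gives the new frame a fresh 0..n-1 index by default. The following is a minimal sketch, assuming a dataframe whose index is not the default one (the labels 10, 11, 12 are invented for illustration), that passes `index=df.index` so the rows still line up.

```python
import pandas as pd

# Three rows from the question, with a deliberately non-default index.
df = pd.DataFrame(
    {
        "card": [
            {"name": "Tap", "cost": 2, "id": "056"},
            {"name": "Coin", "cost": None, "id": "051"},
            {"name": "fire", "cost": 2, "id": "E1_596"},
        ],
        "player": ["me", "opponent", "me"],
        "turn": [2, 2, 3],
    },
    index=[10, 11, 12],
)

# pop() removes the column from df and returns it as a Series.
# Reusing df.index keeps the expanded rows aligned with the originals.
cards = pd.DataFrame(df.pop("card").tolist(), index=df.index)

result = pd.concat([df, cards], axis=1)
print(result)
```

With the default integer index used in the question, the `index=df.index` argument makes no difference, so the one-liner above behaves the same way there.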

Timings:

#[6000 rows x 3 columns]
df = pd.concat([df]*1000).reset_index(drop=True)

In [391]: %timeit (df['card'].apply(pd.Series))
1 loop, best of 3: 1.26 s per loop

In [392]: %timeit (pd.DataFrame(df['card'].values.tolist()))
100 loops, best of 3: 6.72 ms per loop
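
The `%timeit` lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard-library `timeit` module. This is a rough sketch only: it builds just the `card` column (the frame timed above also had `player` and `turn`), runs each statement once, and the absolute numbers will depend on the machine and pandas version.

```python
import timeit

import pandas as pd

# The six cards from the question, repeated 1000 times to get 6000 rows.
base = pd.DataFrame({
    "card": [
        {"name": "Tap", "cost": 2, "id": "056"},
        {"name": "Coin", "cost": None, "id": "051"},
        {"name": "Pawnbroker", "cost": 3, "id": "055"},
        {"name": "fire", "cost": 2, "id": "E1_596"},
        {"name": "Coil", "cost": 1, "id": "E1_56"},
        {"name": "Pawnbroker", "cost": 3, "id": "E6"},
    ]
})
big = pd.concat([base] * 1000).reset_index(drop=True)

# Time a single run of each approach.
t_apply = timeit.timeit(lambda: big["card"].apply(pd.Series), number=1)
t_ctor = timeit.timeit(lambda: pd.DataFrame(big["card"].tolist()), number=1)

print(f"apply(pd.Series):  {t_apply:.4f} s")
print(f"DataFrame(tolist): {t_ctor:.4f} s")
```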