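I have a pandas DataFrame whose first column holds dictionaries, although its dtype is reported as object. I want to expand this column into 3 separate columns appended to the original DataFrame, one for each of the 3 keys in the dictionary. The DataFrame has the columns card, player and turn, and looks like this: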
   card                                            player    turn
0  {'name': 'Tap', 'cost': 2, 'id': '056'}         me        2
1  {'name': 'Coin', 'cost': None, 'id': '051'}     opponent  2
2  {'name': 'Pawnbroker', 'cost': 3, 'id': '055'}            2
3  {'name': 'fire', 'cost': 2, 'id': 'E1_596'}     me        3
4  {'name': 'Coil', 'cost': 1, 'id': 'E1_56'}      me        3
5  {'name': 'Pawnbroker', 'cost': 3, 'id': 'E6'}             3
Answer 0 (score: 3)
Assuming your dictionary column is called 'foo':
df = pd.concat([df, df['foo'].apply(pd.Series)], axis=1)
# card foo player turn cost id name
#0 me {'cost': 2, 'id': '056', 'name': 'Tap'} 2 2.0 056 Tap
#1 opponent {'cost': None, 'id': '051', 'name': 'Coin'} 2 NaN 051 Coin
You can now delete the column you no longer need:
del df['foo']; print(df)
# card player turn cost id name
#0 me 2 2.0 056 Tap
#1 opponent 2 NaN 051 Coin
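For reference, here is a minimal self-contained sketch of the same approach, rebuilt from the first two rows of the question's data. It uses the question's real column name card in place of the placeholder 'foo', and drop(columns=...) in place of del; both substitutions are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# First two rows of the question's data, rebuilt by hand.
df = pd.DataFrame({
    "card": [
        {"name": "Tap", "cost": 2, "id": "056"},
        {"name": "Coin", "cost": None, "id": "051"},
    ],
    "player": ["me", "opponent"],
    "turn": [2, 2],
})

# Turn each dictionary into a Series, giving one new column per key.
expanded = df["card"].apply(pd.Series)

# Append the new columns and leave out the original dictionary column.
result = pd.concat([df.drop(columns="card"), expanded], axis=1)
print(result)
```

Because concat aligns on the index, the expanded columns line up with their source rows even when the index is not the default 0..n-1.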
Answer 1 (score: 1)
You can use pop to remove the card column, pass its values to the DataFrame constructor, and then attach the result with concat:
print (pd.concat([df, pd.DataFrame(df.pop('card').values.tolist())],axis=1))
player turn cost id name
0 me 2.0 2.0 056 Tap
1 opponent 2.0 NaN 051 Coin
2 2 NaN 3.0 055 Pawnbroker
3 me 3.0 2.0 E1_596 fire
4 me 3.0 1.0 E1_56 Coil
5 3 NaN 3.0 E6 Pawnbroker
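The one-liner above can be unpacked into separate steps. The sketch below is a self-contained version on three hand-built rows; the variable names cards and expanded are hypothetical, and passing index=df.index to the constructor is an added precaution (not in the original answer) so that concat still lines rows up when the DataFrame does not have a default 0..n-1 index.

```python
import pandas as pd

# Three rows modelled on the question's data.
df = pd.DataFrame({
    "card": [
        {"name": "Tap", "cost": 2, "id": "056"},
        {"name": "Coin", "cost": None, "id": "051"},
        {"name": "fire", "cost": 2, "id": "E1_596"},
    ],
    "player": ["me", "opponent", "me"],
    "turn": [2, 2, 3],
})

# pop removes "card" from df in place and returns it as a Series.
cards = df.pop("card")

# Build a DataFrame directly from the list of dictionaries,
# reusing df's index so the rows stay aligned in concat.
expanded = pd.DataFrame(cards.tolist(), index=df.index)

result = pd.concat([df, expanded], axis=1)
print(result)
```

Note that pop mutates df, so running the snippet twice on the same DataFrame raises a KeyError the second time.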
Timings:
#[6000 rows x 3 columns]
df = pd.concat([df]*1000).reset_index(drop=True)
In [391]: %timeit (df['card'].apply(pd.Series))
1 loop, best of 3: 1.26 s per loop
In [392]: %timeit (pd.DataFrame(df['card'].values.tolist()))
100 loops, best of 3: 6.72 ms per loop
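The %timeit lines above only run inside IPython. A rough equivalent for a plain Python script, using the standard timeit module, might look like the sketch below. The 1000-row size and the repetition count of 3 are arbitrary assumptions, and the absolute numbers will vary with machine and pandas version.

```python
import timeit

import pandas as pd

# Two sample rows, repeated to get a 1000-row frame.
base = pd.DataFrame({
    "card": [
        {"name": "Tap", "cost": 2, "id": "056"},
        {"name": "Coin", "cost": None, "id": "051"},
    ],
})
df = pd.concat([base] * 500, ignore_index=True)

# Total time for 3 runs of each approach, in seconds.
t_apply = timeit.timeit(lambda: df["card"].apply(pd.Series), number=3)
t_ctor = timeit.timeit(lambda: pd.DataFrame(df["card"].tolist()), number=3)

print(f"apply(pd.Series):      {t_apply:.4f} s")
print(f"DataFrame(tolist()):   {t_ctor:.4f} s")
```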