I have a DataFrame in the following format:
import pandas as pd
ip_df = pd.DataFrame({'class': ['I', 'II', 'III'], 'details': [{'sec': 'A', 'kinder': 'yes'}, {'sec': 'B'}, None]})
ip_df:
class details
0 I {'sec':'A','kinder':'yes'}
1 II {'sec':'B'}
2 III None
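How can I turn the dictionary keys in the 'details' column into column names, with the dictionary values mapped to the corresponding columns?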
op_df:
class details sec kinder
0 I {'sec':'A','kinder':'yes'} A yes
1 II {'sec':'B'} B None
2 III None None None
Answer 0 (score: 1)
If performance is not important, convert each row to a Series:
ip_df = ip_df.join(ip_df['details'].apply(pd.Series))
print (ip_df)
class details sec kinder
0 I {'sec': 'A', 'kinder': 'yes'} A yes
1 II {'sec': 'B'} B NaN
2 III None NaN NaN
Another solution is to drop the missing values (None) and create the DataFrame with the constructor:
s = ip_df['details'].dropna()
ip_df = ip_df.join(pd.DataFrame(s.tolist(), index=s.index))
print (ip_df)
class details sec kinder
0 I {'sec': 'A', 'kinder': 'yes'} A yes
1 II {'sec': 'B'} B NaN
2 III None NaN NaN
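For reference, here is the constructor approach as a self-contained sketch that can be run from scratch. The names expanded and op_df are only illustrative, and the input frame assumes the third 'details' value is None, as in the printed ip_df from the question.

```python
import pandas as pd

ip_df = pd.DataFrame({
    'class': ['I', 'II', 'III'],
    'details': [{'sec': 'A', 'kinder': 'yes'}, {'sec': 'B'}, None],
})

# Keep only the rows whose 'details' value is an actual dict
s = ip_df['details'].dropna()

# Build the new columns from the dicts, reusing the original row labels
expanded = pd.DataFrame(s.tolist(), index=s.index)

# join aligns on the index, so the dropped row gets missing values
# in the new columns
op_df = ip_df.join(expanded)
print(op_df)
```

Passing index=s.index matters here: without it the new frame would be numbered from 0 and could line up with the wrong rows whenever a row with missing 'details' is not the last one.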
Finally, if necessary, convert the missing values to None:
ip_df = ip_df.mask(ip_df.isna(), None)
print (ip_df)
class details sec kinder
0 I {'sec': 'A', 'kinder': 'yes'} A yes
1 II {'sec': 'B'} B None
2 III None None None
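If the NaN-to-None conversion needs to be checked in isolation, a variant that is sometimes used is to cast to object first and then call where. This is an alternative to the mask call above, not the answer's own method; the small frame df below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame holding the expanded columns with NaN for missing values
df = pd.DataFrame({
    'sec': ['A', 'B', np.nan],
    'kinder': ['yes', np.nan, np.nan],
})

# Cast to object so the columns can hold None, then keep the non-missing
# cells and replace the rest with None
out = df.astype(object).where(df.notna(), None)
print(out)
```

The cast to object is the important part: a float column cannot store None, so without it the replacement may come back as NaN.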