This is my DataFrame. Some of its cells contain dictionaries. I want to convert the dictionary items into columns.
import numpy as np
import pandas as pd

dfx = {'name': ['Alex', 'Jin', np.nan, 'Peter'],
       'age': [np.nan, 10, 12, 13],
       'other': [{'school': 'abc', 'subject': 'xyz'},
                 np.nan,
                 {'school': 'abc', 'subject': 'xyz'},
                 np.nan]}
dfx = pd.DataFrame(dfx)
Output
    name   age                                other
0   Alex   NaN  {'school': 'abc', 'subject': 'xyz'}
1    Jin  10.0                                  NaN
2    NaN  12.0  {'school': 'abc', 'subject': 'xyz'}
3  Peter  13.0                                  NaN
This is the desired output
    name   age school subject
0   Alex   NaN    abc     xyz
1    Jin  10.0    NaN     NaN
2    NaN  12.0    abc     xyz
3  Peter  13.0    NaN     NaN
Answer 0 (score: 2)
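You can use the .str.get accessor to index into the dictionaries in the column. Whenever a cell value is NaN rather than a dictionary, it returns NaN as well: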
clean_df = (dfx
    .assign(
        school=lambda df: df["other"].str.get("school"),
        subject=lambda df: df["other"].str.get("subject"))
    .drop("other", axis=1))
print(clean_df)
    name   age school subject
0   Alex   NaN    abc     xyz
1    Jin  10.0    NaN     NaN
2    NaN  12.0    abc     xyz
3  Peter  13.0    NaN     NaN
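To see what the accessor does on its own, here is a minimal sketch on a standalone Series. The names `other` and `school` are only illustrative. It assumes the column holds either dicts or NaN, as in the question.

```python
import numpy as np
import pandas as pd

# A column shaped like dfx["other"]: dicts mixed with missing values.
other = pd.Series([
    {'school': 'abc', 'subject': 'xyz'},
    np.nan,
    {'school': 'abc', 'subject': 'xyz'},
])

# .str.get looks up the key in each dict; missing cells stay missing.
school = other.str.get('school')
print(school.tolist())
```

Because the lookup is done per cell, no separate NaN check or fillna step should be needed before calling it.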
Answer 1 (score: 2)
Try this:
df_final = dfx[['name','age']].assign(**pd.DataFrame(dfx.other.to_dict()).T)
Out[41]:
    name   age school subject
0   Alex   NaN    abc     xyz
1    Jin  10.0    NaN     NaN
2    NaN  12.0    abc     xyz
3  Peter  13.0    NaN     NaN
Answer 2 (score: 1)
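Create a dictionary from the index of dfx and its other column, pass that dictionary to pd.DataFrame, and transpose the result. This gives you a new dataframe. Join the resulting dataframe to the first two columns of dfx.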
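The code that accompanied this answer did not survive, so the following is only a sketch of the steps as described, not the answerer's original snippet. The names `by_index`, `expanded` and `result` are assumptions, and `dict(zip(...))` is just one possible way to build the dictionary.

```python
import numpy as np
import pandas as pd

dfx = pd.DataFrame({
    'name': ['Alex', 'Jin', np.nan, 'Peter'],
    'age': [np.nan, 10, 12, 13],
    'other': [{'school': 'abc', 'subject': 'xyz'}, np.nan,
              {'school': 'abc', 'subject': 'xyz'}, np.nan],
})

# Map each index label of dfx to its cell in "other" (a dict or NaN).
by_index = dict(zip(dfx.index, dfx['other']))

# Each original row becomes a column here, so transpose to get
# one row per original index label and one column per dict key.
expanded = pd.DataFrame(by_index).T

# Attach the new columns to the first two columns of dfx by index.
result = dfx[['name', 'age']].join(expanded)
print(result)
```

Rows whose other cell was NaN end up with NaN in both school and subject, since the join aligns on the index.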
Answer 3 (score: 0)
You can apply pd.Series to the column that holds the dictionaries:
dfx.drop(columns='other').join(dfx['other'].apply(pd.Series).drop(columns=0))
Output:
    name   age school subject
0   Alex   NaN    abc     xyz
1    Jin  10.0    NaN     NaN
2    NaN  12.0    abc     xyz
3  Peter  13.0    NaN     NaN