例如,我有以下数据集:
Name Number Is true
0 Dani 2 yes
1 Dani 2 no
2 Jack 5 no
3 Jack 5 maybe
4 Dani 2 maybe
我想创建一个新数据集,该数据集将合并相似的行并逐列添加不同的值。这是我想要获得的输出:
Name Number Is true1 Is true2 Is true3
0 Dani 2 yes no maybe
1 Jack 5 no maybe
在这里的示例10中,我无法正常工作: How to pivot a dataframe
请问您能为此用例提供一个具体示例吗?
谢谢。
编辑以回复:
Name yes no maybe
0 Dani 2 2 2
1 Jack NaN 5 5
答案 0 :(得分:0)
您可以尝试以下方法:
df2 = df.drop_duplicates(subset=['Name', 'Number Is'])
df2 = df2.reset_index(drop=True).assign(true= df.groupby('Number Is')['true'].agg(list).reset_index(drop=True) )
temp = df2['true'].apply(pd.Series).T
temp.index = temp.index+1
temp = temp.T
df2 = df2.assign(**temp.add_prefix('true').add_suffix(' Is')).drop(columns='true').fillna('')
输出:
Name Number Is true1 Is true2 Is true3 Is
0 Dani 2 yes no maybe
1 Jack 5 no maybe
答案 1 :(得分:0)
结合pivot_table(...)
和apply(...)
:
df.pivot_table(index=["Name", "Number"], values="Is true", aggfunc=list).apply(lambda x: pd.Series({f"Is true{id+1}": el for id, el in enumerate(x[0])}), axis=1).reset_index()
输出:
Name Number Is true1 Is true2 Is true3
0 Dani 2 yes no maybe
1 Jack 5 no maybe NaN
修改
为您跟进。这可能是您正在寻找的东西
df.pivot_table(index=["Name"], columns="Is true", values="Number", aggfunc=list).fillna('').apply(lambda x: pd.Series({f"{col}{id+1}": el for col in x.keys() for id, el in enumerate(x[col])}), axis=1).reset_index()
输出:
Name maybe1 no1 yes1
0 Dani 2.0 2.0 2.0
1 Jack 5.0 5.0 NaN