I have the following DataFrame:
import pandas as pd

df = pd.DataFrame({'device_id': ['0', '0', '1', '1', '2', '2'],
                   'p_food': [0.2, 0.1, 0.3, 0.5, 0.1, 0.7],
                   'p_phone': [0.8, 0.9, 0.7, 0.5, 0.9, 0.3]
                   })
print(df)
Output:
  device_id  p_food  p_phone
0         0     0.2      0.8
1         0     0.1      0.9
2         1     0.3      0.7
3         1     0.5      0.5
4         2     0.1      0.9
5         2     0.7      0.3
How can I transform it into the following?
df2 = pd.DataFrame({'device_id': ['0', '1', '2'],
                    'p_food_1': [0.2, 0.3, 0.1],
                    'p_food_2': [0.1, 0.5, 0.7],
                    'p_phone_1': [0.8, 0.7, 0.9],
                    'p_phone_2': [0.9, 0.5, 0.3]
                    })
print(df2)
Output:
  device_id  p_food_1  p_food_2  p_phone_1  p_phone_2
0         0       0.2       0.1        0.8        0.9
1         1       0.3       0.5        0.7        0.5
2         2       0.1       0.7        0.9        0.3
I tried groupby, apply, agg and so on, but I still could not get this result.
Update
My final code:
df.drop_duplicates('device_id', keep='first').merge(df.drop_duplicates('device_id', keep='last'),on='device_id')
I appreciate the time and effort of su79eu7k and A-Za-z.
Words are not enough to express my gratitude.
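For reference, here is a minimal sketch of the same first/last merge that also produces the column names shown in df2. It assumes every device_id has exactly two rows, as in the sample data; the names `first`, `last` and `wide` are only illustrative. The default merge suffixes are `_x` and `_y`, so the sketch passes `suffixes=('_1', '_2')` and then reorders the columns.

```python
import pandas as pd

df = pd.DataFrame({
    'device_id': ['0', '0', '1', '1', '2', '2'],
    'p_food': [0.2, 0.1, 0.3, 0.5, 0.1, 0.7],
    'p_phone': [0.8, 0.9, 0.7, 0.5, 0.9, 0.3],
})

# First and last row of each device (assumes exactly two rows per device).
first = df.drop_duplicates('device_id', keep='first')
last = df.drop_duplicates('device_id', keep='last')

# suffixes=('_1', '_2') replaces the default '_x' / '_y' suffixes.
wide = first.merge(last, on='device_id', suffixes=('_1', '_2'))

# Reorder so the columns match the layout of df2.
wide = wide[['device_id', 'p_food_1', 'p_food_2', 'p_phone_1', 'p_phone_2']]
print(wide)
```

If a device had more than two rows, only its first and last rows would survive this merge.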
Answer 0 (score: 5)
If you are still looking for an answer that uses groupby:
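# Select the columns with a list (double brackets); recent pandas versions reject tuple-style selection on a groupby.
df = df.groupby('device_id')[['p_food', 'p_phone']].apply(lambda x: pd.DataFrame(x.values)).unstack().reset_index()
df.columns = df.columns.droplevel()
# Assign the final column names; this assumes exactly two rows per device_id.
df.columns = ['device_id','p_food_1', 'p_food_2', 'p_phone_1','p_phone_2']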
You get:
  device_id  p_food_1  p_food_2  p_phone_1  p_phone_2
0         0       0.2       0.1        0.8        0.9
1         1       0.3       0.5        0.7        0.5
2         2       0.1       0.7        0.9        0.3
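As a possible alternative, not part of the answer above, the rows within each device can be numbered with `groupby(...).cumcount()` and then spread into columns with `pivot`. This sketch does not hard-code the column names, so it should also cope with devices that have more than two rows (missing positions would become NaN). The helper column `n` and the names `tmp` and `wide` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'device_id': ['0', '0', '1', '1', '2', '2'],
    'p_food': [0.2, 0.1, 0.3, 0.5, 0.1, 0.7],
    'p_phone': [0.8, 0.9, 0.7, 0.5, 0.9, 0.3],
})

# Number the rows within each device: 1, 2, ...
tmp = df.assign(n=df.groupby('device_id').cumcount() + 1)

# One row per device, one column per (value column, row number) pair.
wide = tmp.pivot(index='device_id', columns='n', values=['p_food', 'p_phone'])

# Flatten the two-level column index into names such as 'p_food_1'.
wide.columns = [f'{name}_{n}' for name, n in wide.columns]
wide = wide.reset_index()
print(wide)
```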
Answer 1 (score: 2)
df_m = df.drop_duplicates('device_id', keep='first')\
.merge(df, on='device_id')\
.drop_duplicates('device_id', keep='last')\
[['device_id', 'p_food_x', 'p_food_y', 'p_phone_x', 'p_phone_y']]\
.reset_index(drop=True)
print(df_m)
  device_id  p_food_x  p_food_y  p_phone_x  p_phone_y
0         0       0.2       0.1        0.8        0.9
1         1       0.3       0.5        0.7        0.5
2         2       0.1       0.7        0.9        0.3
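The merge-based approach pairs the first row of each device_id with its last row, so it relies on every device having exactly two rows. A quick check like the following sketch can confirm that before reshaping; the names `counts` and `unexpected` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'device_id': ['0', '0', '1', '1', '2', '2'],
    'p_food': [0.2, 0.1, 0.3, 0.5, 0.1, 0.7],
    'p_phone': [0.8, 0.9, 0.7, 0.5, 0.9, 0.3],
})

# Count the rows for each device_id.
counts = df.groupby('device_id').size()

# Devices whose row count differs from the assumed two.
unexpected = counts[counts != 2]
if unexpected.empty:
    print('Every device_id has exactly two rows.')
else:
    print('Devices with a different row count:')
    print(unexpected)
```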