I have the following data frame:
import pandas as pd

test = pd.DataFrame({'MKV': [50, 1000, 80, 20],
                     'Rating': ['A', 'Z', 'A', 'A'],
                     'Sec': ['I', 'I', 'I', 'F']})
test.groupby(['Rating', 'Sec'])['MKV'].apply(lambda x: x / x.sum())
This gives the following result:
0 0.38
1 1.00
2 0.62
3 1.00
Answer 0 (score: 2)
I don't think you need a groupby. You can use set_index and unstack to do the pivot, then normalize the columns:
# Perform the pivot.
test = test.set_index(['Rating','Sec'], append=True).unstack(['Rating','Sec'])
# Normalize the columns.
test = test/test.sum()
# Rename columns as appropriate.
test.columns = [','.join(c[1:]) for c in test.columns]
The resulting output:
        A,I  Z,I  A,F
0  0.384615  NaN  NaN
1       NaN  1.0  NaN
2  0.615385  NaN  NaN
3       NaN  NaN  1.0
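For anyone who wants to run this from scratch, here is a minimal self-contained sketch of the same pivot-and-normalize idea. The variable names `wide` and `shares` are hypothetical, and the cross-check using `groupby(...).transform('sum')` is an addition for illustration rather than part of the approach above. Selecting the `MKV` column before unstacking is also a small variation, so the column labels only have the two levels `Rating` and `Sec`.

```python
import pandas as pd

test = pd.DataFrame({'MKV': [50, 1000, 80, 20],
                     'Rating': ['A', 'Z', 'A', 'A'],
                     'Sec': ['I', 'I', 'I', 'F']})

# Pivot: one column per (Rating, Sec) pair, keeping the original row index.
wide = (test.set_index(['Rating', 'Sec'], append=True)['MKV']
            .unstack(['Rating', 'Sec']))

# Normalize each column by its own total; sum() skips NaN by default.
wide = wide / wide.sum()

# Flatten the two-level column labels into strings such as 'A,I'.
wide.columns = [','.join(c) for c in wide.columns]

# Cross-check (illustrative): per-row share of its (Rating, Sec) group total.
shares = test['MKV'] / test.groupby(['Rating', 'Sec'])['MKV'].transform('sum')
```

Each row of `wide` holds exactly one non-NaN value, and that value should match the corresponding entry of `shares`, so the pivot only changes the layout and not the numbers. The column order may vary between pandas versions.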