I have the following DataFrame in pandas:
Date tank hose quantity count set flow
01-01-2018 1 1 20 100 211 12.32
01-01-2018 1 2 20 200 111 22.32
01-01-2018 1 3 20 200 123 42.32
02-01-2018 1 1 10 100 211 12.32
02-01-2018 1 2 10 200 111 22.32
02-01-2018 1 3 10 200 123 42.32
I want to calculate the percentage of quantity and count within each group of Date and tank. My desired DataFrame:
Date tank hose quantity count set flow perc_quant perc_count
01-01-2018 1 1 20 100 211 12.32 33.33 20
01-01-2018 1 2 20 200 111 22.32 33.33 40
01-01-2018 1 3 20 200 123 42.32 33.33 40
02-01-2018 1 1 10 100 211 12.32 25 20
02-01-2018 1 2 20 200 111 22.32 50 40
02-01-2018 1 3 10 200 123 42.32 25 40
This is what I am trying in order to achieve this:
test = df.groupby(['Date','tank']).apply(lambda x:
100 * x / float(x.sum()))
Answer 0 (score: 3)
Use GroupBy.transform with a lambda function, then add_prefix and join the result back to the original DataFrame:
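# share of each value in its group's total, in percent
f = lambda x: 100 * x / float(x.sum())
# select the columns with a list; newer pandas versions reject the tuple form ['quantity','count']
df = df.join(df.groupby(['Date','tank'])[['quantity','count']].transform(f).add_prefix('perc_'))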
Or specify the new column names:
df[['perc_quantity','perc_count']] = (df.groupby(['Date','tank'])[['quantity','count']]
                                        .transform(f))
print (df)
Date tank hose quantity count set flow perc_quantity \
0 01-01-2018 1 1 20 100 211 12.32 33.333333
1 01-01-2018 1 2 20 200 111 22.32 33.333333
2 01-01-2018 1 3 20 200 123 42.32 33.333333
3 02-01-2018 1 1 10 100 211 12.32 33.333333
4 02-01-2018 1 2 10 200 111 22.32 33.333333
5 02-01-2018 1 3 10 200 123 42.32 33.333333
perc_count
0 20.0
1 40.0
2 40.0
3 20.0
4 40.0
5 40.0
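For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and, as a variation on the lambda, divides by the group totals from transform('sum'), which should give the same percentages. The names cols, totals, perc and result are only illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Date': ['01-01-2018'] * 3 + ['02-01-2018'] * 3,
    'tank': [1] * 6,
    'hose': [1, 2, 3, 1, 2, 3],
    'quantity': [20, 20, 20, 10, 10, 10],
    'count': [100, 200, 200, 100, 200, 200],
    'set': [211, 111, 123, 211, 111, 123],
    'flow': [12.32, 22.32, 42.32, 12.32, 22.32, 42.32],
})

cols = ['quantity', 'count']  # illustrative name: columns to convert to percentages

# Per-group totals, broadcast back to the shape and index of the original rows
totals = df.groupby(['Date', 'tank'])[cols].transform('sum')

# Each row's share of its (Date, tank) group total, in percent
perc = (100 * df[cols] / totals).add_prefix('perc_')

# Attach the new perc_ columns to the original frame
result = df.join(perc)
print(result[['Date', 'hose', 'perc_quantity', 'perc_count']].round(2))
```

Passing the string 'sum' to transform avoids calling a Python lambda once per group and column, which may matter on larger frames; for data of this size either form is fine.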