I have a dataframe that looks like this:
DayOfWeek Sunday Monday Tuesday Wednesday Thursday Friday Saturday
00 0.0 0.0 0.0 19.0 0.0 4.0 0.0
01 0.0 0.0 0.0 0.0 0.0 7.0 0.0
07 0.0 0.0 3.0 5.0 3.0 0.0 1.0
08 0.0 17.0 16.0 8.0 10.0 1.0 0.0
09 10.0 48.0 30.0 86.0 12.0 3.0 0.0
10 70.0 58.0 3.0 36.0 52.0 70.0 0.0
11 32.0 26.0 0.0 20.0 38.0 42.0 0.0
12 21.0 9.0 83.0 32.0 129.0 57.0 0.0
13 53.0 51.0 55.0 36.0 18.0 32.0 0.0
14 64.0 62.0 24.0 21.0 53.0 61.0 0.0
15 46.0 121.0 37.0 31.0 58.0 54.0 0.0
16 95.0 139.0 86.0 58.0 79.0 11.0 0.0
17 113.0 56.0 73.0 146.0 78.0 17.0 0.0
I want to express it as percentages: sum each column, then divide every cell by its column's total. So I wrote this code:
df_day = df_day.apply(lambda x: round(100 * x / df_day.groupby('DayOfWeek').size().sum()))
But it doesn't work...
Any ideas?
Answer 0 (score: 3)
I think you need to divide with div by the column totals from sum, then multiply by 100 with mul and, if necessary, round:
print (df_day.sum())
Sunday 504.0
Monday 587.0
Tuesday 410.0
Wednesday 498.0
Thursday 530.0
Friday 359.0
Saturday 1.0
dtype: float64
print (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
Sunday Monday Tuesday Wednesday Thursday Friday Saturday
DayOfWeek
0 0.0 0.0 0.0 4.0 0.0 1.0 0.0
1 0.0 0.0 0.0 0.0 0.0 2.0 0.0
7 0.0 0.0 1.0 1.0 1.0 0.0 100.0
8 0.0 3.0 4.0 2.0 2.0 0.0 0.0
9 2.0 8.0 7.0 17.0 2.0 1.0 0.0
10 14.0 10.0 1.0 7.0 10.0 19.0 0.0
11 6.0 4.0 0.0 4.0 7.0 12.0 0.0
12 4.0 2.0 20.0 6.0 24.0 16.0 0.0
13 11.0 9.0 13.0 7.0 3.0 9.0 0.0
14 13.0 11.0 6.0 4.0 10.0 17.0 0.0
15 9.0 21.0 9.0 6.0 11.0 15.0 0.0
16 19.0 24.0 21.0 12.0 15.0 3.0 0.0
17 22.0 10.0 18.0 29.0 15.0 5.0 0.0
A slower solution using apply:
print (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
Timings:
In [171]: %timeit (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
1000 loops, best of 3: 1.89 ms per loop
In [172]: %timeit (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
100 loops, best of 3: 5.18 ms per loop
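
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The small frame is an assumption: it reuses only three rows and three columns from the question's table, with DayOfWeek assumed to be the index name. Under that assumption, the original expression `df_day.groupby('DayOfWeek').size().sum()` would give the number of rows rather than the per-column totals, which would explain the wrong percentages.

```python
import pandas as pd

# Hypothetical sample: rows 07, 09 and 10 and three columns from the question's table
df_day = pd.DataFrame(
    {
        "Sunday": [0.0, 10.0, 70.0],
        "Monday": [0.0, 48.0, 58.0],
        "Saturday": [1.0, 0.0, 0.0],
    },
    index=pd.Index(["07", "09", "10"], name="DayOfWeek"),
)

# Per-column totals: a Series indexed by column name
totals = df_day.sum()

# Vectorized: align the totals with the columns (axis=1), scale to percent, round
pct = df_day.div(totals, axis=1).mul(100).round(0)

# Row-wise apply: each row x is a Series indexed by column name,
# so x / totals also divides every cell by its own column's total
pct_apply = df_day.apply(lambda x: round(100 * x / totals), axis=1)

print(pct)
```

Before rounding, each column should add up to 100, which is a quick way to check that the division went along the intended axis.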