Python pandas - apply a lambda to each column

Time: 2016-12-13 11:38:40

Tags: python pandas dataframe

I have a dataframe that looks like this:

DayOfWeek  Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
00            0.0     0.0      0.0       19.0       0.0     4.0       0.0
01            0.0     0.0      0.0        0.0       0.0     7.0       0.0
07            0.0     0.0      3.0        5.0       3.0     0.0       1.0
08            0.0    17.0     16.0        8.0      10.0     1.0       0.0
09           10.0    48.0     30.0       86.0      12.0     3.0       0.0
10           70.0    58.0      3.0       36.0      52.0    70.0       0.0
11           32.0    26.0      0.0       20.0      38.0    42.0       0.0
12           21.0     9.0     83.0       32.0     129.0    57.0       0.0
13           53.0    51.0     55.0       36.0      18.0    32.0       0.0
14           64.0    62.0     24.0       21.0      53.0    61.0       0.0
15           46.0   121.0     37.0       31.0      58.0    54.0       0.0
16           95.0   139.0     86.0       58.0      79.0    11.0       0.0
17          113.0    56.0     73.0      146.0      78.0    17.0       0.0

I want to express it as percentages: sum each column, then divide every cell by the total of its column. So I wrote this code:

df_day = df_day.apply(lambda x: round(100 * x / df_day.groupby('DayOfWeek').size().sum()))

But it doesn't work...

Any ideas?

1 Answer:

Answer 0: (score: 3)

I think you need to divide by the column sums with div and sum, then multiply by 100 with mul and, if necessary, round:

print (df_day.sum())
Sunday       504.0
Monday       587.0
Tuesday      410.0
Wednesday    498.0
Thursday     530.0
Friday       359.0
Saturday       1.0
dtype: float64

print (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
           Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
DayOfWeek                                                                
0             0.0     0.0      0.0        4.0       0.0     1.0       0.0
1             0.0     0.0      0.0        0.0       0.0     2.0       0.0
7             0.0     0.0      1.0        1.0       1.0     0.0     100.0
8             0.0     3.0      4.0        2.0       2.0     0.0       0.0
9             2.0     8.0      7.0       17.0       2.0     1.0       0.0
10           14.0    10.0      1.0        7.0      10.0    19.0       0.0
11            6.0     4.0      0.0        4.0       7.0    12.0       0.0
12            4.0     2.0     20.0        6.0      24.0    16.0       0.0
13           11.0     9.0     13.0        7.0       3.0     9.0       0.0
14           13.0    11.0      6.0        4.0      10.0    17.0       0.0
15            9.0    21.0      9.0        6.0      11.0    15.0       0.0
16           19.0    24.0     21.0       12.0      15.0     3.0       0.0
17           22.0    10.0     18.0       29.0      15.0     5.0       0.0
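
To make the div/sum/mul chain easy to try in isolation, here is a minimal sketch on a small made-up frame. The name df_small and its values are assumptions for illustration only; they are not the asker's data.

```python
import pandas as pd

# Hypothetical miniature of df_day; the values are made up for illustration.
df_small = pd.DataFrame(
    {"Sunday": [10.0, 30.0, 60.0], "Monday": [0.0, 25.0, 75.0]},
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# One total per column, indexed by column name.
col_totals = df_small.sum()

# axis=1 aligns the totals with the column labels, so every cell is divided
# by the total of its own column; then scale to percent and round.
pct = df_small.div(col_totals, axis=1).mul(100).round(0)
print(pct)
```

Because each column is divided by its own total, every column of the unrounded result sums to 100; after rounding the sum can drift by a point or so.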

A slower solution using apply:

print (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
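
A variant closer to the attempt in the question is to let apply walk over the columns (its default, axis=0) and divide each column by its own sum. The sketch below uses a made-up frame, df_small, as an assumption; it also evaluates the groupby expression from the question to show that, when DayOfWeek is the index name, it yields the number of rows rather than the column totals, which is a likely reason the original line gave unexpected numbers.

```python
import pandas as pd

# Hypothetical miniature of df_day; the values are made up for illustration.
df_small = pd.DataFrame(
    {"Sunday": [10.0, 30.0, 60.0], "Monday": [0.0, 25.0, 75.0]},
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# With the default axis=0, apply passes each column to the lambda as a Series,
# so col.sum() is that column's own total.
pct_by_col = df_small.apply(lambda col: (100 * col / col.sum()).round())

# For comparison: this expression from the question counts rows (one per
# DayOfWeek group); it does not return column totals.
n_rows = df_small.groupby("DayOfWeek").size().sum()

print(pct_by_col)
print(n_rows)
```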

Timings:

In [171]: %timeit (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
1000 loops, best of 3: 1.89 ms per loop

In [172]: %timeit (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
100 loops, best of 3: 5.18 ms per loop
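
Outside IPython, a comparison along these lines can be sketched with the standard timeit module. The frame df_demo below is a random stand-in of the same shape (13 rows, 7 columns), not the original data, and the measured times will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in for df_day: 13 rows, 7 weekday columns, random counts.
rng = np.random.default_rng(0)
days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]
df_demo = pd.DataFrame(
    rng.integers(1, 150, size=(13, 7)).astype(float), columns=days
)


def vectorized():
    # Whole-frame division by the column totals.
    return df_demo.div(df_demo.sum(), axis=1).mul(100).round(0)


def row_apply():
    # Row-by-row version; the column totals are recomputed for every row.
    return df_demo.apply(lambda x: round(100 * x / df_demo.sum()), axis=1)


# Total seconds for 200 calls of each variant.
t_vec = timeit.timeit(vectorized, number=200)
t_apply = timeit.timeit(row_apply, number=200)
print(f"div/mul/round: {t_vec:.4f} s, apply: {t_apply:.4f} s")
```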