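Calculating a rolling sum based on 2 other columns in a Python DataFrame

Time: 2018-08-29 10:55:06

Tags: python pandas dataframe

I want to create a 3-month rolling limit for the table below. The limit is based on the combination of prefix and sic. So when AB 1 reaches December, I want the total for AB 1 to be the sum of months 12 + 1 + 2.

What is the best way to solve this? I have used .rolling, but I'm not sure how to handle the change in prefix/sic.

For reference, I manually entered the desired answers in the "Rolling 3 month Limit" column.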

+-------+--------+-----+--------+-----------------------+
| Month | prefix | sic | limits | Rolling 3 month Limit |
+-------+--------+-----+--------+-----------------------+
|     1 | AB     |   1 | 16.5   | 54.3                  |
|     2 | AB     |   1 | 22.6   | 68.2                  |
|     3 | AB     |   1 | 15.2   | 175.8                 |
|     4 | AB     |   1 | 30.4   | 360.2                 |
|     5 | AB     |   1 | 130.2  | 371                   |
|     6 | AB     |   1 | 199.6  | 262.5                 |
|     7 | AB     |   1 | 41.2   | 80.7                  |
|     8 | AB     |   1 | 21.7   | 61.2                  |
|     9 | AB     |   1 | 17.8   | 53.4                  |
|    10 | AB     |   1 | 21.7   | 53.4                  |
|    11 | AB     |   1 | 13.9   | 48.2                  |
|    12 | AB     |   1 | 17.8   | 56.9                  |
|     1 | AB     |  10 | 9.8    | 32.4                  |
|     2 | AB     |  10 | 9.8    | 134.2                 |
|     3 | AB     |  10 | 12.8   | 132.7                 |
|     4 | AB     |  10 | 111.6  | 276.9                 |
|     5 | AB     |  10 | 8.3    | 252.9                 |
|     6 | AB     |  10 | 157    | 244.6                 |
|     7 | AB     |  10 | 87.6   |                       |
+-------+--------+-----+--------+-----------------------+
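
The wrap-around rule described above (December sums months 12, 1 and 2) can be written with modular arithmetic. This is only a minimal sketch; `window_months` is a hypothetical helper name, not part of pandas:

```python
def window_months(start, size=3):
    """Return `size` consecutive calendar months beginning at `start`, wrapping 12 -> 1."""
    # Shift to 0-based months, step forward, wrap with modulo 12, shift back to 1-based.
    return [(start - 1 + k) % 12 + 1 for k in range(size)]

print(window_months(12))  # [12, 1, 2]
print(window_months(11))  # [11, 12, 1]
print(window_months(5))   # [5, 6, 7]
```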

1 answer:

Answer 0: (score: 0)

import pandas as pd

# Sample data from the question: one row per (Month, prefix, sic).
d = {'Month':[1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,5,6,7],
     'prefix':['AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB','AB'],
     'sic':[1,1,1,1,1,1,1,1,1,1,1,1,10,10,10,10,10,10,10],
     'limits':[16.5,22.6,15.2,30.4,130.2,199.6,41.2,21.7,17.8,21.7,13.9,17.8,9.8,9.8,12.8,111.6,8.3,157,87.6],}
df = pd.DataFrame(d)

def calc_roll(m,p,s):
    # Months in the forward-looking window, wrapping from December back to January.
    if m <= 10: months = [m,m+1,m+2]
    elif m == 11: months = [11,12,1]
    else: months = [12,1,2]
    # Rows of the same prefix/sic group that fall inside the window.
    f = df.loc[(df['Month'].isin(months)) & (df['prefix'] == p) & (df['sic'] == s)]
    # Leave the cell blank when fewer than 3 of the window's months exist in the group.
    if len(f) < 3: return ''
    else: return sum(f['limits'])

df['Rolling 3 month Limit'] = df.apply(lambda x: calc_roll(x['Month'], x['prefix'], x['sic']),axis=1)


#Output

     Month   prefix  sic  limits Rolling 3 month Limit
 0       1     AB    1    16.5                  54.3
 1       2     AB    1    22.6                  68.2
 2       3     AB    1    15.2                 175.8
 3       4     AB    1    30.4                 360.2
 4       5     AB    1   130.2                   371
 5       6     AB    1   199.6                 262.5
 6       7     AB    1    41.2                  80.7
 7       8     AB    1    21.7                  61.2
 8       9     AB    1    17.8                  53.4
 9      10     AB    1    21.7                  53.4
 10     11     AB    1    13.9                  48.2
 11     12     AB    1    17.8                  56.9
 12      1     AB   10     9.8                  32.4
 13      2     AB   10     9.8                 134.2
 14      3     AB   10    12.8                 132.7
 15      4     AB   10   111.6                 276.9
 16      5     AB   10     8.3                 252.9
 17      6     AB   10   157.0
 18      7     AB   10    87.6

I wrote this code to produce the output you want. Let me know if you have any questions!
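
As an alternative sketch, the same wrap-around sum can be computed group by group instead of filtering the whole frame once per row. `rolling_forward_wrapped` is a hypothetical helper name, and the code assumes each (prefix, sic) group has at most one row per month. It leaves NaN rather than an empty string where a month of the window is missing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Month': list(range(1, 13)) + list(range(1, 8)),
    'prefix': ['AB'] * 19,
    'sic': [1] * 12 + [10] * 7,
    'limits': [16.5, 22.6, 15.2, 30.4, 130.2, 199.6, 41.2, 21.7, 17.8, 21.7,
               13.9, 17.8, 9.8, 9.8, 12.8, 111.6, 8.3, 157, 87.6],
})

def rolling_forward_wrapped(frame, window=3):
    """Sum `limits` over `window` consecutive months per (prefix, sic), wrapping 12 -> 1."""
    out = pd.Series(np.nan, index=frame.index, dtype=float)
    for _, grp in frame.groupby(['prefix', 'sic']):
        # Assumption: Month is unique within a group, so it can serve as the index.
        by_month = grp.set_index('Month')['limits']
        for idx, m in zip(grp.index, grp['Month']):
            months = [(m - 1 + k) % 12 + 1 for k in range(window)]
            # reindex yields NaN for absent months; skipna=False carries it into the sum.
            out.loc[idx] = by_month.reindex(months).sum(skipna=False)
    return out

df['Rolling 3 month Limit'] = rolling_forward_wrapped(df)
print(df)
```

Using NaN for incomplete windows keeps the column numeric, which is usually easier to work with later than a mix of floats and empty strings.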

Edit:

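You can change the check in calc_roll to if len(f) < 2: so that a window with only two available months still gets a sum. That matches how the question's table fills in month 6 of AB 10 (157 + 87.6 = 244.6).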
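
If wrap-around is not needed for a group, the built-in `.rolling` mentioned in the question can be made forward-looking by reversing the series, with `min_periods=2` playing the role of the `len(f) < 2` threshold. This is a minimal sketch on the AB 10 rows only; `forward_sum` is a hypothetical helper, and this version does not wrap December back to January:

```python
import pandas as pd

df = pd.DataFrame({
    'Month': [1, 2, 3, 4, 5, 6, 7],
    'prefix': ['AB'] * 7,
    'sic': [10] * 7,
    'limits': [9.8, 9.8, 12.8, 111.6, 8.3, 157, 87.6],
})

def forward_sum(s, window=3, min_periods=2):
    # Reverse, take a trailing rolling sum, then reverse back:
    # each row ends up with the sum of itself and the following rows.
    return s.iloc[::-1].rolling(window, min_periods=min_periods).sum().iloc[::-1]

df['Rolling 3 month Limit'] = (
    df.sort_values(['prefix', 'sic', 'Month'])   # months must be in order within a group
      .groupby(['prefix', 'sic'])['limits']      # windows never cross a prefix/sic change
      .transform(forward_sum)
)
print(df)
```

On these rows it should give 244.6 for month 6 and NaN for month 7, as in the question's table. A group with all 12 months would still need the wrap-around handling for months 11 and 12.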