Pandas - sumif dynamically based on row values

Time: 2017-02-22 17:02:35

Tags: python python-3.x pandas

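Question: I need to average sales for each product and store over a 5-week window within a given year. The average must also be able to "wrap around" the year boundary for weeks 1, 2, 51, and 52.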

For example, the value for week 1 would be the average of weeks 51, 52, 1, 2, and 3 for product 72243000016 in store 10103.
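To make the wrap-around concrete, here is a minimal sketch of how the five week numbers in a window could be computed. The helper name `window_weeks` and its parameters are hypothetical (not from the original post), and it assumes weeks are numbered 1 through 52.

```python
def window_weeks(week, half_width=2, weeks_per_year=52):
    """Return the week numbers in a centred window, wrapping at the year end.

    Assumes weeks are numbered 1..weeks_per_year (hypothetical helper).
    """
    # Shift to 0-based, apply the offset, wrap with modulo, shift back to 1-based.
    return [(week - 1 + offset) % weeks_per_year + 1
            for offset in range(-half_width, half_width + 1)]


print(window_weeks(1))   # [51, 52, 1, 2, 3]
print(window_weeks(52))  # [50, 51, 52, 1, 2]
```

Working in 0-based week numbers before taking the modulo avoids a separate step to turn a result of 0 back into week 52.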

    >>> df
        weekNumber   productNumber  storeNumber  Sales
0           47.0  72243000016         10103      93.80
1           47.0  72243000016         10148      97.43
2           47.0  72243000016         10153     114.01
3           47.0  72243000016         10216     154.75
4           47.0  72243000016         10243      55.74
5           47.0  72243000016         10260      52.74
6           47.0  72243000016         10266     104.38
7           47.0  72243000016         10275      80.06
8           47.0  72243000016         10327      40.11
9           47.0  72243000016         10375      57.32
10          47.0  72243000016         10402      25.58
11          47.0  72243000016         10407      51.32
12          47.0  72243000016         10412      13.58
13          47.0  72243000016         10436      86.22
14          47.0  72243000016         10537      32.53
15          47.0  72243000016         10588      41.37
16          47.0  72243010016         10103      76.27
17          47.0  72243010016         10148      61.27
18          47.0  72243010016         10153      96.64
19          47.0  72243010016         10216      75.48
20          47.0  72243010016         10243      39.95
21          47.0  72243010016         10260      47.53
22          47.0  72243010016         10266      37.74
23          47.0  72243010016         10275      56.69
24          47.0  72243010016         10327      17.37
25          47.0  72243010016         10375      22.58
26          47.0  72243010016         10402      29.53
27          47.0  72243010016         10436      46.11
28          47.0  72243010016         10537      27.16
29          47.0  72243010016         10588      33.16
...          ...          ...           ...        ...
118039       5.0  85005700315         10275      30.72
118040       5.0  85005700315         10402      11.97
118041       5.0  85005700315         10436      35.51
118042       5.0  85005700315         10412      19.95
118043       5.0  85005700315         10148      67.43
118044       5.0  85005700315         10260      47.48
118045       5.0  85005700315         10103      67.43
118046       5.0  85005700315         10327       7.98
118047       5.0  85005700315         10216      83.79
118048       5.0  85005700319         10637      19.95
118049       5.0  85005700319         10266      23.94
118050       5.0  85005700319         10537      19.95
118051       5.0  85005700319         10243      39.90
118052       5.0  85005700319         10275      35.51
118053       5.0  85005700319         10402      15.96
118054       5.0  85005700319         10436      35.51
118055       5.0  85005700319         10148      19.95
118056       5.0  85005700319         10103     119.60
118057       5.0  85005700319         10327       3.99
118058       5.0  85005700319         10216     151.42
118060       5.0  85005700324         10260      42.99
118061       5.0  85005700340         10637      63.84
118062       5.0  85005700340         10266      47.88
118063       5.0  85005700340         10537       7.98
118064       5.0  85005700340         10275      90.97
118065       5.0  85005700340         10402      23.84
118066       5.0  85005700340         10436     103.34
118067       5.0  85005700340         10148      43.09
118068       5.0  85005700340         10103     147.03
118069       5.0  85005700340         10327       7.88

The following works, but I'm sure there is a more efficient way.

>>> import pandas as pd
>>> def querythis(weekNumber, productNumber, storeNumber):
    # Five-week window centred on weekNumber, wrapping around the year end
    offsets = [-2, -1, 0, 1, 2]
    weeks = [(52 + weekNumber + x) % 52 for x in offsets]
    weeks = [52 if x == 0 else x for x in weeks]  # a result of 0 means week 52
    weekSum = 0
    count = 0
    for i in weeks:
        # Mean sales for this week, product and store (NaN if no rows match)
        YoYWeek = df[(df['weekNumber'] == i)
                     & (df['productNumber'] == productNumber)
                     & (df['storeNumber'] == storeNumber)]['Sales'].mean()
        if pd.notna(YoYWeek):  # skip weeks with no data
            weekSum = weekSum + YoYWeek
            count = count + 1
    return weekSum / count

>>> df['Rolling Average'] = df.apply(lambda row: querythis(row['weekNumber'], row['productNumber'], row['storeNumber']), axis=1)
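One possible way to avoid the row-by-row `apply` is sketched below. It is not from the original post: the function name `rolling_week_average` and its parameters are hypothetical, and it assumes the column names shown in the printed frame (`weekNumber`, `productNumber`, `storeNumber`, `Sales`) and a 52-week year. The idea is to compute one mean per product, store and week, copy each weekly mean onto the five target weeks whose window contains it, and then average per target week.

```python
import pandas as pd


def rolling_week_average(df, half_width=2, weeks_per_year=52):
    """Sketch: centred, wrap-around average of weekly mean sales.

    Assumes columns weekNumber (1..weeks_per_year), productNumber,
    storeNumber and Sales. Weeks with no data are left out of the average.
    """
    keys = ['productNumber', 'storeNumber']

    # One mean per (product, store, week).
    weekly = df.groupby(keys + ['weekNumber'], as_index=False)['Sales'].mean()

    # A weekly mean for week w belongs to the windows of weeks w-2 .. w+2,
    # so relabel a copy of the weekly means once per offset (with wrap-around).
    parts = []
    for offset in range(-half_width, half_width + 1):
        shifted = weekly.copy()
        shifted['weekNumber'] = (
            (shifted['weekNumber'] - 1 + offset) % weeks_per_year + 1
        )
        parts.append(shifted)
    stacked = pd.concat(parts, ignore_index=True)

    # Average the weekly means that landed on each target week.
    window_mean = (
        stacked.groupby(keys + ['weekNumber'], as_index=False)['Sales']
        .mean()
        .rename(columns={'Sales': 'Rolling Average'})
    )

    # Attach the result to the original rows (left merge keeps row order,
    # but the result gets a fresh integer index).
    return df.merge(window_mean, on=keys + ['weekNumber'], how='left')


# Small made-up frame, for illustration only.
demo = pd.DataFrame({
    'weekNumber':    [51, 52, 1, 2, 3, 4, 1],
    'productNumber': [1, 1, 1, 1, 1, 1, 1],
    'storeNumber':   [10, 10, 10, 10, 10, 10, 20],
    'Sales':         [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 100.0],
})
result = rolling_week_average(demo)
print(result)
```

With this approach the frame is grouped twice and merged once instead of being filtered five times per row, which may be noticeably faster on a frame of around 118,000 rows, though that has not been measured here.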

0 answers:

No answers