How to speed up creating new columns from multiple rules

Time: 2018-09-17 10:59:21

Tags: python pandas dataframe

I am measuring car dealers' sales over the past several months.

I need to calculate the averages and sums for the last 6 and 12 months.

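The way I do this in Python is with the code below.

The problem is that it takes forever to run when the data is relatively large, although it works fine when the data is small.

Is there a clever way to speed this up?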

def ts(x, y):
    # d is a module-level DataFrame; x is a prefix for the new columns, y is the column to sum.
    # Initialise as float so that sums of any numeric dtype fit.
    d[x+'sum12'] = 0.0
    d[x+'sum6'] = 0.0
    d[x+'year_flag'] = 0.0
    d[x+'sum_last12'] = 0.0
    d[x+'sum_last6'] = 0.0
    loc = d.columns.get_loc  # column name -> column position

    for i in range(len(d)):
        row = d.iloc[i]
        same_dealer = d['RTL_DLR_ID'] == row['RTL_DLR_ID']
        # Write by position; d.iloc[i][col] = ... would only change a temporary copy.
        d.iloc[i, loc(x+'sum6')] = d.loc[(row['PERIOD_ID'] >= d['PERIOD_ID'])
                                         & (d['PERIOD_ID'] >= row['last6'])
                                         & same_dealer, y].sum()
        d.iloc[i, loc(x+'sum12')] = d.loc[(row['PERIOD_ID'] >= d['PERIOD_ID'])
                                          & (d['PERIOD_ID'] >= row['last12'])
                                          & same_dealer, y].sum()
        d.iloc[i, loc(x+'sum_last6')] = d.loc[(d['PERIOD_ID'] >= (row['last6'] - 100))
                                              & (row['last12'] >= d['PERIOD_ID'])
                                              & same_dealer, y].sum()
        d.iloc[i, loc(x+'sum_last12')] = d.loc[(d['PERIOD_ID'] >= (row['PERIOD_ID'] - 100))
                                               & (row['last12'] >= d['PERIOD_ID'])
                                               & same_dealer, y].sum()
        d.iloc[i, loc(x+'year_flag')] = row['MTD_SAL_VOL_MKR_NUM'] - d.loc[
            (row['last12'] == d['PERIOD_ID']) & same_dealer, y].sum()
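
One possible way to avoid the row-by-row scan, offered as a minimal sketch rather than a tested drop-in replacement: every rule in ts sums y over rows of the same RTL_DLR_ID whose PERIOD_ID falls in an inclusive range, so each rule can be answered from per-dealer prefix sums plus a binary search. The names window_sum and ts_fast are hypothetical helpers introduced here. The sketch assumes PERIOD_ID, last6 and last12 are numeric and mutually comparable (for example YYYYMM integers) and that y contains no missing values.

```python
import numpy as np
import pandas as pd


def window_sum(df, value_col, lo, hi, key="RTL_DLR_ID", period="PERIOD_ID"):
    """Per row: sum value_col over same-dealer rows with lo <= PERIOD_ID <= hi.

    lo and hi hold one inclusive bound per row, in the same row order as df.
    (Hypothetical helper, not part of pandas.)
    """
    periods_all = df[period].to_numpy()
    values_all = df[value_col].to_numpy(dtype=float)
    lo = np.asarray(lo)
    hi = np.asarray(hi)
    out = np.zeros(len(df))
    # GroupBy.indices maps each dealer to the positional row numbers of its rows.
    for pos in df.groupby(key).indices.values():
        # Order this dealer's rows by period so binary search is valid.
        pos = pos[np.argsort(periods_all[pos], kind="stable")]
        periods = periods_all[pos]
        # Prefix sums with a leading 0: sorted rows a..b-1 sum to csum[b] - csum[a].
        csum = np.concatenate(([0.0], np.cumsum(values_all[pos])))
        a = np.searchsorted(periods, lo[pos], side="left")   # first period >= lo
        b = np.searchsorted(periods, hi[pos], side="right")  # first period > hi
        b = np.maximum(a, b)  # empty window when hi < lo
        out[pos] = csum[b] - csum[a]
    return pd.Series(out, index=df.index)


def ts_fast(d, x, y):
    """Same five rules as ts, expressed as inclusive (lo, hi) period ranges."""
    d = d.copy()  # leave the caller's frame untouched
    p, l6, l12 = d["PERIOD_ID"], d["last6"], d["last12"]
    d[x + "sum6"] = window_sum(d, y, l6, p)
    d[x + "sum12"] = window_sum(d, y, l12, p)
    d[x + "sum_last6"] = window_sum(d, y, l6 - 100, l12)
    d[x + "sum_last12"] = window_sum(d, y, p - 100, l12)
    d[x + "year_flag"] = d["MTD_SAL_VOL_MKR_NUM"] - window_sum(d, y, l12, l12)
    return d
```

With this, a call such as d = ts_fast(d, x, y) would stand in for ts(x, y). The Python-level loop runs once per dealer instead of once per row, and each dealer costs a sort plus binary searches, so the total work grows roughly as n log n instead of n squared.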

0 answers:

No answers