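Week difference between the current week and the last date of the previous week

Time: 2017-05-02 21:38:11

Tags: pandas datetime

I have a pivoted pandas DataFrame (sales by region) that was created from another pandas DataFrame (sales by store) using the pivot_table method.

For example: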

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'store':['A','B','C','D','E']*7, 
     'region':['NW','NW','SW','NE','NE']*7, 
     'date':['2017-03-30']*5+['2017-04-05']*5+['2017-04-07']*5+['2017-04-12']*5+['2017-04-13']*5+['2017-04-17']*5+['2017-04-20']*5,
     'sales':[30,1,133,9,1,30,3,135,9,11,30,1,140,15,15,25,10,137,9,3,29,10,137,9,11,30,19,145,20,10,30,8,141,25,25]
     })
# Aggregate only the numeric 'sales' column; margins=True adds an 'All' row and column
df_sales = df.pivot_table(index = ['region'], columns = ['date'], values = ['sales'], aggfunc = [np.sum], margins = True)
# Drop the trailing 'All' column (.ix was removed in pandas 1.0, so use .iloc)
df_sales = df_sales.iloc[:, :-1]

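My goal is to do the following with the sales DataFrame.

Add a column named WeekDifference that holds the difference between the latest total sales of the current week and the most recent value (by date) of the previous week. Assumption: every week has data, though not on fixed days. The WeekDifference column will change as new data arrives, but with the latest data it should look like this: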

>>> df_sales
              sum                                                         \
            sales                                                          
date   2017-03-30 2017-04-05 2017-04-07 2017-04-12 2017-04-13 2017-04-17   
region                                                                     
NE           10.0       20.0       30.0       12.0       20.0       30.0   
NW           31.0       33.0       31.0       35.0       39.0       49.0   
SW          133.0      135.0      140.0      137.0      137.0      145.0   
All         174.0      188.0      201.0      184.0      196.0      224.0   



date   2017-04-20 WeekDifference 
region             
NE           50.0    50.0-20.0
NW           38.0    38.0-39.0
SW          141.0    141.0-137.0
All         229.0    229-196.0

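That is, it is the difference between the most recent day and the last day with data in the previous week. In this specific example, the current week is the one containing 2017-04-20, and the last day with data in the previous week is 2017-04-13.

I would like to do this in a general way that keeps working as the data is updated.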

1 answer:

Answer 0: (score: 1)

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'store':['A','B','C','D','E']*7, 
     'region':['NW','NW','SW','NE','NE']*7, 
     'date':['2017-03-30']*5+['2017-04-05']*5+['2017-04-07']*5+['2017-04-12']*5+['2017-04-13']*5+['2017-04-17']*5+['2017-04-20']*5,
     'sales':[30,1,133,9,1,30,3,135,9,11,30,1,140,15,15,25,10,137,9,3,29,10,137,9,11,30,19,145,20,10,30,8,141,25,25]
     })
# Aggregate only the numeric 'sales' column; margins=True adds an 'All' row and column
df_sales = df.pivot_table(index = ['region'], columns = ['date'], values = ['sales'], aggfunc = [np.sum], margins = True)
# Drop the trailing 'All' column (.ix was removed in pandas 1.0, so use .iloc)
df_sales = df_sales.iloc[:, :-1]

Input:

              sum                                                         \
            sales                                                          
date   2017-03-30 2017-04-05 2017-04-07 2017-04-12 2017-04-13 2017-04-17   
region                                                                     
NE           10.0       20.0       30.0       12.0       20.0       30.0   
NW           31.0       33.0       31.0       35.0       39.0       49.0   
SW          133.0      135.0      140.0      137.0      137.0      145.0   
All         174.0      188.0      201.0      184.0      196.0      224.0   



date   2017-04-20  weekdiffernce  
region                            
NE           50.0    50.0 - 20.0  
NW           38.0    38.0 - 39.0  
SW          141.0  141.0 - 137.0  
All         229.0  229.0 - 196.0  

Compute the date of the last column and the date one week before it:

# Date of the most recent column (third level of the column MultiIndex)
last_column = pd.to_datetime(df_sales.iloc[:,-1].name[2])

# The same weekday one week earlier; its week number identifies the previous week
last_week_column = last_column + pd.DateOffset(weeks=-1)

# Boolean mask of the columns dated in the previous week
# (DatetimeIndex.weekofyear was removed in pandas 2.0; isocalendar().week replaces it)
col_dates = pd.to_datetime(df_sales.columns.get_level_values(2))
col_mask = (col_dates.isocalendar().week == last_week_column.week)

# Latest value and the last value of the previous week, joined as text
df_sales.loc[:,('sum','sales','weekdiffernce')]=df_sales.iloc[:,-1].astype(str) + ' - '+df_sales.loc[:,('sum','sales',list(col_mask))].iloc[:,-1].astype(str)

print(df_sales)

Output:

              sum                                                         \
            sales                                                          
date   2017-03-30 2017-04-05 2017-04-07 2017-04-12 2017-04-13 2017-04-17   
region                                                                     
NE           10.0       20.0       30.0       12.0       20.0       30.0   
NW           31.0       33.0       31.0       35.0       39.0       49.0   
SW          133.0      135.0      140.0      137.0      137.0      145.0   
All         174.0      188.0      201.0      184.0      196.0      224.0   



date   2017-04-20  weekdiffernce  
region                            
NE           50.0    50.0 - 20.0  
NW           38.0    38.0 - 39.0  
SW          141.0  141.0 - 137.0  
All         229.0  229.0 - 196.0
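
The column above holds text such as "50.0 - 20.0". If a numeric difference is wanted instead, the following is a minimal sketch, not part of the original answer. It assumes a flat pivot (a single 'sales' value and a plain 'sum' aggregation, so the column labels are just the date strings), and the helper name week_difference is hypothetical. It compares ISO (year, week) pairs rather than bare week numbers, so it should also behave when the previous week belongs to the previous year.

```python
import pandas as pd


def week_difference(table: pd.DataFrame) -> pd.Series:
    """Latest date column minus the last date column of the previous ISO week.

    Assumes every column label of `table` can be parsed by pd.to_datetime.
    """
    dates = pd.to_datetime(table.columns)
    latest_pos = int(dates.argmax())
    # ISO (year, week) of the day exactly one week before the latest date
    target = tuple((dates[latest_pos] - pd.Timedelta(weeks=1)).isocalendar())[:2]
    # Positions of all columns that fall in that previous week
    candidates = [i for i, d in enumerate(dates)
                  if tuple(d.isocalendar())[:2] == target]
    if not candidates:
        raise ValueError("no column is dated in the previous week")
    # Among those, take the column with the most recent date
    prev_pos = max(candidates, key=lambda i: dates[i])
    return table.iloc[:, latest_pos] - table.iloc[:, prev_pos]


df = pd.DataFrame(
    {'store': ['A', 'B', 'C', 'D', 'E'] * 7,
     'region': ['NW', 'NW', 'SW', 'NE', 'NE'] * 7,
     'date': ['2017-03-30'] * 5 + ['2017-04-05'] * 5 + ['2017-04-07'] * 5
             + ['2017-04-12'] * 5 + ['2017-04-13'] * 5 + ['2017-04-17'] * 5
             + ['2017-04-20'] * 5,
     'sales': [30, 1, 133, 9, 1, 30, 3, 135, 9, 11, 30, 1, 140, 15, 15,
               25, 10, 137, 9, 3, 29, 10, 137, 9, 11, 30, 19, 145, 20, 10,
               30, 8, 141, 25, 25]})

# Flat pivot: one column per date plus the 'All' margin, which is dropped
flat = df.pivot_table(index='region', columns='date', values='sales',
                      aggfunc='sum', margins=True).drop(columns='All')

# Compute the difference before adding it, so only date columns are parsed
diff = week_difference(flat)
flat['WeekDifference'] = diff
print(flat)
```

For the sample data the latest column is 2017-04-20 and the last column of the previous week is 2017-04-13, so the differences are 30 for NE, -1 for NW, 4 for SW and 33 for All. Computing the difference before attaching the new column matters here, because a non-date label such as 'WeekDifference' would not parse as a date on a second call.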