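Date, Brand, Indication, Geo, Type and Value are the column names. I currently calculate the rolling quarter with a function and datetime timestamps, using the code shown below. The code takes a long time to execute; is there a way to change or modify it so that it runs faster? The RQ column is the rolling quarter column being added.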
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
import datetime
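# Date parsing using the datetime.strptime function
# (pd.datetime was removed in pandas 2.0, so the datetime module is used directly;
# date_parser itself is deprecated since pandas 2.0 in favour of date_format)
dateparse = lambda x: datetime.datetime.strptime(x, '%m/%d/%Y')
df = pd.read_csv('Demo for MAt.csv', index_col=0,
                 parse_dates=['Date'], date_parser=dateparse)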
## importing data from csv file as dataframe
# Function to calculate the rolling sum for each record by date and other levels
def RQ(x):
    # Checks whether each date falls in the previous 3 months range (up to 62 days
    # before the current row's date) and sums the values that are in the range
    RQS = df['Value'][
        (df.index >= x.name - datetime.timedelta(days=62))
        & (df.index <= x.name)
        & (df['Brand'] == x['Brand'])
        & (df['Indication'] == x['Indication'])
        & (df['Geo'] == x['Geo'])
        & (df['Type'] == x['Type'])
    ]
    return RQS.sum()

# For each row the calculation is done using the apply function
df['RQ'] = df.apply(RQ, axis=1)
# The data frames below hold the input and the expected output for a sample
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
inputdf = pd.DataFrame(
    [['04/01/2016', 1, 'A', 'National', 'Value', 10],
     ['05/01/2016', 1, 'A', 'National', 'Value', 20],
     ['06/01/2016', 1, 'A', 'National', 'Value', 30]],
    columns=['Date', 'Brand', 'Indication', 'Geo', 'Type', 'Value'])
print(inputdf)
outputdf = pd.DataFrame(
    [['04/01/2016', 1, 'A', 'National', 'Value', 10, 10],
     ['05/01/2016', 1, 'A', 'National', 'Value', 20, 30],
     ['06/01/2016', 1, 'A', 'National', 'Value', 30, 60]],
    columns=['Date', 'Brand', 'Indication', 'Geo', 'Type', 'Value', 'RQ'])
print(outputdf)
## Input
         Date  Brand Indication       Geo   Type  Value
0  04/01/2016      1          A  National  Value     10
1  05/01/2016      1          A  National  Value     20
2  06/01/2016      1          A  National  Value     30
## Expected output
         Date  Brand Indication       Geo   Type  Value  RQ
0  04/01/2016      1          A  National  Value     10  10
1  05/01/2016      1          A  National  Value     20  30
2  06/01/2016      1          A  National  Value     30  60
Answer 0 (score: 0)
Convert the Date column to a timestamp type, if that has not been done already, and set it as the index.
df.Date = pd.to_datetime(df.Date)
df = df.set_index('Date')
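As a minimal sketch of this step on data shaped like the question's sample (the rows here are made up and deliberately out of order), passing an explicit format to pd.to_datetime removes any month-first versus day-first ambiguity. Sorting the index afterwards is my own addition: time-based rolling windows expect the dates within each group to be monotonic.

```python
import pandas as pd

# Hypothetical sample rows using the column names from the question,
# deliberately listed out of date order.
df = pd.DataFrame(
    [['06/01/2016', 1, 'A', 'National', 'Value', 30],
     ['04/01/2016', 1, 'A', 'National', 'Value', 10],
     ['05/01/2016', 1, 'A', 'National', 'Value', 20]],
    columns=['Date', 'Brand', 'Indication', 'Geo', 'Type', 'Value'])

# An explicit format makes the month/day order unambiguous.
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')

# Time-based rolling windows need a monotonic index, so sort after indexing.
df = df.set_index('Date').sort_index()
print(df)
```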
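Group the data by the other dimensions and apply a rolling sum of the values to each group.
DataFrame.rolling can create time-based windows and by default uses the index for windowing. Specify 62D as the window size, as you did in your attempt.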
# transform keeps the original Date index, so the result aligns with df row for row
df['RQ'] = df.groupby(list(df.columns[:-1].values)).Value.transform(lambda x: x.rolling('62D').sum())
This outputs (with the sample data):
            Brand Indication       Geo   Type  Value    RQ
Date
2016-04-01      1          A  National  Value     10  10.0
2016-05-01      1          A  National  Value     20  30.0
2016-06-01      1          A  National  Value     30  60.0
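One detail that may matter when comparing against the question's function: a '62D' window is closed on the right only by default, so it covers (t - 62 days, t], whereas the question's >= / <= test also includes a row exactly 62 days earlier. The sketch below uses made-up monthly data for two brands (not the asker's file) and checks a closed='both' window against a row-by-row reference written in the spirit of the question's RQ function; the names keys and rq_rowwise are illustrative.

```python
import pandas as pd

# Hypothetical data: two brands, six monthly rows each, columns as in the question.
rows = []
for brand in (1, 2):
    for i, date in enumerate(pd.date_range('2016-04-01', periods=6, freq='MS')):
        rows.append([date, brand, 'A', 'National', 'Value', 10 * (i + 1) * brand])
df = pd.DataFrame(rows, columns=['Date', 'Brand', 'Indication', 'Geo', 'Type', 'Value'])
df = df.set_index('Date').sort_index()

keys = ['Brand', 'Indication', 'Geo', 'Type']

# Default window (closed='right') spans (t - 62 days, t].
df['RQ'] = df.groupby(keys)['Value'].transform(lambda s: s.rolling('62D').sum())

# closed='both' spans [t - 62 days, t], like the >= / <= test in the question.
df['RQ_inclusive'] = df.groupby(keys)['Value'].transform(
    lambda s: s.rolling('62D', closed='both').sum())

def rq_rowwise(row):
    """Row-by-row reference: scan the whole frame for each row."""
    mask = ((df.index >= row.name - pd.Timedelta(days=62))
            & (df.index <= row.name))
    for key in keys:
        mask &= (df[key] == row[key]).to_numpy()
    return df.loc[mask, 'Value'].sum()

df['RQ_rowwise'] = df.apply(rq_rowwise, axis=1)
print(df)
```

In this sample the two windows differ only on the 2016-09-01 rows, because 2016-07-01 lies exactly 62 days earlier. Whether that boundary row belongs in a "rolling quarter" is a decision for the asker; a window such as '92D' might be closer to three calendar months, but that is an assumption about the intent.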