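I'm new to Python and R and am following the blog post below. I'm having trouble adding a loop to the following code. Right now the code runs to completion, but it outputs the seasonality flag for only one customer. I'd like it to loop over all of my customers.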
datamovesme.com/2018/07/01/seasonality-python-code
##Here comes the R code piece
try:
    seasonal = r('''
    fit<-tbats(customerTS, seasonal.periods = 12, use.parallel = TRUE)
    fit$seasonal
    ''')
except: seasonal = 1
seasonal_output = seasonal_output.append({'customer_id':customerid, 'seasonal': seasonal}, ignore_index=True)
print(f' {customerid} | {seasonal} ')
print(seasonal_output)
seasonal_output.to_csv(outfile)
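I've tried many combinations of code to get it to loop, too many to list here. The blog shows existing data frames and time series objects that we could use. I'm not sure which one to use, or how to pass it to the R code.
Thanks!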
Answer 0 (score: 1)
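The linked blog post has the following issues:
The code is not indented as Python syntax requires. This may be because the website's rendering dropped the spaces or tabs, but it is a disservice to readers, because missing indentation changes the output.
The code does not address the inefficiency of appending to a data frame inside the loop: "Never call DataFrame.append or pd.concat inside a for-loop. It leads to quadratic copying." Instead, since seasonal is a single value, you can build a list of dictionaries and pass it to the pd.DataFrame() constructor outside the loop.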
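To illustrate both points, here is a minimal sketch. The data and the fake_seasonal_flag function are made up for illustration; they stand in for the blog's filledIn data frame and the tbats call, so only the loop structure carries over. Note also that DataFrame.append was removed in pandas 2.0, so the list-of-dictionaries pattern is needed there regardless of speed.

```python
import pandas as pd


def fake_seasonal_flag(usage):
    # Hypothetical stand-in for the per-customer R call (not tbats).
    return 1 if max(usage) - min(usage) > 5 else 0


# Made-up data: customer_id -> monthly usage values.
usage_by_customer = {
    'A': [1, 9, 2, 8],
    'B': [4, 4, 5, 4],
    'C': [3, 10, 3, 11],
}

rows = []
for customerid, usage in usage_by_customer.items():
    seasonal = fake_seasonal_flag(usage)
    # Indented: runs once per customer. Dedented to the level of `for`,
    # it would run once, after the loop, and record only the last customer.
    rows.append({'customer_id': customerid, 'seasonal': seasonal})

# Build the data frame a single time, outside the loop.
seasonal_output = pd.DataFrame(rows)
print(seasonal_output)
```

If the append line loses its indentation, the script still runs without error, which is probably why the symptom is a result for only one customer rather than a crash.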
With the above issues fixed and the whole code block run, your solution should output a data frame covering all customerids.
# ... same above assignments ...
outfile = '[put your file path here].csv'
df_list = []

# Group by the column name (not a one-item list) so that customerid is a
# scalar rather than a one-element tuple in recent pandas versions
for customerid, dataForCustomer in filledIn.groupby('customer_id'):
    startYear = dataForCustomer.head(1).iloc[0].yr
    startMonth = dataForCustomer.head(1).iloc[0].mnth
    endYear = dataForCustomer.tail(1).iloc[0].yr
    endMonth = dataForCustomer.tail(1).iloc[0].mnth

    # Creating a time series object
    customerTS = stats.ts(dataForCustomer.usage.astype(int),
                          start=base.c(startYear, startMonth),
                          end=base.c(endYear, endMonth),
                          frequency=12)
    r.assign('customerTS', customerTS)

    ## Here comes the R code piece
    try:
        seasonal = r('''
        fit<-tbats(customerTS, seasonal.periods = 12, use.parallel = TRUE)
        fit$seasonal
        ''')
    except Exception:
        # Fall back to a flag of 1 if the R call fails for this customer
        seasonal = 1

    # APPEND DICTIONARY TO LIST (NOT DATA FRAME)
    df_list.append({'customer_id': customerid, 'seasonal': seasonal})
    print(f' {customerid} | {seasonal} ')

# Build the data frame once, after the loop has visited every customer
seasonal_output = pd.DataFrame(df_list)
print(seasonal_output)
seasonal_output.to_csv(outfile)