I have this imputed DataFrame:
Imputed_Df.head():
                     Atmospheric_Pressure  Global_Radiation  Net_Radiation  Precipitation  Relative_Humidity  Temperature  Wind_Direction  Wind_Speed
Time
2013-11-01 01:00:00               999.451            207.75          99.09       4.450000          39.958667    13.600000      117.231667    2.138500
2013-11-01 05:00:00               992.760            167.77          85.16       5.746667          56.107500    11.900000      244.410000    2.313000
2013-11-01 09:00:00               990.272            157.00          95.04       6.271000          37.113333    12.802083      297.131500    3.270350
2013-11-01 10:00:00               998.367            191.26          82.32       4.428000          37.946500    13.800000      143.103333    2.232500
What I basically want to do is smooth every column and then add the new smoothed columns to this DataFrame, so this is what I tried:
import statsmodels
from statsmodels.tsa.api import ExponentialSmoothing, SimpleExpSmoothing, Holt

def Smoothing(Col):
    for Col in Imputed_Df.columns:
        fit = SimpleExpSmoothing(Imputed_Df[Col]).fit(smoothing_level=0.2, optimized=False)
        fcast = fit.predict(start=Imputed_Df.index.min(), end=Imputed_Df.index.max())
    return fcast

Imputed_Df[['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8']] = Imputed_Df.apply(Smoothing, axis=1)
But I get this error:
Columns must be same length as key
Any suggestions would be appreciated, thanks.
Answer 0 (score: 0)
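Assuming your Smoothing function is correct, run
print(Imputed_Df.apply(Smoothing, axis=1))
and check the number of columns in the DataFrame it returns. It should be 8, to match the 8 keys in
Imputed_Df[['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8']]
pandas raises "Columns must be same length as key" when those two counts differ. If the output does not have 8 columns, try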
import statsmodels
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

def Smoothing(Imputed_Df):
    my_df = Imputed_Df.copy()
    for Col in Imputed_Df.columns:
        fit = SimpleExpSmoothing(Imputed_Df[Col]).fit(smoothing_level=0.2, optimized=False)
        my_df[Col] = fit.predict(start=Imputed_Df.index.min(), end=Imputed_Df.index.max())
    return my_df
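Actually, I found that the time series is irregularly spaced, so resample it to an hourly frequency before smoothing:
# Forward-fill the gaps; .ffill() replaces the older .pad() alias, which recent pandas versions no longer provide.
Imputed_Df = Imputed_Df.resample('H').ffill()
Imputed_Df[['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8']] = Smoothing(Imputed_Df)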
I would rather write it like this, so that each new column is named after its source column:
Imputed_Df[Imputed_Df.columns + "_SES"]= Smoothing(Imputed_Df)
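If you want to see the suffixed-column pattern in isolation, here is a minimal sketch that runs with pandas alone. Note the assumptions: the sample values and the name df are made up for illustration, and pandas' own ewm smoother stands in for SimpleExpSmoothing. The two are not interchangeable (statsmodels' in-sample predictions are one-step-ahead values and depend on how the initial level is set), so treat this only as a demonstration of how the columns line up.

```python
import pandas as pd

# Hypothetical hourly sample standing in for Imputed_Df (values are illustrative).
index = pd.date_range("2013-11-01 01:00", periods=6, freq="60min")
df = pd.DataFrame(
    {
        "Temperature": [13.6, 11.9, 12.8, 13.8, 14.1, 13.2],
        "Wind_Speed": [2.14, 2.31, 3.27, 2.23, 2.50, 2.41],
    },
    index=index,
)

# Recursive exponential smoothing with alpha = 0.2, applied to every column:
#   s[0] = x[0];  s[t] = 0.2 * x[t] + 0.8 * s[t-1]
smoothed = df.ewm(alpha=0.2, adjust=False).mean()

# add_suffix renames each smoothed column, so join adds exactly one
# new column per original column and the counts cannot get out of step.
result = df.join(smoothed.add_suffix("_SES"))
print(result.columns.tolist())
```

Because the smoothed frame is built column by column from the original, the number of new columns always equals the number of source columns, which is the condition the "Columns must be same length as key" check enforces.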