How to run an OLS regression with a pandas datetime index as the independent variable (x)

Time: 2016-06-21 22:11:05

Tags: python pandas statsmodels

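I think I'm missing something basic, but here is my problem. I have 25 days of time series data, and I want to run an OLS regression where the Y values are the time series and the X values are the datetime index. When I run my code, I get the following error, because the index consists of datetime objects:

Exception: Invalid RHS type: <class 'pandas.tseries.index.DatetimeIndex'>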


1 Answer:

Answer 0: (score: 0)

Add this to your code:

df['jDate'] = df.index.to_julian_date()
ols(x=df.jDate, y=df.abc)
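To see why this helps: `to_julian_date()` turns each timestamp into a plain float (a day count), which a regression routine can accept as a numeric regressor. A minimal sketch, using a short made-up three-day index rather than the asker's data:

```python
import pandas as pd

# Hypothetical three-day index, for illustration only
idx = pd.date_range('2015-06-01', periods=3, freq='D')

# Each timestamp becomes a float Julian date
jd = idx.to_julian_date()
print(jd)

# Consecutive days differ by exactly 1.0, so a fitted slope is "per day"
step = jd[1] - jd[0]
```

Because the values are just day counts, the slope of the fitted line reads as the change in y per day.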

The complete version looks like:

import pandas as pd
from pandas.stats.api import ols  # removed in pandas 0.20; requires an older pandas

# 25 consecutive daily timestamps, 2015-06-01 through 2015-06-25
indx = pd.date_range('2015-06-01', periods=25, freq='D')
col = [51.22, 51.19, 51.21, 51.23, 51.22, 51.22, 51.22, 51.23, 51.24, 51.22, 51.20, 51.20, 51.20, 51.22, 51.22, 51.22, 51.22, 51.27, 51.28, 51.28, 51.30, 51.30, 51.28, 51.28, 51.27]
df = pd.DataFrame(col, index=indx, columns=['abc'])
# Convert the DatetimeIndex to Julian dates (floats) so it can be used as the regressor
df['jDate'] = df.index.to_julian_date()
sumstat = ols(x=df.jDate, y=df.abc)