I have the following DataFrame, which I call main_frame:
               Value     Value      1lag      2lag      3lag      4lag
Date
2005-04-01  0.824427  0.892308  1.000000  0.000000  0.000000  0.000000
2005-05-01  0.778626  0.953846  0.892308  1.000000  0.000000  0.000000
2005-06-01  0.717557  1.000000  0.953846  0.892308  1.000000  0.000000
2005-07-01  0.725191  0.000000  1.000000  0.953846  0.892308  1.000000
2005-08-01  0.717557  1.000000  0.000000  1.000000  0.953846  0.892308
2005-09-01  0.740458  0.861538  1.000000  0.000000  1.000000  0.953846
2005-10-01  0.732824  0.877193  0.861538  1.000000  0.000000  1.000000
2005-11-01  0.732824  1.000000  0.877193  0.861538  1.000000  0.000000
2005-12-01  0.641221  1.000000  1.000000  0.877193  0.861538  1.000000
2006-01-01  0.709924  0.614035  1.000000  1.000000  0.877193  0.861538
2006-02-01  0.770992  0.649123  0.614035  1.000000  1.000000  0.877193
I built the following model:
predictor=main_frame.iloc[:,1:]
target=main_frame.iloc[:,0]
model=LinearRegression()
model.fit(X=predictor,y=target)
I know that to predict I should now use model.predict(), but I'm having trouble understanding how the arguments of the predict function work. I tried:
prediction=model.predict(target)
print predict
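But this gives me an error, and I believe I'm misunderstanding something about the arguments.
How should I call predict so that the prediction works?
EDIT
I have added the traceback: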
/Library/Python/2.7/site-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)
Traceback (most recent call last):
File "/Users/file.py", line 61, in <module>
prediction=model.predict(target)
File "/Library/Python/2.7/site-packages/sklearn/linear_model/base.py", line 200, in predict
return self._decision_function(X)
File "/Library/Python/2.7/site-packages/sklearn/linear_model/base.py", line 185, in _decision_function
dense_output=True) + self.intercept_
File "/Library/Python/2.7/site-packages/sklearn/utils/extmath.py", line 184, in safe_sparse_dot
return fast_dot(a, b)
ValueError: shapes (1,127) and (144,) not aligned: 127 (dim 1) != 144 (dim 0)
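EDIT 2
Let me put my question another way, so that it is easier to answer:
Given the model above, how do I find the predicted value of the target variable for the next period?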
Answer 0 (score: 2)
You are passing the wrong argument to the predict function. Try this:
prediction=model.predict(predictor)
print(prediction)
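Note that the model was trained on the "predictor" variable, so you can only predict on data that has exactly the same number of columns as "predictor".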
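To make the column requirement concrete, here is a minimal sketch. It uses a small made-up stand-in for main_frame (random numbers, a single "Value" target and four lag columns), so the data and the next_row values are assumptions for illustration only. The shapes in the traceback, (1,127) and (144,), are consistent with the 1-D target being treated as a single sample with 127 features while the fitted model expects 144. For the next period, one plausible approach is to build a one-row frame of feature values with the same columns as predictor; how those values are obtained depends on your data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Stand-in for main_frame: a target column followed by lag columns.
# The numbers are random and purely illustrative.
rng = np.random.RandomState(0)
main_frame = pd.DataFrame(
    rng.rand(11, 5),
    columns=["Value", "1lag", "2lag", "3lag", "4lag"],
    index=pd.date_range("2005-04-01", periods=11, freq="MS"),
)

predictor = main_frame.iloc[:, 1:]  # 2-D features, shape (11, 4)
target = main_frame.iloc[:, 0]      # 1-D target, shape (11,)

model = LinearRegression()
model.fit(X=predictor, y=target)

# In-sample predictions: pass 2-D input with the same columns used in fit().
fitted = model.predict(predictor)

# Next period: one new row of feature values (hypothetical numbers here),
# with the same columns in the same order as predictor.
next_row = pd.DataFrame(
    [[0.65, 0.61, 1.0, 1.0]],
    columns=predictor.columns,
    index=pd.to_datetime(["2006-03-01"]),
)
next_value = model.predict(next_row)

print(fitted.shape)      # one prediction per training row
print(next_value.shape)  # a single prediction for the new row
```

Passing target to predict fails because it is the thing being predicted, not a set of features. Whatever you pass must have one column per feature the model was fitted on.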