我正在创建具有四个变量的预测机。当我添加变量时,一切都弄乱了,给了我:
ValueError: shapes (1,4) and (5,4) not aligned: 4 (dim 1) != 5 (dim 0)
代码
import pandas as pd
from pandas import DataFrame
from sklearn import linear_model
import tkinter as tk
import statsmodels.api as sm
#方法1:将数据导入Python
Stock_Market = pd.read_csv(r'Training_Nis_New2.csv')
df = DataFrame(Stock_Market,columns=['Month 1','Month 2','Month 3','Month
4','Month 5','Month 6','Month 7','Month 8',
'Month 9','Month 10','Month 11','Month
12','FSUTX','MMUKX','FUFRX','RYUIX','Interest R','Housing
Sale','Unemployement Rate','Conus Average Temperature
Rank','30FSUTX','30MMUKX','30FUFRX','30RYUIX'])
X = df[['Month 1','Interest R','Housing Sale','Unemployement Rate','Conus Average Temperature Rank']]
# here we have 2 variables for multiple regression. If you just want to use one variable for simple linear regression, then use X = df['Interest_Rate'] for example.Alternatively, you may add additional variables within the brackets
Y = df[['30FSUTX','30MMUKX','30FUFRX','30RYUIX']]
# with sklearn
regr = linear_model.LinearRegression()
regr.fit(X, Y)
print('Intercept: \n', regr.intercept_)
print('Coefficients: \n', regr.coef_)
# prediction with sklearn
# prediction with sklearn
HS=5.5
UR=6.7
CATR=8.9
New_Interest_R = 4.6
print('Predicted Stock Index Price: \n', regr.predict([[UR ,HS ,CATR
,New_Interest_R]]))
# with statsmodel
X = df[['Month 1','Interest R','Housing Sale','Unemployement Rate','Conus Average Temperature Rank']]
Y = df['30FSUTX']
print('\n\n*** Fund = FSUTX')
X = sm.add_constant(X) # adding a constant
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print_model = model.summary()
print(print_model)