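How to interpret the plot of Ordinal Ridge regression results (check EDIT)

Time: 2018-07-14 10:45:34

Tags: python plot regression

I'm trying to visualize the results of my regression (be merciful, this is my first time doing regression). This is the example I'm following, and I'm taking it at face value, but it produces a messy plot. I have two problems. One is the visualization. The other is an error I get when trying to plot x_validate vs. y_validate: ValueError: x and y must be the same size, even though both have 56 rows. Here is the code: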

# This is where I create the three parts
bow.fillna(0, inplace=True)
x_train, x_validate, x_test = np.split(bow.sample(frac=1), [int(.6*len(bow)), int(.8*len(bow))])
y_train = x_train['Rating']
y_validate = x_validate['Rating']
y_test = x_test['Rating']
x_train.drop('Rating', axis=1, inplace=True)
x_validate.drop('Rating', axis=1, inplace=True)
x_test.drop('Rating', axis=1, inplace=True)

# This is the regression part
regr = m.OrdinalRidge()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_validate)

# This is the plotting
plt.scatter(x_validate, y_validate,  color='black')  # <-- Here is where I get the error
plt.plot(x_validate, y_pred, color='blue', linewidth=1)
plt.xticks(())
plt.yticks(())
plt.show()

x_validate looks like this:

[image: x_validate]

y_validate looks like this:

[image: y_validate]

y_pred looks like this:

[image: y_pred]

The .size of each of the above is:

x_validate -> 3976 (but it has 56 rows and 71 columns)

y_validate -> 56

y_pred -> 56
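
The size mismatch can be reproduced in isolation. The sketch below uses random arrays as stand-ins for x_validate and y_validate (the data and the names x and y are assumptions, not the real bow frame). plt.scatter flattens both inputs and compares their sizes, so 56 × 71 = 3976 x-values against 56 y-values raises the error even though the row counts agree.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.random((56, 71))  # stand-in for x_validate: 56 rows, 71 feature columns
y = rng.random(56)        # stand-in for y_validate: one value per row

print(x.shape, x.size)    # (56, 71) 3976
print(y.shape, y.size)    # (56,) 56

# scatter compares the flattened sizes (3976 vs 56), not the row counts
try:
    plt.scatter(x, y, color="black")
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# A single feature column has 56 values, so it can be plotted against y
one_column = x[:, 0]
plt.scatter(one_column, y, color="black")
```

Plotting one column at a time works, but with 71 features that means 71 separate plots, which is why some form of dimensionality reduction is usually suggested instead.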

Any help would be appreciated.


EDIT:

Here is the code suggested by Ach113:

from sklearn.decomposition import PCA

pca = PCA(n_components=1)  # project x_validate onto a single component so it can be plotted against y
pca.fit(x_validate)
x_validate = pca.transform(x_validate)

plt.scatter(x_validate, y_validate,  color='black')
plt.plot(x_validate, y_pred, color='blue', linewidth=1)

plt.show()

This is what the resulting plot looks like:

[image: resulting plot]

How do I tell whether the regression is any good? I'm a bit lost...
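
One common way to judge a fit, independent of any PCA axis, is to compare the predictions with the true values directly. The following is a minimal sketch on made-up data: y_true and y_hat are hypothetical stand-ins for y_validate and y_pred, and the 1 to 5 rating scale is an assumption. It computes the mean absolute error and R², compares against a baseline that always predicts the median, and draws predicted vs. actual values with the diagonal as the "perfect prediction" reference.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(0)
# Hypothetical data: true ratings on an assumed 1..5 scale, and noisy predictions
y_true = rng.integers(1, 6, size=56)
y_hat = np.clip(np.rint(y_true + rng.normal(0, 0.8, size=56)), 1, 5)

mae = mean_absolute_error(y_true, y_hat)  # average size of the error, in rating units
r2 = r2_score(y_true, y_hat)              # 1.0 is perfect; 0.0 matches predicting the mean

# Baseline: always predict the median rating; a useful model should beat this
baseline = np.full(y_true.shape, np.median(y_true))
baseline_mae = mean_absolute_error(y_true, baseline)
print(f"MAE={mae:.3f}  baseline MAE={baseline_mae:.3f}  R^2={r2:.3f}")

# Predicted vs. actual: points close to the diagonal are good predictions
fig, ax = plt.subplots()
ax.scatter(y_true, y_hat, alpha=0.4, color="black")
ax.plot([1, 5], [1, 5], color="blue", linewidth=1)
ax.set_xlabel("actual Rating")
ax.set_ylabel("predicted Rating")
```

With the real data, y_validate and y_pred would take the place of the two hypothetical arrays. The numbers are easier to interpret than the PCA plot because they do not depend on how the 71 features were projected.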

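1 Answer:

Answer 0: (score: 1)

For the plot to work, the features must have the same dimensions as the output. X and Y must match exactly in size, so you will have to reduce the dimensionality of X with PCA: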

from sklearn.decomposition import PCA

n = 1  # components to keep; y here is one column of 56 values, so X must be reduced to one column too
pca = PCA(n_components=n)
pca.fit(x_train)
x_train = pca.transform(x_train)
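
As a rough end-to-end sketch of this idea, the snippet below fits the projection on the training features and applies it to the validation features, which are the ones being plotted. Several things here are assumptions: the data is random, the 168/56 split sizes are invented, and scikit-learn's Ridge stands in for mord's OrdinalRidge so the example has no extra dependency. The model is still trained on all 71 features; the single PCA component is used only as the x-axis of the plot.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge  # stand-in for mord's OrdinalRidge (assumption)

rng = np.random.default_rng(0)
# Hypothetical data with the same column count as in the question
x_train = rng.random((168, 71))
x_validate = rng.random((56, 71))
y_train = rng.integers(1, 6, size=168)
y_validate = rng.integers(1, 6, size=56)

# The regression uses every feature
regr = Ridge().fit(x_train, y_train)
y_pred = regr.predict(x_validate)

# Fit the projection on the training data, then apply it to the validation data
pca = PCA(n_components=1).fit(x_train)
x_1d = pca.transform(x_validate).ravel()  # shape (56,), same size as y_validate

# Sort by the x-axis value so the line is drawn left to right
order = np.argsort(x_1d)
plt.scatter(x_1d, y_validate, color="black")
plt.plot(x_1d[order], y_pred[order], color="blue", linewidth=1)
```

Because the predictions depend on all 71 features while the x-axis shows only one component, the blue line can still look jagged. The plot is a rough view of the fit, not a picture of the fitted function.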