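I'm trying to visualize my regression results (please be kind, this is my first regression). This is the example I'm following, but it produces a messy plot (I took the example at face value). I have two problems. One is the visualization. The other is an error I get when trying to plot x_validate vs. y_validate: ValueError: x and y must be the same size, even though both have 56 rows. Here is the code: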
# This is where I create the three parts
bow.fillna(0, inplace=True)
x_train, x_validate, x_test = np.split(bow.sample(frac=1), [int(.6*len(bow)), int(.8*len(bow))])
y_train = x_train['Rating']
y_validate = x_validate['Rating']
y_test = x_test['Rating']
x_train.drop('Rating', axis=1, inplace=True)
x_validate.drop('Rating', axis=1, inplace=True)
x_test.drop('Rating', axis=1, inplace=True)
# This is the regression part
regr = m.OrdinalRidge()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_validate)
# This is the plotting
plt.scatter(x_validate, y_validate, color='black') # <-- Here is where I get the error
plt.plot(x_validate, y_pred, color='blue', linewidth=1)
plt.xticks(())
plt.yticks(())
plt.show()
[Images: contents of x_validate, y_validate, and y_pred]
The .size of each of the above is:
x_validate -> 3976 (but it has 56 rows and 71 columns)
y_validate -> 56
y_pred -> 56
Any help would be greatly appreciated.
Edit:
This is the code suggested by Ach113:
from sklearn.decomposition import PCA
pca = PCA(n_components=1)  # reduce the 71 feature columns to 1 so x has as many values as y
pca.fit(x_validate)
x_validate = pca.transform(x_validate)
plt.scatter(x_validate, y_validate, color='black')
plt.plot(x_validate, y_pred, color='blue', linewidth=1)
plt.show()
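This is what the resulting plot looks like:
[Image: resulting plot]
How can I tell whether the regression is good? I'm a bit lost...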
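Answer 0 (score: 1)
For the plot to work, x and y must contain the same number of values. Here X has 71 columns per row while y has a single value per row, so you will have to reduce the dimensionality of X with PCA: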
from sklearn.decomposition import PCA
pca = PCA(n_components=1)  # 1 component, since y holds a single value per row
pca.fit(x_train)
x_train = pca.transform(x_train)
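The same idea applied to the validation set might look roughly like the sketch below. It uses synthetic arrays in place of the question's data, so the names x_train_demo, x_validate_demo, and y_validate_demo, the 168-row training size, and the 1-to-5 rating range are all assumptions made for illustration. The projection is fitted on the training features and then applied to the validation features, and the points are sorted by x so that a line plot does not zig-zag.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-ins for the question's data (assumed shapes: 71 features, 56 validation rows).
rng = np.random.default_rng(0)
x_train_demo = rng.normal(size=(168, 71))
x_validate_demo = rng.normal(size=(56, 71))
y_validate_demo = rng.integers(1, 6, size=56)  # hypothetical ratings from 1 to 5

# Fit the projection on the training features only, then apply it to the validation features.
pca = PCA(n_components=1)
pca.fit(x_train_demo)
x_validate_1d = pca.transform(x_validate_demo).ravel()  # shape (56,)

# x and y now hold the same number of values, which is what plt.scatter requires.
assert x_validate_1d.size == y_validate_demo.size

# Sorting by x keeps a line plot from jumping back and forth between unordered points.
order = np.argsort(x_validate_1d)
x_sorted = x_validate_1d[order]
y_sorted = y_validate_demo[order]
```

With these arrays, plt.scatter(x_validate_1d, y_validate_demo) receives 56 values on each axis, and a line drawn through x_sorted is no longer tangled. Keep in mind that a single principal component retains only part of the variance of the 71 features, so the plot is a rough view of the data rather than a faithful picture of the fit.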