The exact error is "ValueError: Number of labels=21 does not match number of samples=82", raised on this line:
rf.fit(train_X, train_y)
The code is as follows:
df = pd.read_csv('C:\\git\\MetalRater\\Metal Sheet 2 - Sheet1 - TEST.csv', encoding="ISO-8859-1")
# Define X and y (X = feature columns, y = target column)
features = ["Emotion", "Solid", "Variety", "Length (mins)"]
y = df["RL"]
train_X, test_X, train_y, test_y = train_test_split(df[features], y, test_size=0.2, random_state=0)
print(len(train_X))
print(len(train_y))
def find_n_estimators(train_X, train_y, test_X, test_y):
    accuracy_forest_base = 0
    for i in range(10, 1000, 10):
        rf = RandomForestRegressor(random_state = 0, n_estimators = i)
        rf.fit(train_X, train_y)
        predictions_forest = rf.predict(test_X)
        for i in range(len(predictions_forest)):
            predictions_forest[i] = round(predictions_forest[i],0)
        accuracy_forest = accuracy_score(test_y, predictions_forest)
        if accuracy_forest > accuracy_forest_base:
            accuracy_forest_base = accuracy_forest
            n_est = i
        else:
            break
    return n_est
The print statements confirm that both have a length of 82.
Edit: As requested below, I printed the following:
print(np.shape(train_X)[0])
print(np.shape(train_y)[0])
The results were "82" and "()" respectively.
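Answer 0: (score: 0)
I believe you are calling the function with its arguments in the wrong order. P.S.: I can't comment yet, so I have to post this as an answer.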
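To illustrate how a swapped argument order could produce exactly these numbers: train_test_split returns its pieces as train_X, test_X, train_y, test_y, while find_n_estimators is declared as (train_X, train_y, test_X, test_y). If the call, which the question does not show, passed the four values positionally in the order they were returned, the 21-row test_X would be bound to train_y. The sketch below uses plain lists and no sklearn; split_in_sklearn_order and check_lengths are hypothetical stand-ins, and the 103-row dataset size is an assumption inferred from 82 + 21, not something stated in the question.

```python
import math


def split_in_sklearn_order(X, y, test_size=0.2):
    """Hypothetical stand-in for train_test_split (no shuffling).

    Returns the pieces in the same order as the question's line 8:
    train_X, test_X, train_y, test_y.
    """
    n_test = math.ceil(len(X) * test_size)
    n_train = len(X) - n_test
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]


def check_lengths(train_X, train_y, test_X, test_y):
    """Hypothetical helper with the same parameter order as find_n_estimators.

    Raises when the training features and labels differ in length,
    which is the condition that makes fit() fail.
    """
    if len(train_X) != len(train_y):
        raise ValueError(
            f"labels={len(train_y)} does not match samples={len(train_X)}"
        )
    return len(train_X), len(test_X)


# Assumed dataset size: 103 rows gives an 82/21 split at test_size=0.2.
X = [[i, i + 1] for i in range(103)]
y = list(range(103))
train_X, test_X, train_y, test_y = split_in_sklearn_order(X, y)

# Positional call in the order the split returned them:
# test_X (21 rows) is bound to the train_y parameter.
try:
    check_lengths(train_X, test_X, train_y, test_y)
    swapped_error = None
except ValueError as exc:
    swapped_error = str(exc)

# Keyword arguments make the binding explicit and independent of order.
ok = check_lengths(train_X=train_X, train_y=train_y,
                   test_X=test_X, test_y=test_y)
```

Passing the arguments by keyword, as in the last call, is one way to rule this class of mistake out when a function's parameter order differs from the order in which the values were produced.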
Answer 1: (score: 0)