ValueError when implementing train_test_split

Time: 2018-05-11 21:23:09

Tags: python machine-learning kaggle

I'm working through a machine learning tutorial on Kaggle, and although I've followed it line by line, I'm getting a ValueError. I'm trying to practice model validation by splitting the data into training and validation sets. Here is my code:

    import pandas as pd
    from sklearn.metrics import mean_absolute_error
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import train_test_split


    # Load the training data
    main_file_path = '../input/train.csv'
    data = pd.read_csv(main_file_path)

    # Prediction target
    y = data.SalePrice

    # Feature columns used as predictors
    data_predictors = ['LotArea', 'YearBuilt', '1stFlrSF', '2ndFlrSF', 'FullBath', 'BedroomAbvGr', 'TotRmsAbvGrd']
    x = data[data_predictors]

    train_x, val_x, train_y, val_x = train_test_split(x, y,random_state = 0)

    data_model = DecisionTreeRegressor()
    data_model.fit(train_x,train_y)

    data_prediction = data_model.predict(val_x)
    print(mean_absolute_error(val_y, data_prediction))

I'm a beginner in ML, so I compared my code with the author's, and the implementation looks the same.

The error points to this line:

    data_prediction = data_model.predict(val_x)

1 Answer:

Answer 0: (score: 2)

Although the error is raised on the line you pointed out, the actual problem is on this line:

    train_x, val_x, train_y, val_x = train_test_split(x, y,random_state = 0)

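Notice that you have two val_x. The second val_x should be val_y. What happened is that val_x, which should be a 2-D input array, was overwritten with what should have been the val_y values, a 1-D array of targets. That is why the ValueError says you passed a 1-D array where a 2-D array was expected.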
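The overwrite can be reproduced without scikit-learn. The sketch below is only an illustration: `toy_split` is a hypothetical stand-in that returns four pieces in the same order as `train_test_split`, and the small lists are made-up data, not the Kaggle dataset.

```python
# Hypothetical stand-in for train_test_split (an assumption for illustration):
# returns train features, validation features, train targets, validation targets.
def toy_split(x, y, n_train):
    return x[:n_train], x[n_train:], y[:n_train], y[n_train:]


x = [[1, 10], [2, 20], [3, 30], [4, 40]]  # 2-D: one row of features per sample
y = [100, 200, 300, 400]                  # 1-D: one target per sample

# Buggy unpacking: val_x appears twice, so the fourth returned value
# (the validation targets) overwrites the second (the validation features).
train_x, val_x, train_y, val_x = toy_split(x, y, 3)
buggy_val_x = val_x  # now holds the 1-D targets, not the feature rows

# Corrected unpacking: the fourth name is val_y.
train_x, val_x, train_y, val_y = toy_split(x, y, 3)

print(buggy_val_x)  # 1-D list of targets
print(val_x)        # 2-D list of feature rows
print(val_y)        # 1-D list of targets
```

Python assigns unpacking targets left to right, so a repeated name silently keeps the last value. With the corrected names, `val_x` stays 2-D for `predict` and `val_y` exists for `mean_absolute_error`.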