Here is my code:
df = pd.read_csv('boston_X.csv', sep = ';')
df1 = pd.read_csv('boston_y.csv', sep = ';')
#Split the data into training and testing sets
Features = df[list(df.columns)[:-1]]
print Features.shape
Quality = df1['label']
Features_train, Features_test, Quality_train, Quality_test = train_test_split(
Features, Quality)
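I have separate files for the attributes and the class labels. All the attribute values from boston_X.csv should be stored in Features, but I keep getting this error. Here is the traceback: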
"This module will be removed in 0.20.", DeprecationWarning)
(506, 0)
Traceback (most recent call last):
File "Q3_BostonData.py", line 25, in <module>
regressor.fit(Features_train, Quality_train)
File "/home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 512, in fit
y_numeric=True, multi_output=True)
File "/home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 521, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File "/home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 424, in check_array
context))
ValueError: Found array with 0 feature(s) (shape=(379, 0)) while a minimum of 1 is required.
(506, 0) is the shape of Features. Here is boston_X.csv as pandas prints it:
x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13
0 0.00632,18,2.31,0,0.538,6.575,65.2,4.09,1,296,...
1 0.02731,0,7.07,0,0.469,6.421,78.9,4.9671,2,242...
2 0.02729,0,7.07,0,0.469,7.185,61.1,4.9671,2,242...
3 0.03237,0,2.18,0,0.458,6.998,45.8,6.0622,3,222...
4 0.06905,0,2.18,0,0.458,7.147,54.2,6.0622,3,222...
5 0.02985,0,2.18,0,0.458,6.43,58.7,6.0622,3,222,...
6 0.08829,12.5,7.87,0,0.524,6.012,66.6,5.5605,5,...
7 0.14455,12.5,7.87,0,0.524,6.172,96.1,5.9505,5,...
8 0.21124,12.5,7.87,0,0.524,5.631,100,6.0821,5,3...
9 0.17004,12.5,7.87,0,0.524,6.004,85.9,6.5921,5,...
And this is what boston_y.csv looks like:
label
0 24.0
1 21.6
2 34.7
3 33.4
4 36.2
5 28.7
6 22.9
7 27.1
8 16.5
9 18.9
Answer 0 (score: 0)
I just added the following statement, which names every attribute/column explicitly.
df = pd.read_csv("boston_X.csv", usecols=['x1','x2','x3','x4','x5','x6','x7','x8','x9','x10','x11','x12','x13'])
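It works now. Note that this call no longer passes sep=';', so pandas falls back to its default comma delimiter, which matches the file.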
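For anyone who wants to reproduce the (506, 0) shape, here is a minimal sketch. It assumes boston_X.csv is a plain comma-separated file with a header row, which is what the printed frame in the question suggests. The three-row CSV_TEXT string is a hypothetical stand-in for the real file, and the variable names are mine, not the asker's.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for boston_X.csv (comma-separated, header row).
CSV_TEXT = "x1,x2,x3\n0.00632,18,2.31\n0.02731,0,7.07\n0.02729,0,7.07\n"

# Wrong delimiter: the text contains no ';', so each line is read as one field
# and the frame ends up with a single column named "x1,x2,x3".
wrong = pd.read_csv(io.StringIO(CSV_TEXT), sep=';')
print(wrong.shape)

# Dropping the "last" column now drops the only column, leaving zero features.
features_wrong = wrong[list(wrong.columns)[:-1]]
print(features_wrong.shape)

# Default delimiter (comma): every attribute gets its own column.
right = pd.read_csv(io.StringIO(CSV_TEXT))
print(right.shape)
```

One more thing to check in the original script: the labels come from a separate file, so the [:-1] slice would still discard x13 after the delimiter is fixed. If all thirteen attributes are meant to be features, passing the whole frame (Features = df) is probably what was intended.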