How do I read all the columns of the Boston dataset with pandas?

Time: 2017-11-07 15:18:11

Tags: python pandas

Here is my code:

df = pd.read_csv('boston_X.csv', sep = ';')
df1 = pd.read_csv('boston_y.csv', sep = ';')

#Split the data into training and testing sets

Features = df[list(df.columns)[:-1]]
print Features.shape
Quality = df1['label']

Features_train, Features_test, Quality_train, Quality_test = train_test_split(
    Features, Quality)

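I have separate files for the attributes and the class labels. All attribute values from boston_X.csv should be stored in Features, but I keep getting an error. Here is the traceback: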

  "This module will be removed in 0.20.", DeprecationWarning)
(506, 0)
Traceback (most recent call last):
  File "Q3_BostonData.py", line 25, in <module>
    regressor.fit(Features_train, Quality_train)
  File "/home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 512, in fit
    y_numeric=True, multi_output=True)
  File "/home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "/home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 424, in check_array
    context))
ValueError: Found array with 0 feature(s) (shape=(379, 0)) while a minimum of 1 is required.

(506, 0) is the shape of Features. Here is boston_X.csv as pandas prints it:

       x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13
0  0.00632,18,2.31,0,0.538,6.575,65.2,4.09,1,296,...
1  0.02731,0,7.07,0,0.469,6.421,78.9,4.9671,2,242...
2  0.02729,0,7.07,0,0.469,7.185,61.1,4.9671,2,242...
3  0.03237,0,2.18,0,0.458,6.998,45.8,6.0622,3,222...
4  0.06905,0,2.18,0,0.458,7.147,54.2,6.0622,3,222...
5  0.02985,0,2.18,0,0.458,6.43,58.7,6.0622,3,222,...
6  0.08829,12.5,7.87,0,0.524,6.012,66.6,5.5605,5,...
7  0.14455,12.5,7.87,0,0.524,6.172,96.1,5.9505,5,...
8  0.21124,12.5,7.87,0,0.524,5.631,100,6.0821,5,3...
9  0.17004,12.5,7.87,0,0.524,6.004,85.9,6.5921,5,...

And this is what boston_y.csv looks like:

 label
0   24.0
1   21.6
2   34.7
3   33.4
4   36.2
5   28.7
6   22.9
7   27.1
8   16.5
9   18.9

1 answer:

Answer 0: (score: 0)

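I just added the following statement, which lists every attribute/column name explicitly and no longer passes sep = ';', so pandas falls back to its default comma separator.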

df =  pd.read_csv("boston_X.csv", usecols=['x1','x2','x3','x4','x5','x6','x7','x8','x9','x10','x11','x12','x13'])

Now it works.
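
For anyone hitting the same error, here is a minimal sketch of what appears to be going on. It assumes the file is comma-separated, as the printed header `x1,x2,...,x13` suggests. The `csv_text` string is a shortened stand-in for boston_X.csv (only the first three columns and two rows), and the names `wrong`, `right` and `features` are made up for this illustration.

```python
import io

import pandas as pd

# Shortened stand-in for boston_X.csv: first three columns, two rows only,
# so the example runs without the real file.
csv_text = "x1,x2,x3\n0.00632,18,2.31\n0.02731,0,7.07\n"

# With sep=';' no separator is ever found, so each whole line is read
# as a single column whose name is the entire header string.
wrong = pd.read_csv(io.StringIO(csv_text), sep=";")
print(wrong.shape)
print(list(wrong.columns))

# Dropping the "last" column of a one-column frame leaves zero features,
# which is what the (506, 0) shape in the question reflects.
print(wrong[list(wrong.columns)[:-1]].shape)

# With the default comma separator each attribute gets its own column.
right = pd.read_csv(io.StringIO(csv_text))

# The labels live in a separate file, so all columns are features here.
features = right
print(features.shape)
```

So the usecols list is probably not what fixed it; dropping `sep = ';'` is. Also note that with the labels in boston_y.csv, the `[:-1]` slice in the question would still discard x13 after the separator is fixed, so keeping every column of the frame looks like the safer choice.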