ValueError: Found array with 0 feature(s) when using the sklearn.preprocessing module

Time: 2018-06-20 16:35:04

Tags: python python-3.x scikit-learn sklearn-pandas

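I have seen many questions about this error, but I cannot work out how they relate to my code or my problem.

I am trying to fill in the NaN values in data loaded from a sample CSV file I found on the internet. My code is quite simple: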

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Importing stuff.
from sklearn.preprocessing import Imputer
import pandas

# Loading the corrupt data
corrupt_data = pandas.read_csv('SampleCorruptData.csv')

#Creating Imputer object
imputer = Imputer(missing_values = 'NaN', strategy= "mean", axis = 0)

owner_id = corrupt_data.iloc[:,2:]

print(owner_id)

imputer = imputer.fit(owner_id.iloc[:,2:])

owner_id.iloc[:,2:] = imputer.transform(owner_id[:,2:])

print(owner_id)

The CSV file:

GroupName,Groupcode,GroupOwner
System Administrators,sysadmin,13456
Independence High Teachers,HS Teachers,
John Glenn Middle Teachers,MS Teachers,13458
Liberty Elementary Teachers,Elem Teachers,13559
1st Grade Teachers,1stgrade,NaN
2nd Grade Teachers,2nsgrade,13561
3rd Grade Teachers,3rdgrade,13562
Guidance Department,guidance,NaN
Independence Math Teachers,HS Math,13660
Independence English Teachers,HS English,13661
John Glenn 8th Grade Teachers,8thgrade,
John Glenn 7th Grade Teachers,7thgrade,13452
Elementary Parents,Elem Parents,NaN
Middle School Parents,MS Parents,18001
High School Parents,HS Parents,18002

You can see the NaN values.

The error I get:

Traceback (most recent call last):

  File "<ipython-input-21-1bfc8eb216cc>", line 1, in <module>
    runfile('/home/teoman/Desktop/data science/Fix Corrupt Data/imputation.py', wdir='/home/teoman/Desktop/data science/Fix Corrupt Data')

  File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/teoman/Desktop/data science/Fix Corrupt Data/imputation.py", line 18, in <module>
    imputer = imputer.fit(owner_id.iloc[:,2:])

  File "/home/teoman/.local/lib/python3.5/site-packages/sklearn/preprocessing/imputation.py", line 155, in fit
    force_all_finite=False)

  File "/home/teoman/.local/lib/python3.5/site-packages/sklearn/utils/validation.py", line 470, in check_array
    context))

ValueError: Found array with 0 feature(s) (shape=(15, 0)) while a minimum of 1 is required.

What should I do here?

1 answer:

Answer 0: (score: 3)

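If we follow the error message, we can find the fix.

Your error is:

ValueError: Found array with 0 feature(s) (shape=(15, 0)) while a minimum of 1 is required.

In other words, the imputer requires at least 1 feature. The Imputer docs describe the input as:

Parameters: X: numpy array of shape [n_samples, n_features]

In your case you have 15 n_samples and 0 n_features. If you transform your data so that n_features > 0, the problem is solved.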
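Tracing the shapes in the question's code suggests where the features disappear: the CSV has three columns, so the first `iloc[:, 2:]` already leaves only GroupOwner, and applying `iloc[:, 2:]` a second time selects nothing. The sketch below inlines the CSV from the question with `io.StringIO` (an assumption made only so the example is self-contained; the names `CSV_TEXT` and `empty` are illustrative) and prints the shape at each step.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the sketch runs without the file.
CSV_TEXT = """GroupName,Groupcode,GroupOwner
System Administrators,sysadmin,13456
Independence High Teachers,HS Teachers,
John Glenn Middle Teachers,MS Teachers,13458
Liberty Elementary Teachers,Elem Teachers,13559
1st Grade Teachers,1stgrade,NaN
2nd Grade Teachers,2nsgrade,13561
3rd Grade Teachers,3rdgrade,13562
Guidance Department,guidance,NaN
Independence Math Teachers,HS Math,13660
Independence English Teachers,HS English,13661
John Glenn 8th Grade Teachers,8thgrade,
John Glenn 7th Grade Teachers,7thgrade,13452
Elementary Parents,Elem Parents,NaN
Middle School Parents,MS Parents,18001
High School Parents,HS Parents,18002
"""

corrupt_data = pd.read_csv(io.StringIO(CSV_TEXT))

# First slice: columns from position 2 onward, i.e. only GroupOwner.
owner_id = corrupt_data.iloc[:, 2:]
print(owner_id.shape)  # (15, 1)

# Second slice, as in the question: owner_id has a single column,
# so asking for position 2 onward returns no columns at all.
empty = owner_id.iloc[:, 2:]
print(empty.shape)  # (15, 0)
```

Under that reading, passing `owner_id` itself to the imputer, rather than slicing it again, would give it the one feature it needs.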

Keep in mind that a 1D numpy array has no column dimension; if you reshape it with numpy.reshape() or convert it to a pd.DataFrame, you get 1 n_features.
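A minimal sketch of the reshape step follows. It assumes scikit-learn 0.20 or later, where `SimpleImputer` from `sklearn.impute` takes the place of the `Imputer` class used in the question; the array holds the GroupOwner values from the question's CSV, and the variable names are illustrative.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# GroupOwner values from the question; np.nan marks the missing entries.
owner_1d = np.array([
    13456, np.nan, 13458, 13559, np.nan,
    13561, 13562, np.nan, 13660, 13661,
    np.nan, 13452, np.nan, 18001, 18002,
])

# A 1D array has shape (n_samples,). Reshaping to (n_samples, 1)
# gives the imputer the single feature column it expects.
owner_2d = owner_1d.reshape(-1, 1)

# Replace each NaN with the mean of the observed values in the column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
filled = imputer.fit_transform(owner_2d)

print(owner_1d.shape)  # (15,)
print(owner_2d.shape)  # (15, 1)
print(filled.shape)    # (15, 1)
```

`reshape(-1, 1)` lets numpy infer the number of rows, so the same line works for a column of any length.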

I hope this helps.

Thanks