Numpy error "could not convert string to float: 'Illinois'"

Time: 2017-12-21 02:58:05

Tags: python python-3.x numpy

I created the table below in Google Sheets and downloaded it as a CSV file.

[Image: the table created in Google Sheets]

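My code is posted below. I'm really not sure where it fails. I tried highlighting and running the code line by line, and it keeps throwing this error.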

# Data Preprocessing

# Import Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Import Dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values

# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 1:5 ])
X[:, 1:6] = imputer.transform(X[:, 1:5])

The error I get is:

Could not convert string to float: 'Illinois'

[Image: my error message]

The traceback also shows this line just above it:
array = np.array(array, dtype=dtype, order=order, copy=copy)

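It looks as if my code cannot read the GPA column, which contains floats. Maybe I did not create that column correctly and need to specify that its values are floats?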

***Updating with the full error message:

     [15]: runfile('/Users/jim/Desktop/Machine Learning Class/Part 1/Machine Learning A-Z Template Folder/Part 1 - Data Preprocessing/data_preprocessing_template2.py', wdir='/Users/jim/Desktop/Machine Learning Class/Part 1/Machine Learning A-Z Template Folder/Part 1 - Data Preprocessing')
Traceback (most recent call last):

  File "<ipython-input-15-5f895cf9ba62>", line 1, in <module>
    runfile('/Users/jim/Desktop/Machine Learning Class/Part 1/Machine Learning A-Z Template Folder/Part 1 - Data Preprocessing/data_preprocessing_template2.py', wdir='/Users/jim/Desktop/Machine Learning Class/Part 1/Machine Learning A-Z Template Folder/Part 1 - Data Preprocessing')

  File "/Users/jim/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 710, in runfile
    execfile(filename, namespace)

  File "/Users/jim/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 101, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/Users/jim/Desktop/Machine Learning Class/Part 1/Machine Learning A-Z Template Folder/Part 1 - Data Preprocessing/data_preprocessing_template2.py", line 16, in <module>
    imputer = imputer.fit(X[:, 1:5 ])

  File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/imputation.py", line 155, in fit
    force_all_finite=False)

  File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)

ValueError: could not convert string to float: 'Illinois'

2 answers:

Answer 0 (score: 1)

The full error you actually get is this (it would have helped a lot if you had pasted all of it):

Traceback (most recent call last):

  File "<ipython-input-7-6a92ceaf227a>", line 8, in <module>
    imputer = imputer.fit(X[:, 1:5 ])

  File "C:\Users\Fatih\Anaconda2\lib\site-packages\sklearn\preprocessing\imputation.py", line 155, in fit
    force_all_finite=False)

  File "C:\Users\Fatih\Anaconda2\lib\site-packages\sklearn\utils\validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)

ValueError: could not convert string to float: Illinois

If you look closely, the traceback shows where it fails:

imputer = imputer.fit(X[:, 1:5 ])

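This happens because you are trying to take the mean of a categorical variable, which makes no sense: the slice X[:, 1:5] includes a text column holding values such as 'Illinois', and the imputer cannot convert those to floats.

This has already been asked and answered in this StackOverflow thread.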
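
As a minimal sketch of one way around this, the mean imputation can be restricted to the numeric columns only. The column names and values below are hypothetical stand-ins, since the real contents of Data2.csv are not shown in the question, and pandas' own `fillna` is used here in place of the scikit-learn imputer. (In newer scikit-learn releases `Imputer` was replaced by `sklearn.impute.SimpleImputer`; the same idea applies there: pass it only numeric columns.)

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Data2.csv; the real column names and values are unknown.
dataset = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "State": ["Illinois", "Ohio", "Iowa"],
    "Age": [20.0, np.nan, 22.0],
    "GPA": [3.5, 3.0, np.nan],
    "Admitted": ["Yes", "No", "Yes"],
})

# Select only the numeric columns; text columns such as State have no mean.
numeric_cols = dataset.select_dtypes(include="number").columns

# Replace each missing value with the mean of its own column.
dataset[numeric_cols] = dataset[numeric_cols].fillna(dataset[numeric_cols].mean())

# Split into features and target as in the question (last column is the target).
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

print(dataset)
```

The text columns are left untouched, so they would still need to be encoded separately before being passed to a model.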

Answer 1 (score: -1)

Change the line:

dataset = pd.read_csv('Data2.csv')

to:

dataset = pd.read_csv('Data2.csv', delimiter=";")
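
Whether this change helps depends on which delimiter the file actually uses. A rough way to check is to compare the shape of the parsed frame under each setting. The inline text below is a hypothetical semicolon-separated sample, not the asker's file:

```python
import io
import pandas as pd

# Hypothetical semicolon-separated content standing in for Data2.csv.
raw = "Name;State;GPA\nAnn;Illinois;3.5\nBob;Ohio;3.0\n"

# With the default comma delimiter, each row collapses into a single column.
wrong = pd.read_csv(io.StringIO(raw))

# With the matching delimiter, the three columns are parsed separately.
right = pd.read_csv(io.StringIO(raw), delimiter=";")

print(wrong.shape)   # a single column means the delimiter did not match
print(right.shape)
print(right.dtypes)  # shows which columns were parsed as numbers
```

If the file already parses into the expected number of columns with the default settings, the delimiter is not the problem, and the text column reaching the imputer is the more likely cause.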