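The following code is supposed to encode categorical data, but it raises an error. X is a matrix with 3 columns, where the first column (index 0) is a categorical variable that I am trying to one-hot encode. Thanks!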
#Importing dataset
#Importing csv with pandas
import pandas as pd
dataset = pd.read_csv("Data.csv")
#Creating our matrix of independent variables (X)
X = dataset.iloc[:, :-1].values
#Creating the dependent variable vector (y)
y = dataset.iloc[:, 3]
Missing values
#Dealing with missing values
#import Imputer class from sklearn
from sklearn.preprocessing import Imputer
#create an object from the Imputer class
imputer = Imputer(missing_values = "NaN", strategy = 'mean')
#fit 'imputer' object to our independent variable matrix X
imputer.fit(X[:, 1:3])
#Updating our matrix X using transform method
X[:, 1:3] = imputer.transform(X[:, 1:3])
One-hot encoding
#import LabelEncoder, OneHotEncoder classes from sklearn
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
#create an object from the LabelEncoder class
labelencoder_X = LabelEncoder()
#update our matrix X with encoded values
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
#create an object from OneHotEncoder class
onehotencoder = OneHotEncoder(categorical_features = [0])
#fit 'onehotencoder' object to our first column
X = onehotencoder.fit_transform(X).toarray()
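For comparison, here is a minimal sketch of the same two steps written against newer scikit-learn releases, where `Imputer` and the `categorical_features` argument are no longer available (both were removed in version 0.22). It assumes `SimpleImputer` and `ColumnTransformer` can be used instead, and it builds a small made-up DataFrame in place of Data.csv, so the column names and values are hypothetical. The `LabelEncoder` step is dropped because `OneHotEncoder` accepts string categories directly in these releases.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for Data.csv: one categorical column, two numeric
# columns with gaps, and the dependent variable in column 3.
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, np.nan, 38.0],
    "Salary": [72000.0, np.nan, 54000.0, 61000.0],
    "Purchased": ["No", "Yes", "No", "No"],
})

# Same split as in the question: every column but the last goes into X.
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 3].values

# Replace missing numeric values with the column mean.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X[:, 1:3] = imputer.fit_transform(X[:, 1:3].astype(float))

# One-hot encode column 0 and pass the remaining columns through unchanged.
# sparse_threshold=0 asks ColumnTransformer to return a dense array.
encoder = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(), [0])],
    remainder="passthrough",
    sparse_threshold=0,
)
X_encoded = encoder.fit_transform(X)
```

With three distinct categories in column 0, `X_encoded` has three indicator columns followed by the two numeric columns.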
Error (in the one-hot encoding step):
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-58-2a8dcc36489d> in <module>()
8
9 #update our matrix X with encoded values
---> 10 X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
11
12 #create an object from OneHotEncoder class
TypeError: 'method' object is not subscriptable
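The only subscript operation on the failing line is `X[:, 0]`, so the message suggests that `X` was bound to a method rather than to an array when that cell ran. One way this can happen (an assumption, since the code as posted does not show it) is a call with its parentheses left off in an earlier notebook run, for example `.toarray` instead of `.toarray()`. The sketch below reproduces the same kind of error with a made-up two-row DataFrame and pandas' `to_numpy` method; the data and variable names are illustrative only.

```python
import pandas as pd

# Made-up data, used only to reproduce the error message.
dataset = pd.DataFrame({"Country": ["France", "Spain"], "Age": [44.0, 27.0]})

# Parentheses missing: X now refers to the bound method, not to an array.
X = dataset.to_numpy
try:
    X[:, 0]
except TypeError as exc:
    message = str(exc)  # the "not subscriptable" error from the traceback
else:
    message = None

# With the call actually made, X is an array and slicing works.
X = dataset.to_numpy()
first_column = X[:, 0]
```

Printing `type(X)` just before the failing line is a quick way to check whether this is what happened in the notebook session.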