This is my code:
#Importing the dataset
dataset = pd.read_csv('insurance.csv')
X = dataset.iloc[:, :-2].values
X = pd.DataFrame(X)
#Encoding Categorical data
from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
X[:, 1:2] = labelencoder_X.fit_transform(X[:, 1:2])
Sample dataset:
age sex bmi children smoker region charges
19 female 27.9 0 yes southwest 16884.924
18 male 33.77 1 no southeast 1725.5523
28 male 33 3 no southeast 4449.462
33 male 22.705 0 no northwest 21984.47061
32 male 28.88 0 no northwest 3866.8552
31 female 25.74 0 no southeast 3756.6216
46 female 33.44 1 no southeast 8240.5896
37 female 27.74 3 no northwest 7281.5056
37 male 29.83 2 no northeast 6406.4107
60 female 25.84 0 no northwest 28923.13692
When I run the label encoder, I get the following error:
File "E:\Anaconda2\lib\site-packages\pandas\core\generic.py", line 1840, in _get_item_cache
    res = cache.get(item)
TypeError: unhashable type
What could be causing this error?
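Answer 0: (score: 0)
Your problem is that you are trying to label-encode a slice. X[:, 1:2] is NumPy-style indexing, but X is a DataFrame, so pandas treats the pair of slices as a column key and fails when it tries to hash it.
Steps to reproduce the error: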
df = pd.DataFrame({"score":[0,1],"gender":["male","female"]})
enc = LabelEncoder()
enc.fit_transform(df[:,1:2])
...
TypeError: unhashable type: 'slice'
Access your column correctly, so that LabelEncoder receives an array-like of shape (n_samples,): a numpy array, a list, or a pandas Series (see the docs).
Proof:
enc.fit_transform(df["gender"])
array([1, 0])
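Applied to a layout like the one in the question, positional access with .iloc is one way to get the 1-D column. The following is a minimal sketch, assuming a three-row excerpt of the sample data (the frame below is built by hand for illustration, not read from insurance.csv):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical three-row excerpt of the question's sample data.
X = pd.DataFrame({
    "age": [19, 18, 28],
    "sex": ["female", "male", "male"],
    "bmi": [27.9, 33.77, 33.0],
    "children": [0, 1, 3],
})

enc = LabelEncoder()
# .iloc[:, 1] selects the second column by position as a 1-D Series,
# which is the shape LabelEncoder expects.
sex_codes = enc.fit_transform(X.iloc[:, 1])
# Write the integer codes back by column name.
X["sex"] = sex_codes

print(X["sex"].tolist())
```

LabelEncoder assigns codes in sorted label order, so here "female" becomes 0 and "male" becomes 1.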
Finally, if you want to mutate df in place, you can do the following:
for col in df.select_dtypes(include="object").columns:
    df[col] = enc.fit_transform(df[col])
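Note that the loop above refits the same enc for every column, so afterwards it only remembers the labels of the last column it saw. If you need to map codes back to labels later, one option is to keep a separate encoder per column. This is a sketch under assumptions: the encoders dict and the extra "smoker" column are hypothetical names added for illustration, and non-numeric columns are detected with pd.api.types.is_numeric_dtype instead of select_dtypes.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({
    "score": [0, 1],
    "gender": ["male", "female"],
    "smoker": ["yes", "no"],  # hypothetical extra column
})

encoders = {}  # hypothetical helper: column name -> fitted LabelEncoder
for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        continue  # leave numeric columns untouched
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

# Recover the original labels of one column from its integer codes.
genders = encoders["gender"].inverse_transform(df["gender"])
print(list(genders))
```

Keeping the fitted encoders around also lets you apply the same mapping to new data with transform, instead of refitting.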
Answer 1: (score: 0)
Here is a small demo:
In [36]: from sklearn.preprocessing import LabelEncoder
In [37]: le = LabelEncoder()
In [38]: X = df.apply(lambda c: c if np.issubdtype(df.dtypes.loc[c.name], np.number)
    ...:                       else le.fit_transform(c))
In [39]: X
Out[39]:
age sex bmi children smoker region charges
0 19 0 27.900 0 1 3 16884.92400
1 18 1 33.770 1 0 2 1725.55230
2 28 1 33.000 3 0 2 4449.46200
3 33 1 22.705 0 0 1 21984.47061
4 32 1 28.880 0 0 1 3866.85520
5 31 0 25.740 0 0 2 3756.62160
6 46 0 33.440 1 0 2 8240.58960
7 37 0 27.740 3 0 1 7281.50560
8 37 1 29.830 2 0 0 6406.41070
9 60 0 25.840 0 0 1 28923.13692
Source DF:
In [35]: df
Out[35]:
age sex bmi children smoker region charges
0 19 female 27.900 0 yes southwest 16884.92400
1 18 male 33.770 1 no southeast 1725.55230
2 28 male 33.000 3 no southeast 4449.46200
3 33 male 22.705 0 no northwest 21984.47061
4 32 male 28.880 0 no northwest 3866.85520
5 31 female 25.740 0 no southeast 3756.62160
6 46 female 33.440 1 no southeast 8240.58960
7 37 female 27.740 3 no northwest 7281.50560
8 37 male 29.830 2 no northeast 6406.41070
9 60 female 25.840 0 no northwest 28923.13692