Unorderable types: str() > float() error with KNN model

Asked: 2016-12-02 18:24:24

Tags: python-3.x machine-learning compiler-errors

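I have read a lot about this particular error but have not found an answer that solves my problem. I have a dataset that I have split into training and test sets, and I am trying to run a KNeighborsClassifier. My code is below. My problem is that when I look at the dtypes of my X_train, I don't see any string-typed columns at all. My y_train is a categorical variable. This is my first Stack Overflow post, so I apologize if I have overlooked any formalities, and thank you for your help! :)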

Error:

TypeError: unorderable types: str() > float()

Dtypes:

X_train.dtypes.value_counts()
Out[54]: 
int64      2035
float64     178
dtype: int64

Code:

# Import Packages 
import os
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt
from sklearn.dummy import DummyRegressor
from sklearn.cross_validation import train_test_split, KFold
from matplotlib.ticker import FormatStrFormatter
from sklearn import cross_validation
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
import pdb

# Set Directory Path 
path = "file_path"
os.chdir(path)

#Select Import File
data = 'RawData2.csv' 
delim = ','

#Import Data File
df = pd.read_csv(data, sep = delim)
print (df.head())

df.columns.get_loc('Categories')

#Model 

#Select/Update Features
X = df[df.columns[14:2215]]

#Get Column Index for Target Variable
df.columns.get_loc('Categories')

#Select Target and fill na's with "Small" label
y = df[df.columns[21]]
print(y.values)
y.fillna('Small')

#Training/Test Set
X_sample = X.loc[X.Var1 <1279]
X_valid = X.loc[X.Var1 > 1278]
y_sample = y.head(len(X_sample))
y_test = y.head(len(y)-len(X_sample))

X_train, X_test, y_train, y_test = train_test_split(X_sample, y_sample, test_size = 0.2)
cv = KFold(n = X_train.shape[0], n_folds = 5, random_state = 17)

print(X_train.shape, y_train.shape)
X_train.dtypes.value_counts()

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

knn = KNeighborsClassifier(n_neighbors = 5)
knn.fit(X_train, y_train)  # <-- This is where the error is flagged
accuracy_score(y_test, knn.predict(X_test))

1 Answer:

Answer 0 (score: 0)

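Everything in sklearn is built on numpy, and its estimators work with numeric data. Categorical X and y therefore need to be encoded as numbers. For X, you can use pandas' get_dummies. For y, you can use LabelEncoder.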

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
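
As a minimal sketch of the encoding suggested above, the following uses a small made-up DataFrame in place of RawData2.csv. The `Colour` column and all of the values are hypothetical; only `Var1`, `Categories` and the 'Small' fill label come from the question. Note that `fillna` returns a new Series rather than changing `y` in place, so the sketch assigns the result. Whether unfilled missing labels are what triggers the error in the question is an assumption, not something the post confirms.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the asker's RawData2.csv.
df = pd.DataFrame({
    "Var1": [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
    "Colour": ["red", "blue", "red", "blue", "red", "blue"],
    "Categories": ["Large", "Large", np.nan, "Small", np.nan, "Small"],
})

# fillna returns a new Series, so the result must be assigned.
y = df["Categories"].fillna("Small")

# One-hot encode the categorical feature; numeric columns pass through.
# dtype=float keeps every resulting column numeric.
X = pd.get_dummies(df[["Var1", "Colour"]], dtype=float)

# Map the string labels to integers 0..n_classes-1.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y_encoded)

# Convert integer predictions back to the original string labels.
predicted = encoder.inverse_transform(knn.predict(X))
print(predicted)
```

Keeping the fitted `encoder` around makes it possible to translate predictions back to readable labels with `inverse_transform`.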