Decision tree classifier does not accept categorical features

Asked: 2019-11-01 10:28:36

Tags: pandas scikit-learn decision-tree

I have a credit scoring dataset and need to classify whether a customer will default.

LIMIT_BAL  gender EDUCATION MARRIAGE    AGE SEP_STATUS  AUG_STATUS  JUL_STATUS  JUN_STATUS  MAY_STATUS  ... JUN_BAL MAY_BAL APR_BAL SEP_PAID    AUG_PAID    JUL_PAID    JUN_PAID    MAY_PAID    APR_PAID    default_0
0   20000   female  bachelor    married 24  2 mo    2 mo    paid    paid    no need to pay  ... 0   0   0   0   689 0   0   0   0   bad
1   90000   female  bachelor    single  34  using credit    using credit    using credit    using credit    using credit    ... 14331   14948   15549   1518    1500    1000    1000    1000    5000    good

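from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

dec_class = DecisionTreeClassifier(random_state=17)
y = df['default_0']   # target column: 'good' / 'bad'
x = df.iloc[:, :-1]   # every other column, including the string-valued ones

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=17)

# Note: this fits on the full data (x, y), not on X_train / y_train
dec_class.fit(x, y)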
could not convert string to float: 'female'

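I thought decision trees work well with both categorical and numerical features. I preprocessed the categorical features into words; before that they were all numeric codes. Why are the categorical features not accepted when they are given as words, for example gender: "male", "female"?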
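For context on the error: decision trees as an algorithm can split on categories, but scikit-learn's DecisionTreeClassifier expects a numeric feature matrix and tries to convert every column to float, which is where 'female' fails. One common workaround, sketched below, is to encode the string columns before the tree sees them. The toy DataFrame and the list of categorical columns are assumptions made for illustration, not the real dataset; one-hot encoding is only one possible choice (OrdinalEncoder is another).

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data that mimics a few columns from the question.
df = pd.DataFrame({
    "LIMIT_BAL": [20000, 90000, 50000, 120000],
    "gender": ["female", "female", "male", "male"],
    "MARRIAGE": ["married", "single", "married", "single"],
    "AGE": [24, 34, 37, 29],
    "default_0": ["bad", "good", "good", "bad"],
})

y = df["default_0"]
x = df.drop(columns="default_0")

# Assumption: these are the string-valued columns; list them explicitly.
categorical_cols = ["gender", "MARRIAGE"]

# One-hot encode the string columns and pass numeric columns through unchanged.
preprocess = ColumnTransformer(
    transformers=[
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    remainder="passthrough",
)

# Chaining the encoder and the tree keeps the encoding fitted on training data only.
model = Pipeline([
    ("preprocess", preprocess),
    ("tree", DecisionTreeClassifier(random_state=17)),
])

model.fit(x, y)
predictions = model.predict(x)
```

With a pipeline like this, the same fitted encoder is reused at prediction time, so X_test from train_test_split could be passed to model.predict directly. The string labels in the target column ('good' / 'bad') do not need encoding.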

0 answers:

No answers yet.