Error when trying to create a balanced training set with SMOTE

Asked: 2018-10-09 06:43:31

Tags: r classification training-data

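I am trying to generate new data to obtain a balanced training set for decision tree classification. Whenever I call the SMOTE function, I get the same error: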

  

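Error in names(dn) <- dnn : attempt to set an attribute on NULL
In addition: Warning message:
In names(data) == as.character(form[[2]]) :
  longer object length is not a multiple of shorter object length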

I converted every column to a factor with as.factor() and removed the rows containing NA:

train <- na.omit(train)



> str(train)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   11526 obs. of  5 variables:
 $ number: Factor w/ 2 levels "problem",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Land: Factor w/ 29 levels "Australien","Belgien",..: 3 3 3 3 3 3 3 3 9 3 ...
 $ direction: Factor w/ 2 levels "LL","RL": 1 1 1 1 1 1 1 1 2 1 ...
 $ transmission: Factor w/ 2 levels "AUT","SCH": 1 1 1 1 1 1 1 1 1 1 ...
 $ range: Factor w/ 4 levels "1","2","3","4": 3 3 3 2 1 3 2 4 3 2 ...
 - attr(*, "na.action")= 'omit' Named int  6500 9748
  ..- attr(*, "names")= chr  "6500" "9748"

The head of my training set looks like this:

> head(train,10)
     number          Land           direction  transmission       range 
1  reference Bundesrep. Deutschland      LL         AUT             3
2  reference Bundesrep. Deutschland      LL         AUT             3
3  reference Bundesrep. Deutschland      LL         AUT             3
4  reference Bundesrep. Deutschland      LL         AUT             2
5  reference Bundesrep. Deutschland      LL         AUT             1
6  reference Bundesrep. Deutschland      LL         AUT             3
7  reference Bundesrep. Deutschland      LL         AUT             2
8  problem                   Taiwan      LL         AUT             3
9  reference Bundesrep. Deutschland      LL         AUT             4
10 reference        Grossbritannien      RL         AUT             3
11 reference Bundesrep. Deutschland      LL         SCH             2

Here is my code:

library(DMwR)
smote_train <- SMOTE(train$number ~ ., data = train, perc.over = 500, k = 5)
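The warning text mentions `names(data) == as.character(form[[2]])`, which suggests that SMOTE looks up the target column by comparing the column names with the left-hand side of the formula. The sketch below, on a small made-up data frame (`train_df` and its values are hypothetical stand-ins for the real data), shows what that comparison yields for `train$number ~ .` versus `number ~ .`. It also converts the data to a plain data.frame, on the assumption that the `tbl_df` class shown in the `str()` output may not be handled by the package. The final SMOTE call only runs if DMwR is installed, and is a guess at a fix rather than a confirmed solution.

```r
# Hypothetical toy data standing in for the real training set.
# as.data.frame() drops any tibble classes (assumption: DMwR expects a
# plain data.frame, where data[, i] returns a vector).
train_df <- as.data.frame(data.frame(
  number    = factor(c(rep("reference", 40), rep("problem", 8))),
  direction = factor(rep(c("LL", "RL"), 24)),
  range     = factor(rep(1:4, 12))
))

# Building a formula does not evaluate it, so both lines are safe to run.
bad  <- train_df$number ~ .
good <- number ~ .

# The left-hand side of `bad` is a call to `$`, so it deparses into
# several strings instead of a single column name.
print(as.character(bad[[2]]))
print(as.character(good[[2]]))

# Looking up the target column the way the warning message suggests.
# With a different number of columns R would also warn about lengths.
bad_tgt  <- suppressWarnings(which(names(train_df) == as.character(bad[[2]])))
good_tgt <- which(names(train_df) == as.character(good[[2]]))
print(bad_tgt)   # no column matches
print(good_tgt)  # position of the `number` column

# Possible fix: name the column directly and pass a plain data.frame.
if (requireNamespace("DMwR", quietly = TRUE)) {
  smote_train <- DMwR::SMOTE(number ~ ., data = train_df,
                             perc.over = 500, k = 5)
  print(table(smote_train$number))
}
```

If the lookup returns no column, any later step that tabulates the target has nothing to work on, which would be consistent with the `names(dn) <- dnn` error above.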

0 answers:

No answers yet