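I am trying to generate new data to obtain a balanced training set for decision tree classification. Whenever I use the SMOTE function, I get the same error: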
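Error in names(dn) <- dnn : attempt to set an attribute on NULL
In addition: Warning message:
In names(data) == as.character(form[[2]]) :
  longer object length is not a multiple of shorter object length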
I converted all columns to factors with as.factor() and removed the NAs:
train <- na.omit(train)
> str(train)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 11526 obs. of 5 variables:
$ number: Factor w/ 2 levels "problem",..: 2 2 2 2 2 2 2 2 2 2 ...
$ Land: Factor w/ 29 levels "Australien","Belgien",..: 3 3 3 3 3 3 3 3 9 3 ...
$ direction: Factor w/ 2 levels "LL","RL": 1 1 1 1 1 1 1 1 2 1 ...
$ transmission: Factor w/ 2 levels "AUT","SCH": 1 1 1 1 1 1 1 1 1 1 ...
$ range: Factor w/ 4 levels "1","2","3","4": 3 3 3 2 1 3 2 4 3 2 ...
- attr(*, "na.action")= 'omit' Named int 6500 9748
..- attr(*, "names")= chr "6500" "9748"
The head of my train data looks like this:
> head(train,10)
number Land direction transmission range
1 reference Bundesrep. Deutschland LL AUT 3
2 reference Bundesrep. Deutschland LL AUT 3
3 reference Bundesrep. Deutschland LL AUT 3
4 reference Bundesrep. Deutschland LL AUT 2
5 reference Bundesrep. Deutschland LL AUT 1
6 reference Bundesrep. Deutschland LL AUT 3
7 reference Bundesrep. Deutschland LL AUT 2
8 problem Taiwan LL AUT 3
9 reference Bundesrep. Deutschland LL AUT 4
10 reference Grossbritannien RL AUT 3
11 reference Bundesrep. Deutschland LL SCH 2
This is my code:
library(DMwR)
smote_train <- SMOTE(train$number ~ ., data = train, perc.over=500, k =5)