I am using the CRAN package DMwR to handle an imbalanced-data problem. My code is as follows:
require(DMwR)
dm = read.table("C:/data/exampleData.txt", sep=",")
ncols<-ncol(dm)
dm<-cbind(dm[2:ncols],dm[1])
dmSmote<-SMOTE(target ~ . , dm,k=5,perc.over = 1400,perc.under=140)
dm<-cbind(dmSmote[ncols],dmSmote[1:ncols-1])
Data:
5.901487,5.176487,1
6.917943,3.979710,0
5.247007,3.628324,1
5.157673,6.212658,0
4.836749,3.978392,0
4.590970,5.547353,0
3.895904,5.350865,0
4.312977,3.853151,0
5.844978,5.450767,0
4.009195,5.108031,0
Column 1 = variable 1, column 2 = variable 2, column 3 = class
I get the following error: attempt to set an attribute on NULL
Package documentation: http://cran.fhcrc.org/web/packages/DMwR/DMwR.pdf
What am I doing wrong?
Answer 0 (score: 4)
The class variable (target in the code) needs to be a factor.
require(DMwR)
## data
dm = structure(
c(5.901487, 6.917943, 5.247007, 5.157673, 4.836749,
4.59097, 3.895904, 4.312977, 5.844978, 4.009195, 5.176487, 3.97971,
3.628324, 6.212658, 3.978392, 5.547353, 5.350865, 3.853151, 5.450767,
5.108031, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0),
.Dim = c(10L, 3L),
.Dimnames = list(NULL, NULL))
dm = data.frame(dm)
## column names
colnames(dm) = c("var1", "var2", "target")
## you must convert the classifier variable to a factor
dm$target = factor(dm$target)
## SMOTE algorithm
dmSmote <- SMOTE(target ~ ., data = dm, k = 5,perc.over = 1400, perc.under = 140)
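If the data comes from the text file as in the question, the same preparation can be applied straight after read.table. The following is only a sketch: the inline text stands in for C:/data/exampleData.txt so that it runs without the file, and the names var1 and var2 are taken from the answer above, not from the original file. Note that read.table without a header names the columns V1, V2, V3, so a formula that mentions target has no column to match until the columns are renamed.

```r
# Stand-in for read.table("C:/data/exampleData.txt", sep = ","):
# the same ten rows, supplied inline so the sketch runs without the file.
raw <- "5.901487,5.176487,1
6.917943,3.979710,0
5.247007,3.628324,1
5.157673,6.212658,0
4.836749,3.978392,0
4.590970,5.547353,0
3.895904,5.350865,0
4.312977,3.853151,0
5.844978,5.450767,0
4.009195,5.108031,0"
dm <- read.table(text = raw, sep = ",")

# No header row, so the default column names are V1, V2, V3.
print(names(dm))

# Give the columns the names the formula will refer to.
names(dm) <- c("var1", "var2", "target")

# The class column is read as integer; convert it to a factor.
print(class(dm$target))
dm$target <- factor(dm$target)

# Inspect the structure and the class balance before oversampling.
str(dm)
print(table(dm$target))
```

After this step, dm has the same shape as the data frame built in the answer, and the SMOTE call shown there can be used unchanged.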
Calling debug() on the relevant function is a good starting point for diagnosing the error.
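As a rough illustration of that workflow, the sketch below flags a function for step-through debugging and then removes the flag. minority_level is a hypothetical helper written only for this example; with DMwR loaded, the same pattern would be debug(SMOTE), followed by the failing call, followed by undebug(SMOTE).

```r
# Hypothetical helper, defined only so there is something to step through:
# it returns the label of the least frequent class.
minority_level <- function(y) {
  counts <- table(y)
  names(counts)[which.min(counts)]
}

# Flag the function: the next call will open the browser at its first line.
debug(minority_level)
print(isdebugged(minority_level))

if (interactive()) {
  # In the browser: n runs the next line, c continues, Q quits.
  # Typing a variable name (e.g. counts) prints its current value.
  minority_level(c(1, 0, 1, 0, 0, 0, 0, 0, 0, 0))
}

# Remove the flag once the problem has been located.
undebug(minority_level)
```

Stepping line by line makes it possible to see which intermediate object is NULL at the moment the error is raised.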