Package imbalance: error "Some class attributes not found in dataset"

Time: 2018-12-20 10:28:56

Tags: r

I want to balance my data by oversampling with the imbalance package. When I run this code, it gives me an error:

new_train <- oversample(train, method = "ADASYN")

Error message:

Error in checkDatasetClass(dataset, classAttr) : Some class attributes not found in dataset

My data looks like this:

> head(train)
    case           country   steering     type           group 
1  bad              Europe      LL         AUT             3
2 good              Europe      LL         AUT             2
3 good              Europe      LL         AUT             2
4 good              Europe      LL         SCH             2
5 good              Europe      RL         AUT             2
6 good              Europe      LL         AUT             1

> str(train)
'data.frame':   11479 obs. of  5 variables:
 $ case : Factor w/ 2 levels "bad",..: 1 2 2 2 2 2 2 2 2 2 ...
 $ country: Factor w/ 9 levels "Africa","LatinAmerica",..: 6 6 6 6 6 6 6 6 6 6 ...
 $ steering: Factor w/ 2 levels "LL","RL": 1 1 1 1 2 1 2 1 1 1 ...
 $ type: Factor w/ 2 levels "AUT","SCH": 1 1 1 2 1 1 1 1 1 1 ...
 $ group: Factor w/ 3 levels "1","2","3": 3 2 2 2 2 1 2 3 3 2 ...

I have already removed the NAs with:

which(is.na(train))
train <- na.omit(train)

1 Answer:

Answer 0: (score: 0)

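Specify the target variable with the classAttr argument, so that oversample() knows which column holds the class labels. For example: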

serhat_simsek <- oversample(train, method = "ADASYN", classAttr = "group")
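To see why naming the column matters, here is a minimal base-R sketch. It assumes that oversample() falls back to a column called "Class" when classAttr is not given; that is my reading of the package defaults, not something the question confirms. The helper has_class_attr() is hypothetical and only mimics the idea of the check; it is not the package's own code. The toy data frame reuses the question's column names with made-up values.

```r
# Toy stand-in for the question's `train` data frame (values are made up).
train <- data.frame(
  case     = factor(c("bad", "good", "good", "good")),
  steering = factor(c("LL", "LL", "RL", "LL")),
  group    = factor(c("3", "2", "2", "1"))
)

# Hypothetical helper: TRUE when the named class column exists in the data.
# The default "Class" is an assumption about what oversample() looks for.
has_class_attr <- function(dataset, classAttr = "Class") {
  classAttr %in% names(dataset)
}

default_ok <- has_class_attr(train)           # looks for a column named "Class"
group_ok   <- has_class_attr(train, "group")  # looks for the column "group"

print(default_ok)
print(group_ok)

# Class counts of the chosen column, worth inspecting before oversampling.
group_counts <- table(train$group)
print(group_counts)
```

If the first check fails and the second succeeds, passing classAttr = "group" (or whichever column actually holds the labels you want to balance) should get past this particular error.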