I am defining a new preprocessing wrapper for outlier detection, following the example given in the mlr tutorial: [https://mlr.mlr-org.com/articles/tutorial/preproc.html#preprocessing-with-makepreprocwrappercaret][1]
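Specifically, I want to integrate outlier detection based on the median absolute deviation (MAD), which is a robust measure of outlyingness. I have written a function DoubleMADsFromMedian(), which is used in a for loop to identify the outliers in each feature column and set the identified outlier cells to NA.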
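For reference, a double-MAD distance of the kind described above can be sketched as follows. This is a hypothetical stand-in, not the actual DoubleMADsFromMedian(), whose body is not shown here; the name `double_mad_distance` and the example vector are assumptions, and the cutoff of 4 is the `crit` value used in the wrapper call.

```r
# Hypothetical stand-in for DoubleMADsFromMedian(); the real function is not shown.
double_mad_distance <- function(x) {
  m <- median(x, na.rm = TRUE)
  abs_dev <- abs(x - m)
  # Separate MADs for the values at or below and at or above the median
  left_mad <- median(abs_dev[x <= m], na.rm = TRUE)
  right_mad <- median(abs_dev[x >= m], na.rm = TRUE)
  # Use the MAD that matches the side each value falls on
  side_mad <- ifelse(x > m, right_mad, left_mad)
  dist <- abs_dev / side_mad
  # Values equal to the median get distance 0 (avoids 0/0 when a MAD is zero)
  dist[which(x == m)] <- 0
  dist
}

x <- c(1, 2, 3, 4, 5, 6, 100)
d <- double_mad_distance(x)
x[which(d > 4)] <- NA  # set cells more than 4 MADs from the median to NA
print(x)
```

Using two MADs instead of one lets the cutoff adapt to skewed columns, where the spread above and below the median differs.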
I started with the following train and predict functions:
trainfun = function(data, target, args = crit) {
  for (element in data){
    vec_temp <- as.numeric(data$element)
    outlier <- DoubleMADsFromMedian(vec_temp)>crit
    outlier <- as.data.frame(outlier)
    df <- cbind(vec_temp,outlier)
    df <- df %>% mutate(vec_temp = replace(vec_temp, outlier==TRUE, NA)) %>% data.frame()%>% select(.data$vec_temp )
  }
  # Store the outlier parameter in control
  # These are needed to preprocess the data before prediction
  control = args
  if (is.logical(control$crit) && control$crit)
    control$crit = attr(x, "outlier:crit")
  data = as.data.frame(df)
  return(list(data = data, control = control))
}
predictfun = function(data, target, args, control) {
  data = crit(data, outlier = control$crit)
  data = as.data.frame(data)
  return(data)
}
After that, I defined the preprocessing wrapper like this:
lrn = makePreprocWrapper(lrn, train = trainfun, predict = predictfun, par.set=makeParamSet(makeIntegerParam("crit")), par.vals =list("crit"=4))
However, I get the following error:
Error in .learner$train(data = getTaskData(.task, .subset, functionals.as = "matrix"), :
object 'crit' not found
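Since I do not want to tune the MAD-based outlier criterion, I am wondering how to define the "args" and "control" arguments of the train/predict functions in the mlr package. Also, I suspect there are several mistakes in my train and predict functions?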
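One possible direction, offered as a sketch rather than a definitive fix: mlr passes the wrapper's `par.vals` to the train function as the list `args`, so the criterion would be read as `args$crit` instead of the free variable `crit`, which may be what triggers the "object 'crit' not found" error. Whatever the train function returns as `control` is handed to the predict function. The helpers `col_stats` and `mask_outliers`, the toy data, and the choice to store per-column medians and MADs in `control` are all assumptions for illustration; the double-MAD logic is a stand-in for DoubleMADsFromMedian().

```r
# Hypothetical helper: per-column median plus left/right MAD, learned on training data
col_stats <- function(x) {
  m <- median(x, na.rm = TRUE)
  dev <- abs(x - m)
  c(center = m,
    left = median(dev[x <= m], na.rm = TRUE),
    right = median(dev[x >= m], na.rm = TRUE))
}

# Hypothetical helper: set cells whose double-MAD distance exceeds crit to NA
mask_outliers <- function(x, stats, crit) {
  scale <- ifelse(x > stats[["center"]], stats[["right"]], stats[["left"]])
  dist <- abs(x - stats[["center"]]) / scale
  dist[which(x == stats[["center"]])] <- 0
  x[which(dist > crit)] <- NA
  x
}

trainfun <- function(data, target, args) {
  # Numeric feature columns only; the target column is left untouched
  cns <- setdiff(colnames(data)[vapply(data, is.numeric, logical(1))], target)
  stats <- lapply(data[cns], col_stats)
  for (cn in cns) {
    data[[cn]] <- mask_outliers(data[[cn]], stats[[cn]], args$crit)
  }
  # control carries what predictfun needs to treat new data the same way
  list(data = data, control = list(stats = stats))
}

predictfun <- function(data, target, args, control) {
  for (cn in names(control$stats)) {
    data[[cn]] <- mask_outliers(data[[cn]], control$stats[[cn]], args$crit)
  }
  data
}

# Direct check of both functions on toy data, without mlr
train_df <- data.frame(x1 = c(1, 2, 3, 4, 5, 6, 100), y = c(1, 2, 3, 4, 5, 6, 7))
res <- trainfun(train_df, target = "y", args = list(crit = 4))
pred <- predictfun(data.frame(x1 = c(3, 50)), target = "y",
                   args = list(crit = 4), control = res$control)

# Wrapper construction, only attempted if mlr is installed
if (requireNamespace("mlr", quietly = TRUE) &&
    requireNamespace("ParamHelpers", quietly = TRUE)) {
  lrn <- mlr::makePreprocWrapper(
    mlr::makeLearner("regr.lm"),
    train = trainfun, predict = predictfun,
    par.set = ParamHelpers::makeParamSet(ParamHelpers::makeNumericParam("crit")),
    par.vals = list(crit = 4)
  )
}
```

Because `crit` is fixed through `par.vals` and never placed in a tuning grid, it stays at 4 without being tuned. Since the preprocessing introduces NA values, the wrapped learner would probably also need missing-value handling, for example through mlr's `makeImputeWrapper`.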