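I am creating custom learners; specifically, I am trying to use h2o machine learning algorithms within the mlr framework. The 'hidden' argument of the h2o.deeplearning function is an integer vector that I want to tune. I defined the 'hidden' parameter as follows: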
makeRLearner.classif.h2o_dl = function() {
  makeRLearnerClassif(
    cl = "classif.h2o_dl",
    package = "h2o",
    par.set = makeParamSet(
      makeDiscreteLearnerParam(id = "activation",
        values = c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")),
      makeNumericLearnerParam(id = "epochs", default = 10, lower = 1),
      makeNumericLearnerParam(id = "rate", default = 0.005, lower = 0, upper = 1),
      makeIntegerVectorLearnerParam(id = "hidden", default = c(100, 100)),
      makeDiscreteLearnerParam(id = "loss",
        values = c("Automatic", "CrossEntropy", "Quadratic", "Absolute", "Huber"))
    ),
    properties = c("twoclass", "multiclass", "numerics", "factors", "prob", "missings"),
    name = "Deep Learning Neural Network with h2o",
    short.name = "h2o_deeplearning_classif",
    note = "tbd"
  )
}
trainLearner.classif.h2o_dl = function(.learner, .task, .subset, .weights = NULL, ...) {
  f = getTaskFormula(.task)
  data = getTaskData(.task, .subset)
  data_h2o <- as.h2o(data,
    destination_frame = paste0("train_", format(Sys.time(), "%m%d%y_%H%M%S")))
  h2o::h2o.deeplearning(
    x = getTaskFeatureNames(.task),
    y = setdiff(names(getTaskData(.task)), getTaskFeatureNames(.task)),
    training_frame = data_h2o, ...)
}
predictLearner.classif.h2o_dl = function(.learner, .model, .newdata, predict.method = "plug-in", ...) {
  data <- as.h2o(.newdata,
    destination_frame = paste0("pred_", format(Sys.time(), "%m%d%y_%H%M%S")))
  p = predict(.model$learner.model, newdata = data, method = predict.method, ...)
  if (.learner$predict.type == "response") {
    return(as.data.frame(p)[, 1])
  } else {
    return(as.matrix(as.numeric(p))[, -1])
  }
}
I tried to tune the 'hidden' parameter with a grid search, using the makeDiscreteParam function:
library(mlr)
library(h2o)
h2o.init()
lrn.h2o <- makeLearner("classif.h2o_dl")
n <- getTaskSize(sonar.task)
train.set = seq(1, n, by = 2)
test.set = seq(2, n, by = 2)
mod.h2o = train(lrn.h2o, sonar.task, subset = train.set)
pred.h2o <- predict(mod.h2o, task = sonar.task, subset = train.set)
ctrl = makeTuneControlGrid()
rdesc = makeResampleDesc("CV", iters = 3L)
ps = makeParamSet(
  makeDiscreteParam("hidden", values = list(c(10, 10), c(100, 100))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)
res = tuneParams("classif.h2o_dl", task = sonar.task, resampling = rdesc, par.set = ps, control = ctrl)
This results in the following warning messages:
Warning messages:
1: In checkValuesForDiscreteParam(id, values) :
number of items to replace is not a multiple of replacement length
2: In checkValuesForDiscreteParam(id, values) :
number of items to replace is not a multiple of replacement length
and ps looks like this:
ps
           Type len Def  Constr Req Tunable Trafo
hidden discrete   -   -  10,100   -    TRUE     -
rate   discrete   -   - 0.1,0.5   -    TRUE     -
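so the 'hidden' parameter is not tuned as a vector. I also tried other special constructor functions (e.g. makeNumericVectorParam), but they did not work either.
Does anyone have experience tuning (integer) vectors in mlr and could give me a hint?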
Answer 0 (score: 1)
To tune the 'hidden' parameter, use the following code in the grid:
makeDiscreteParam(id = "hidden", values = list(a = c(10,10), b = c(100,100)))
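As a quick sanity check, the named-list approach can be inspected without starting h2o. This is a minimal sketch, assuming the ParamHelpers package (on which mlr depends) is installed; the names a and b are arbitrary labels, and hidden_values is a variable introduced only for this illustration.

```r
library(ParamHelpers)

# Each list element is one complete 'hidden' configuration. The names
# ("a", "b") are arbitrary labels that identify the discrete choices.
ps <- makeParamSet(
  makeDiscreteParam("hidden", values = list(a = c(10, 10), b = c(100, 100))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

print(ps)

# getValues() returns the stored values of each discrete parameter,
# so the full vectors can be inspected here.
hidden_values <- getValues(ps)$hidden
str(hidden_values)
```

If the full vectors show up under the names a and b, the grid search will pass each vector to the learner as a whole.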
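Answer 1 (score: 0)
The warning messages, and the failure to construct a correct ParamSet, occur because ParamHelpers tries to add names to the list of values, which fails when the values are vectors. The other answer avoids this by naming the list elements explicitly, which is why it works.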
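The mechanism can be reproduced in base R. The sketch below is a simplified imitation of the name-guessing step, not the actual ParamHelpers source; the variable names values and ns are chosen for illustration only.

```r
# Unnamed list of vector-valued choices, as in the question.
values <- list(c(10, 10), c(100, 100))

# Try to derive one name per list element from its value.
ns <- rep("", length(values))
for (i in seq_along(values)) {
  # as.character() yields two strings here, but ns[i] has room for only one.
  # R warns "number of items to replace is not a multiple of replacement
  # length" and keeps just the first string.
  ns[i] <- as.character(values[[i]])
}

print(ns)
```

Only the first element of each vector survives as the name, which matches the constraint column 10,100 in the printed ParamSet of the question.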
However, when you want to tune a vector of integer values, the recommended approach is to use makeIntegerVectorParam:
ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 10, upper = 100),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)
This will try not only c(10, 10) and c(100, 100), but also values such as c(10, 100).
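To see which combinations a grid would contain, the design can be generated directly. This is a sketch assuming the ParamHelpers package is installed; resolution = 3 is an illustrative choice, not a value from the original post.

```r
library(ParamHelpers)

ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 10, upper = 100),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

# Build the grid with three points per integer component (illustrative).
# Each component of 'hidden' gets its own column in the design.
grid <- generateGridDesign(ps, resolution = 3)

head(grid)
nrow(grid)
```

The same resolution can be passed to makeTuneControlGrid(resolution = ...) to control how fine the grid search is.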
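In fact, this also considers all values between 10 and 100 (e.g. c(30, 80)), so you may want to use a transformation to reduce the search space somewhat. Example: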
ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 2, upper = 4,
    trafo = function(x) round(10 ^ (x / 2))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)
For the hidden layers, this allows any combination of the values 10 (= 10^1), 32 (= 10^1.5, rounded) and 100 (= 10^2).
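The effect of the transformation can be checked in base R, without mlr or h2o. In this sketch the names trafo, raw, sizes and combos are introduced for illustration only.

```r
# Same transformation as in the parameter set above.
trafo <- function(x) round(10^(x / 2))

# Untransformed integer values allowed by lower = 2, upper = 4.
raw <- 2:4

# Layer sizes the learner actually receives.
sizes <- trafo(raw)
print(sizes)

# All combinations for a two-layer network.
combos <- expand.grid(layer1 = sizes, layer2 = sizes)
print(combos)
```

The search space for 'hidden' therefore shrinks to nine combinations, spaced evenly on a logarithmic scale.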