I am running the same code 50 times and want to average the output.
The code being replicated:
output <- gbm.step(data = data.sample,
                   gbm.x = 2:9,
                   gbm.y = 1,
                   family = "poisson",
                   tree.complexity = 3,
                   learning.rate = 0.0002,
                   bag.fraction = 0.6)
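I need to work out which values of the meta-parameters (tree.complexity, learning.rate and bag.fraction) give the best model. There is one response variable and 8 predictor variables.
The output of a single replicate looks like this: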
'fitting final gbm model with a fixed number of 850 trees for Freq
mean total deviance = 292.371
mean residual deviance = 214.589
estimated cv deviance = 264.341 ; se = 53.483
training data correlation = 0.568
cv correlation = 0.565 ; se = 0.053'
I want the mean of the estimated cv deviance across the 50 iterations.
I am very new to R, so any help would be much appreciated!
Answer 0 (score: 1)
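You can define a function f that wraps the model fit and run it 50 times with replicate. Then extract the cv deviance from each run and take the mean, as follows: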
library(dismo)  # provides gbm.step

# Fit one model on data frame d and return the fitted object
f <- function(d) {
  output <- gbm.step(data = d,
                     gbm.x = 2:9,
                     gbm.y = 1,
                     family = "poisson",
                     tree.complexity = 3,
                     learning.rate = 0.0002,
                     bag.fraction = 0.6)
  return(output)
}
# Use simplify = FALSE to get the result in a list,
# rather than coerced to an array
v <- replicate(50, f(data.sample), simplify = FALSE)
# Gather all deviance means in v in a vector
deviances <- sapply(v, function(x) x$cv.statistics$deviance.mean)
# Finally take the mean of the deviances
dev.mean <- mean(deviances)
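
Since the question is really about comparing meta-parameter settings, the same idea can be extended to a small grid. The following is only a sketch, not a tested tuning procedure: mean_cv_deviance, tune_grid and fit_one are hypothetical helper names, the candidate values in the grid are arbitrary placeholders, and it assumes that a failed gbm.step fit comes back as NULL, so those runs are dropped before averaging.

```r
# Hypothetical helper: call fit_fun() n_rep times and summarise the CV deviance.
# fit_fun is assumed to return a gbm.step-style object (a list containing
# $cv.statistics$deviance.mean) or NULL when the fit fails.
mean_cv_deviance <- function(fit_fun, n_rep = 50) {
  fits <- replicate(n_rep, fit_fun(), simplify = FALSE)
  fits <- Filter(Negate(is.null), fits)  # drop failed runs
  dev <- vapply(fits, function(x) x$cv.statistics$deviance.mean, numeric(1))
  c(mean = mean(dev), sd = sd(dev), n.ok = length(dev))
}

# Hypothetical helper: evaluate every row of a parameter grid.
# fit_one(p) receives one row of the grid as a one-row data frame.
tune_grid <- function(grid, fit_one, n_rep = 50) {
  res <- lapply(seq_len(nrow(grid)), function(i) {
    mean_cv_deviance(function() fit_one(grid[i, ]), n_rep)
  })
  out <- cbind(grid, do.call(rbind, res))
  out[order(out$mean), ]  # lowest mean CV deviance first
}

# Candidate values below are illustrative assumptions, not recommendations.
grid <- expand.grid(tree.complexity = c(1, 3, 5),
                    learning.rate   = c(0.0002, 0.001),
                    bag.fraction    = c(0.5, 0.6, 0.75))

# One fit for one parameter combination, using the columns from the question.
fit_one <- function(p) {
  dismo::gbm.step(data = data.sample, gbm.x = 2:9, gbm.y = 1,
                  family = "poisson",
                  tree.complexity = p$tree.complexity,
                  learning.rate   = p$learning.rate,
                  bag.fraction    = p$bag.fraction,
                  plot.main = FALSE, silent = TRUE)
}

# Only runs when dismo is installed and data.sample exists in the workspace.
if (requireNamespace("dismo", quietly = TRUE) && exists("data.sample")) {
  results <- tune_grid(grid, fit_one, n_rep = 50)
  print(head(results))
}
```

With 18 combinations and 50 replicates each this is 900 cross-validated fits, so it may be worth starting with a smaller n_rep. The sd and n.ok columns give a rough idea of how stable each average is and how many runs actually produced a model.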