Averaging output in R

Time: 2014-07-07 09:33:33

Tags: r

I am running the same code 50 times, and then I want to average the output.

The code being replicated:

output <- gbm.step(data = data.sample,
                   gbm.x = 2:9,
                   gbm.y = 1,
                   family = "poisson",
                   tree.complexity = 3,
                   learning.rate = 0.0002,
                   bag.fraction = 0.6)

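I need to work out which values of the meta-parameters (tree.complexity, learning.rate and bag.fraction) give the best model. There is one response variable and 8 predictor variables.

The output of a single replicate looks like this: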

'fitting final gbm model with a fixed number of  850  trees for  Freq 

mean total deviance = 292.371 
mean residual deviance = 214.589 

estimated cv deviance = 264.341 ; se = 53.483 

training data correlation = 0.568 
cv correlation =  0.565 ; se = 0.053' 

I would like the mean of the estimated cv deviance across the 50 iterations.

I am very new to R, so any help would be much appreciated!

1 answer:

Answer 0: (score: 1)

You can define a function f and run it 50 times with replicate. Then extract the CV deviance from each run and take the mean, as follows:

# gbm.step() comes from the dismo package
library(dismo)

# Fit one boosted regression tree model to data frame d
f <- function(d) {
    gbm.step(data = d,
        gbm.x = 2:9,
        gbm.y = 1,
        family = "poisson",
        tree.complexity = 3,
        learning.rate = 0.0002,
        bag.fraction = 0.6)
}

# Use simplify = FALSE to get the result in a list, 
# rather than coerced to an array
v <- replicate(50, f(data.sample), simplify = FALSE) 
# Gather all deviance means in v in a vector
deviances <- sapply(v, function(x) x$cv.statistics$deviance.mean)
# Finally take the mean of the deviances
dev.mean <- mean(deviances)
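
Since the question is really about comparing settings of tree.complexity, learning.rate and bag.fraction, the same idea can be repeated over a grid of values. The following is only a rough sketch: fake_fit is a made-up stand-in for gbm.step() so that the code runs without data, mean_cv_deviance and the grid values are hypothetical, and the NULL filter assumes that a run which does not produce a model returns NULL.

```r
# Hypothetical stand-in for gbm.step(): returns a list with the same
# cv.statistics$deviance.mean element used above. The numbers are made up.
fake_fit <- function(tree.complexity, learning.rate, bag.fraction) {
  list(cv.statistics = list(
    deviance.mean = rnorm(1, mean = 260 + tree.complexity, sd = 5)))
}

# Run fit_fun n times with one parameter setting and summarise the CV deviance.
mean_cv_deviance <- function(fit_fun, params, n = 50) {
  fits <- lapply(seq_len(n), function(i) do.call(fit_fun, params))
  # Assumption: a run that produced no model returned NULL, so drop those
  fits <- Filter(Negate(is.null), fits)
  devs <- vapply(fits, function(x) x$cv.statistics$deviance.mean, numeric(1))
  c(mean = mean(devs), sd = sd(devs), runs = length(devs))
}

# Illustrative grid of settings (values chosen arbitrarily)
grid <- expand.grid(tree.complexity = c(1, 3, 5),
                    learning.rate   = c(0.0002, 0.001),
                    bag.fraction    = c(0.5, 0.6))

set.seed(1)
res <- lapply(seq_len(nrow(grid)), function(i) {
  mean_cv_deviance(fake_fit, as.list(grid[i, ]), n = 5)
})

# One row per setting: the parameters plus mean, sd and number of runs
summary_table <- cbind(grid, do.call(rbind, res))

# Setting with the lowest average CV deviance
best <- summary_table[which.min(summary_table$mean), ]
```

To use this with real data, fake_fit would be replaced by a small wrapper that passes the three parameters on to gbm.step() together with data = data.sample, gbm.x = 2:9, gbm.y = 1 and family = "poisson". Bear in mind that 50 runs for every row of the grid can take a long time, and that the sd column helps judge whether differences between settings are larger than the run-to-run noise.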