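In most cases, the default values of function arguments are given in the documentation. In some cases, however, the defaults are computed from other arguments (including the data itself), so they cannot be stated in the documentation.
For example, how can I find the default lambda grid used by the function glmnet in the glmnet package?
According to the documentation, the default lambda is computed from nlambda, which defaults to 100, and lambda.min.ratio, which appears to be a data-derived value.
When I run this function on a given data set, I would like to know the values of lambda that it used. This is particularly useful with cv.glmnet, because I want to know which lambda values are chosen when I do not supply them.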
Example input:
library(glmnet)
set.seed(1)
x=rnorm(100)
eps=rnorm(100)
y = 1 + x + x^2 + x^3 + eps
xmat=model.matrix(y~poly(x,10,raw=TRUE),data=data.frame(x=x))
cv.out=cv.glmnet(xmat, y,alpha=0) # What is the lambda used here?
bestlam=cv.out$lambda.min
print(bestlam)
# When a grid is specified, the result is very different and sometimes worse.
grid=10^seq(10,-2,length=100)
cv.out=cv.glmnet(xmat, y,alpha=0, lambda=grid)
bestlam=cv.out$lambda.min
print(bestlam)
Example output (note that the two values are very different):
0.3619167
0.04037017
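Answer 0 (score: 4)
If a default depends on the values of other arguments, I see no way to find it at call time other than stepping into the function in debug mode. You can use debugonce, for example: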
> debugonce(cv.glmnet)
>
> cv.out=cv.glmnet(xmat, y,alpha=0) # What is the lambda used here?
debugging in: cv.glmnet(xmat, y, alpha = 0)
[...]
Browse[2]> ls()
# [1] "foldid" "grouped" "keep" "lambda" "nfolds" "offset"
# [7] "parallel" "type.measure" "weights" "x" "y"
Browse[2]> lambda
NULL
Browse[2]> c
>
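So for the first call, lambda is NULL on entry to the function. If you repeat this approach for the second call to cv.glmnet, you will find that in that case lambda is a numeric vector of length 100.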
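To make the difference between the two calls concrete, here is a minimal sketch in base R. The function toy_fit and its grid formula are hypothetical stand-ins invented for illustration, not glmnet's actual computation; the sketch only shows why a debugger reports NULL on entry even though a grid is used later.

```r
# Hypothetical stand-in for a fitting function; NOT glmnet's real algorithm.
toy_fit <- function(x, lambda = NULL, nlambda = 5, lambda.min.ratio = 0.01) {
  lambda_at_entry <- lambda  # the value a debugger shows right after entry
  if (is.null(lambda)) {
    # Made-up data-derived grid: log-spaced from a maximum down to a fraction of it.
    lambda_max <- max(abs(x))
    lambda <- exp(seq(log(lambda_max), log(lambda_max * lambda.min.ratio),
                      length.out = nlambda))
  }
  list(lambda_at_entry = lambda_at_entry, lambda = lambda)
}

set.seed(1)
x <- rnorm(100)

# Call without lambda: NULL on entry, grid built inside the body.
default_call <- toy_fit(x)
is.null(default_call$lambda_at_entry)  # TRUE
length(default_call$lambda)            # 5

# Call with an explicit grid: the supplied vector is already there on entry.
grid <- 10^seq(10, -2, length.out = 100)
explicit_call <- toy_fit(x, lambda = grid)
identical(explicit_call$lambda_at_entry, grid)  # TRUE
```

In a real debugging session the same effect means you may have to step further into the body (or into the function it calls) before the computed value becomes visible.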
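Answer 1 (score: 3)
I am surprised nobody has posted these yet, but the obvious functions are args and formals:
args shows only the "top" of the function, without its body, unlike typing cv.glmnet at the prompt: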
> args(cv.glmnet)
function (x, y, weights, offset = NULL, lambda = NULL, type.measure = c("mse",
"deviance", "class", "auc", "mae"), nfolds = 10, foldid,
grouped = TRUE, keep = FALSE, parallel = FALSE, ...)
NULL
formals returns these arguments as a list:
> formals(cv.glmnet)
$x
$y
$weights
$offset
NULL
$lambda
NULL
$type.measure
c("mse", "deviance", "class", "auc", "mae")
$nfolds
[1] 10
$foldid
$grouped
[1] TRUE
$keep
[1] FALSE
$parallel
[1] FALSE
$...
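Because formals returns a list, individual defaults can be picked out programmatically. The following is a small sketch using a hypothetical function, toy_fit, defined only for this example. It also shows the limitation relevant to the question: a NULL default tells you nothing about a value that is computed later in the body.

```r
# Hypothetical function used only for illustration; its body does nothing.
toy_fit <- function(x, y, lambda = NULL, nlambda = 100, standardize = TRUE) {
  invisible(NULL)
}

f <- formals(toy_fit)

arg_names <- names(f)                   # argument names, in declaration order
nlambda_default <- f$nlambda            # 100
has_lambda <- "lambda" %in% arg_names   # TRUE: the argument exists ...
lambda_default <- f$lambda              # ... but its declared default is NULL

print(arg_names)
print(nlambda_default)
print(is.null(lambda_default))
```

Checking names(f) is needed here because f$lambda returns NULL both when the default is NULL and when the argument does not exist at all.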
Answer 2 (score: 0)
You can always type a function's name and press Enter to see its source code. In the example you gave, lambda defaults to NULL.
cv.glmnet
## function (x, y, weights, offset = NULL, lambda = NULL, type.measure = c("mse",
## "deviance", "class", "auc", "mae"), nfolds = 10, foldid,
## grouped = TRUE, keep = FALSE, parallel = FALSE, ...)
## {
## if (missing(type.measure))
## type.measure = "default"
## else type.measure = match.arg(type.measure)
## if (!is.null(lambda) && length(lambda) < 2)
## stop("Need more than one value of lambda for cv.glmnet")
## N = nrow(x)
## if (missing(weights))
## weights = rep(1, N)
## else weights = as.double(weights)
## y = drop(y)
## glmnet.call = match.call(expand.dots = TRUE)
## which = match(c("type.measure", "nfolds", "foldid", "grouped",
## "keep"), names(glmnet.call), F)
## if (any(which))
## glmnet.call = glmnet.call[-which]
## glmnet.call[[1]] = as.name("glmnet")
## glmnet.object = glmnet(x, y, weights = weights, offset = offset,
## lambda = lambda, ...)
## glmnet.object$call = glmnet.call
## is.offset = glmnet.object$offset
## lambda = glmnet.object$lambda
## if (inherits(glmnet.object, "multnet")) {
## nz = predict(glmnet.object, type = "nonzero")
## nz = sapply(nz, function(x) sapply(x, length))
## nz = ceiling(apply(nz, 1, median))
## }
## else nz = sapply(predict(glmnet.object, type = "nonzero"),
## length)
## if (missing(foldid))
## foldid = sample(rep(seq(nfolds), length = N))
## else nfolds = max(foldid)
## if (nfolds < 3)
## stop("nfolds must be bigger than 3; nfolds=10 recommended")
## outlist = as.list(seq(nfolds))
## if (parallel && require(foreach)) {
## outlist = foreach(i = seq(nfolds), .packages = c("glmnet")) %dopar%
## {
## which = foldid == i
## if (is.matrix(y))
## y_sub = y[!which, ]
## else y_sub = y[!which]
## if (is.offset)
## offset_sub = as.matrix(offset)[!which, ]
## else offset_sub = NULL
## glmnet(x[!which, , drop = FALSE], y_sub, lambda = lambda,
## offset = offset_sub, weights = weights[!which],
## ...)
## }
## }
## else {
## for (i in seq(nfolds)) {
## which = foldid == i
## if (is.matrix(y))
## y_sub = y[!which, ]
## else y_sub = y[!which]
## if (is.offset)
## offset_sub = as.matrix(offset)[!which, ]
## else offset_sub = NULL
## outlist[[i]] = glmnet(x[!which, , drop = FALSE],
## y_sub, lambda = lambda, offset = offset_sub,
## weights = weights[!which], ...)
## }
## }
## fun = paste("cv", class(glmnet.object)[[1]], sep = ".")
## cvstuff = do.call(fun, list(outlist, lambda, x, y, weights,
## offset, foldid, type.measure, grouped, keep))
## cvm = cvstuff$cvm
## cvsd = cvstuff$cvsd
## cvname = cvstuff$name
## out = list(lambda = lambda, cvm = cvm, cvsd = cvsd, cvup = cvm +
## cvsd, cvlo = cvm - cvsd, nzero = nz, name = cvname, glmnet.fit = glmnet.object)
## if (keep)
## out = c(out, list(fit.preval = cvstuff$fit.preval, foldid = foldid))
## lamin = if (type.measure == "auc")
## getmin(lambda, -cvm, cvsd)
## else getmin(lambda, cvm, cvsd)
## obj = c(out, as.list(lamin))
## class(obj) = "cv.glmnet"
## obj
## }
## <environment: namespace:glmnet>
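With a listing this long, it can help to search the source for the argument of interest instead of reading it top to bottom. The sketch below does this with deparse and grep on a hypothetical function, toy_cv, whose grid formula is made up for illustration and is not taken from glmnet.

```r
# Hypothetical wrapper, loosely shaped like the listing above; not glmnet code.
toy_cv <- function(x, lambda = NULL) {
  if (is.null(lambda)) {
    lambda <- max(abs(x)) * c(1, 0.1, 0.01)  # made-up data-derived grid
  }
  out <- list(lambda = lambda)
  out
}

# Turn the function into lines of source text and keep those mentioning lambda.
src <- deparse(toy_cv)
hits <- grep("lambda", src, value = TRUE, fixed = TRUE)
print(hits)

# The search shows that the grid is stored in the result, so it can be read back.
res <- toy_cv(c(-2, 1))
print(res$lambda)
```

Applied to the listing above, the same search would surface the lines lambda = glmnet.object$lambda and out = list(lambda = lambda, ...). This suggests that, at least in the glmnet version shown, the grid actually used is returned in the result, so cv.out$lambda in the question's example should contain it.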