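I'm new to glmnet and have been experimenting with the penalty.factor option. The vignette says that it "Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model." The longer PDF document has code. I therefore expected that running a regression with intercept = TRUE and no constant in x would give the same result as intercept = FALSE with a constant column in x whose penalty.factor is 0. But the code below shows that this is not the case: the latter has an intercept of 0, and its other two coefficients are about 20% larger than in the former.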
library("glmnet")
set.seed(7)
# penalty for the intercept
intercept_penalty <- 0
# Simulate data with 2 features
num_regressors <- 2
num_observations <- 100
X <- matrix(rnorm(num_regressors * num_observations),
            ncol = num_regressors,
            nrow = num_observations)
# Add an intercept in the right-hand side matrix: X1 = (intercept + X)
X1 <- cbind(matrix(1, ncol = 1, nrow = num_observations), X)
# Set random parameters for the features
beta <- runif(1 + num_regressors)
# Generate observations for the left-hand side
Y <- X1 %*% beta + rnorm(num_observations) / 10
# run OLS
ols <- lm(Y ~ X)
coef_ols <- coef(ols)
# Run glmnet with an intercept in the command, not in the matrix
fit <- glmnet(y = Y,
              x = X,
              intercept = TRUE,
              penalty.factor = rep(1, num_regressors),
              lambda = 0)
coef_intercept_equal_true <- coef(fit)
# run glmnet with an intercept in the matrix with a penalty
# factor of intercept_penalty for the intercept and 1 for the rest
fit_intercept_equal_false <- glmnet(y = Y,
                                    x = X1,
                                    intercept = FALSE,
                                    penalty.factor = c(intercept_penalty, rep(1, num_regressors)),
                                    lambda = 0)
coef_intercept_equal_false <- coef(fit_intercept_equal_false)
# Compare all three methods in a data frame
# For lasso_intercept_equal_false, the index starts at 2 because
# position 1 is reserved for intercepts, which is missing in this case
comparison <- data.frame(original = beta,
                         ols = coef_ols,
                         lasso_intercept_equal_true = coef_intercept_equal_true[1:length(coef_intercept_equal_true)],
                         lasso_intercept_equal_false = coef_intercept_equal_false[2:length(coef_intercept_equal_false)]
)
comparison$difference <- comparison$lasso_intercept_equal_false - comparison$lasso_intercept_equal_true
comparison
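Furthermore, the discrepancy in this example is the same for any penalty factor on the intercept term, whether intercept_penalty equals 0, 1, 3000, -10, etc. The discrepancy is similar with a positive penalty, e.g. lambda = 0.01.
If this is not a bug, what is the correct usage of penalty.factor?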
Answer 0 (score: 2)
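I contacted the author, who confirmed that this is a bug and added that it is on his list of bug fixes. In the meantime, a workaround is to center the regressors, e.g. with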
fit_centered <- glmnet(y = Y,
                       x = scale(X1, center = TRUE, scale = FALSE),
                       intercept = FALSE,
                       lambda = 0)
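In this case, the penalty factor doesn't matter. Here is a revised script that compares OLS, LASSO with an intercept, LASSO without an intercept, and LASSO with centered regressors: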
library("glmnet")
set.seed(7)
# Simulate data with 2 features
num_regressors <- 2
num_observations <- 100
X <- matrix(rnorm(num_regressors * num_observations),
            ncol = num_regressors,
            nrow = num_observations)
# Add an intercept in the right-hand side matrix: X1 = (intercept + X)
X1 <- cbind(matrix(1, ncol = 1, nrow = num_observations), X)
# Set random parameters for the features
beta <- runif(1 + num_regressors)
# Generate observations for the left-hand side
Y <- X1 %*% beta + rnorm(num_observations) / 10
# run OLS
ols <- lm(Y ~ X)
coef_ols <- coef(ols)
# Run glmnet with an intercept in the command, not in the matrix
fit <- glmnet(y = Y,
              x = X,
              intercept = TRUE,
              penalty.factor = rep(1, num_regressors),
              lambda = 0)
coef_intercept <- coef(fit)
# Run glmnet with the constant column in the matrix and intercept = FALSE
# (no penalty.factor is passed, so the default penalty factors apply)
fit_no_intercept <- glmnet(y = Y,
                           x = X1,
                           intercept = FALSE,
                           lambda = 0)
coef_no_intercept <- coef(fit_no_intercept)
# Run glmnet with the constant column in the matrix, intercept = FALSE,
# and column-centered regressors (default penalty factors).
# If x is centered, it works (even though y is not centered). Center it with:
# X1 - matrix(colMeans(X1), nrow = num_observations, ncol = 1 + num_regressors, byrow = TRUE)
# or with
# X1_centered <- scale(X1, center = TRUE, scale = FALSE)
fit_centered <- glmnet(y = Y,
                       x = scale(X1, center = TRUE, scale = FALSE),
                       intercept = FALSE,
                       lambda = 0)
coef_centered <- coef(fit_centered)
# Compare all four methods in a data frame
# For lasso_no_intercept and lasso_centered, the index starts at 2 because
# position 1 of coef() holds glmnet's own intercept (0 when intercept = FALSE);
# position 2 is the coefficient on the constant column of X1
comparison <- data.frame(ols = coef_ols,
                         lasso_intercept = coef_intercept[1:length(coef_intercept)],
                         lasso_no_intercept = coef_no_intercept[2:length(coef_no_intercept)],
                         lasso_centered = coef_centered[2:length(coef_centered)]
)
comparison$diff_intercept <- comparison$lasso_intercept - comparison$lasso_no_intercept
comparison$diff_centered <- comparison$lasso_centered - comparison$lasso_intercept
comparison
Output:
                  ols lasso_intercept lasso_no_intercept lasso_centered diff_intercept diff_centered
(Intercept) 0.9748302       0.9748302          0.0000000      0.0000000      0.9748302 -9.748302e-01
X1          0.6559541       0.6559541          0.7974851      0.6559541     -0.1415309  2.220446e-16
X2          0.7986957       0.7986957          0.9344306      0.7986957     -0.1357348  4.440892e-16
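For the LASSO with centered regressors, the estimated intercept is 0, but the other coefficients are the same as for the LASSO with an intercept.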
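Since the centered fit reports an intercept of 0, the intercept has to be recovered separately if it is needed. As a minimal sketch of the algebra behind the workaround, the following uses only base R (lm instead of glmnet, so it illustrates the unpenalized lambda = 0 case rather than glmnet's own behavior). The simulated coefficients c(1, 0.5, 0.8) and the variable names are illustrative assumptions, not part of the original script.

```r
set.seed(7)
n <- 100
X <- matrix(rnorm(2 * n), ncol = 2)
# Illustrative true coefficients (assumption): intercept 1, slopes 0.5 and 0.8
Y <- drop(cbind(1, X) %*% c(1, 0.5, 0.8) + rnorm(n) / 10)

# Reference fit: OLS with an intercept
coef_ref <- unname(coef(lm(Y ~ X)))

# Centered regressors, no intercept; Y is left uncentered, as in the workaround
X_centered <- scale(X, center = TRUE, scale = FALSE)
slopes <- unname(coef(lm(Y ~ X_centered - 1)))

# Recover the intercept from the sample means of Y and X
intercept <- mean(Y) - sum(colMeans(X) * slopes)

# Compare the recovered coefficients with the reference fit
all.equal(c(intercept, slopes), coef_ref)
```

The slopes agree because every centered column is orthogonal to the constant vector, so leaving out the intercept does not change them. The same mean-based formula should apply to the slopes from fit_centered above, under the assumption that they match the intercept fit as the output table suggests.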