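Following up on my Cross Validated question "glmnet: which is the reference category or class in multinomial regression?", can someone explain how to set the reference category for multinomial logistic regression in glmnet?
Although glmnet is widely used to apply shrinkage methods (ridge, lasso, etc.), neither its documentation nor the glmnet forum answers this question.
Thanks in advance.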
Answer 0 (score: 1)
Well, no, you can't do this inside the glmnet function itself, but you can easily do it beforehand by building the design matrix with model.matrix:
a <- factor(rep(c("cat1", "cat2", "cat3", "no-cat"), 50))  # make a factor
a <- relevel(a, ref = "no-cat")  # move "no-cat" to the front, because model.matrix always
# uses the first level as the reference category (assigning to levels(a) would
# rename the levels instead of reordering them)
df <- data.frame(a)  # put the factor in a data frame
dummy_a <- model.matrix(~ a, data = df)  # make dummies for the factor
# Note: the first level in levels(a) is excluded, i.e.
# it becomes the reference category
cat_dummified <- dummy_a[, 2:4]  # the first column is the intercept, i.e. a column of 1s,
# which we exclude here
> head(cat_dummified)
  acat1 acat2 acat3
1     1     0     0
2     0     1     0
3     0     0     1
4     0     0     0
5     1     0     0
6     0     1     0
> class(cat_dummified)
[1] "matrix"
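One thing to watch when changing the reference level: assigning to levels(a) renames the existing levels in place, whereas relevel() (or factor(a, levels = ...)) only changes their order. A minimal base-R sketch of the difference; the names `renamed` and `reordered` are illustrative and not part of the original answer:

```r
# Illustrative sketch; `renamed` and `reordered` are hypothetical names.
a <- factor(rep(c("cat1", "cat2", "cat3", "no-cat"), 50))

# Assigning to levels() relabels the data: every "cat1" observation becomes
# "no-cat", every "cat2" becomes "cat1", and so on.
renamed <- a
levels(renamed) <- c("no-cat", "cat1", "cat2", "cat3")

# relevel() only moves "no-cat" to the front; the observations are unchanged.
reordered <- relevel(a, ref = "no-cat")

as.character(a[1:4])          # "cat1"   "cat2" "cat3" "no-cat"
as.character(renamed[1:4])    # "no-cat" "cat1" "cat2" "cat3"
as.character(reordered[1:4])  # "cat1"   "cat2" "cat3" "no-cat"
levels(reordered)             # "no-cat" "cat1" "cat2" "cat3"
```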
cat_dummified is a plain numeric matrix, so it can be passed directly to the glmnet function.
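This way you have only 3 dummy variables, each with its own coefficient, and those coefficients are interpreted relative to the reference category, no-cat.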
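To make that concrete, here is a rough sketch of passing such a dummy matrix to glmnet. The response `y` is simulated purely for illustration, `lambda = 0.05` is an arbitrary choice, and the fit is skipped if the glmnet package is not installed:

```r
# Sketch only: `y` is a made-up response and lambda = 0.05 is arbitrary.
a <- relevel(factor(rep(c("cat1", "cat2", "cat3", "no-cat"), 50)), ref = "no-cat")
x <- model.matrix(~ a)[, -1]  # drop the intercept column; glmnet fits its own

set.seed(1)
y <- rnorm(length(a)) + 2 * (a == "cat1")  # hypothetical: cat1 shifted up by 2

if (requireNamespace("glmnet", quietly = TRUE)) {
  # alpha = 1 requests the lasso penalty, fitted at a single penalty value
  fit <- glmnet::glmnet(x, y, alpha = 1, lambda = 0.05)
  # Rows: intercept, acat1, acat2, acat3. There is no row for "no-cat".
  print(coef(fit))
}
```

The coefficient table has no entry for no-cat, since that level is absorbed into the intercept.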
Hope this helps!