Predictor values corresponding to given probability levels in logistic regression in R when the predictor may be a multi-level factor

Time: 2019-06-20 17:56:36

Tags: r plot line logistic-regression

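Full question (> 150 characters):
How do I handle the predictor values corresponding to the probability levels p = 0.25, 0.50, 0.75 in logistic regression when the predictor is a factor (categorical) variable that may have multiple levels?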

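With a continuous predictor, it is easy to obtain the predictor values that correspond to the probability levels p = 0.25, 0.50, 0.75 in logistic regression. See: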

df <- data.frame(hour=c(0.50,0.75,1.00,1.25,1.50,1.75,1.75,2.00,2.25,2.50,2.75,3.00,3.25,3.50,4.00,4.25,4.50,4.75,5.00,5.50), pass=c(0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,1,1,1,1,1))

df

df$pass <- as.factor(df$pass)
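# Use bare column names in the formula: with data = df, writing df$pass ~ df$hour makes predict() ignore newdata
my_fit <- glm(pass ~ hour, data = df, na.action = na.exclude, family = "binomial")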
summary(my_fit)

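my_table <- summary(my_fit)
# plogis() is base R's inverse logit (invlogit() is not in base R; it needs e.g. the arm package).
# Only the transformed intercept reads as a probability (P(pass) at hour = 0); the transformed slope does not.
my_table$coefficients[,1] <- plogis(coef(my_fit))
my_table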

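# pass is a factor at this point, so convert it back to 0/1 before plotting the observed outcomes
plot(df$hour, as.numeric(as.character(df$pass)), xlab="hour", ylab="pass (0/1)")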

LinearPredictions <- predict(my_fit); LinearPredictions

EstimatedProbability.hat <- exp(LinearPredictions)/(1 + exp(LinearPredictions))
EstimatedProbability.hat

EstimatedProbability <- c(0.25, 0.50, 0.75) # Estimated probabilities for which their x levels are wanted to be found

HoursStudied <- (log(EstimatedProbability/(1- EstimatedProbability)) - my_fit$coefficients[1])/ my_fit$coefficients[2]
HoursStudied.summary <- data.frame(EstimatedProbability, HoursStudied)
HoursStudied.summary
#  EstimatedProbability HoursStudied
#1                 0.25     1.979936
#2                 0.50     2.710083
#3                 0.75     3.440230
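
The HoursStudied line above is the fitted model solved for x. As a short worked equation, writing \beta_0 and \beta_1 for the intercept and slope (symbols introduced here for illustration, they do not appear in the code):

```latex
\log\frac{p}{1-p} = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
x_p = \frac{\log\frac{p}{1-p} - \beta_0}{\beta_1}
```

For p = 0.50 the log-odds term is zero, so x_{0.5} = -\beta_0 / \beta_1, which is the 2.710083 in the table.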

So, on the logistic regression plot, I add the horizontal lines y = 0.25, y = 0.50, y = 0.75 and the vertical lines x = 1.98, x = 2.71, x = 3.44.
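
A minimal base-graphics sketch of that plot, assuming the same data and model as above. The names `fit`, `p_levels`, `x_at_p` and `grid` are introduced here for illustration, and `pass` is kept numeric (0/1) so it can be plotted directly:

```r
# Same data as in the question; pass is left as numeric 0/1 here.
df <- data.frame(
  hour = c(0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
           2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50),
  pass = c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1)
)
fit <- glm(pass ~ hour, data = df, family = binomial)

# Probability levels of interest and the hour values where the fitted curve reaches them.
p_levels <- c(0.25, 0.50, 0.75)
x_at_p <- unname((qlogis(p_levels) - coef(fit)[1]) / coef(fit)[2])

# Fitted probabilities on a fine grid, used to draw the logistic curve.
grid <- data.frame(hour = seq(0, 6, length.out = 200))
grid$prob <- predict(fit, newdata = grid, type = "response")

plot(df$hour, df$pass, pch = 16, xlab = "hour", ylab = "P(pass)")
lines(grid$hour, grid$prob)
abline(h = p_levels, lty = 2, col = "grey40")  # horizontal lines at the probability levels
abline(v = x_at_p, lty = 3, col = "grey40")    # vertical lines at the matching hour values
```

Each dashed horizontal line should cross the curve exactly where the corresponding dotted vertical line sits, since `x_at_p` is obtained by inverting the fitted logit.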

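But how can I add the horizontal lines y = 0.25, y = 0.50, y = 0.75 and their corresponding vertical lines to a logistic regression plot (via plot or ggplot) when the predictor is a factor variable, possibly with more than two levels? Or is doing this not logical at all?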
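
To make the factor case concrete, here is a minimal sketch on made-up data (the data frame `toy`, its column `group` and the model `fit_f` are hypothetical, not from my real data). With a single factor predictor the model gives one fitted probability per level, so there is no continuous x axis along which to solve for p = 0.25, 0.50, 0.75. The most I can see to do is draw the per-level probabilities against horizontal reference lines:

```r
# Hypothetical data: a three-level factor and a 0/1 outcome (8 observations per level).
toy <- data.frame(
  group = factor(rep(c("low", "mid", "high"), each = 8),
                 levels = c("low", "mid", "high")),
  pass  = c(0, 0, 0, 0, 0, 0, 0, 1,   # low:  1 of 8 pass
            0, 0, 0, 0, 1, 1, 1, 1,   # mid:  4 of 8 pass
            0, 1, 1, 1, 1, 1, 1, 1)   # high: 7 of 8 pass
)
fit_f <- glm(pass ~ group, data = toy, family = binomial)

# One fitted probability per factor level.
lv <- data.frame(group = factor(levels(toy$group), levels = levels(toy$group)))
lv$prob <- predict(fit_f, newdata = lv, type = "response")
lv

# Points at the per-level probabilities, with the reference probability levels as dashed lines.
plot(seq_along(lv$prob), lv$prob, xaxt = "n", ylim = c(0, 1), pch = 16,
     xlab = "group", ylab = "fitted P(pass)")
axis(1, at = seq_along(lv$prob), labels = as.character(lv$group))
abline(h = c(0.25, 0.50, 0.75), lty = 2, col = "grey40")
```

In this sketch a level only lands on a reference line if its fitted probability happens to equal that value, which is why I am unsure whether vertical lines mean anything for a factor predictor.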

0 answers:

No answers yet