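I am trying to estimate a logistic regression in R, computing everything by hand. I can write the logit and log-likelihood functions, but I cannot get a nonlinear solver to fit the model.
I would appreciate any advice.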
library(readr)  # provides read_csv()
df <- read_csv("http://courses.atlas.illinois.edu/spring2016/STAT/STAT200/RProgramming/data/Default.csv")
df
df$default = ifelse(df$default == "Yes", 1, 0)
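# Inverse logit: P(y = 1 | x) for intercept b0 and slope b1
logit <- function(x, b0, b1) {
  1/(1 + exp(-b0 - b1*x))
}
# Bernoulli log-likelihood of (b0, b1) given responses y and predictor x
Loglikel <- function(y, x, b0, b1) {
  b0 = rep(b0, length(y))
  b1 = rep(b1, length(y))
  p <- logit(x, b0, b1)
  sum(y*log(p) + (1 - y)*log(1 - p))
}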
Loglikel(df$default, df$balance, -10, 0.005)
library(stats4)
mle(Loglikel,
    start = list(b0 = 0, b1 = 0),
    fixed = list(y = df$default, x = df$balance))
Answer 0 (score: 1)
I took your code and modified it slightly so that the parameters are passed as a vector:
library(readr)  # provides read_csv()
df <- read_csv("http://courses.atlas.illinois.edu/spring2016/STAT/STAT200/RProgramming/data/Default.csv")
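# Note: this codes the response from the `student` column, whereas the question used df$default == "Yes".
df$default <- ifelse(df$student == "Yes", 1, 0)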
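# Inverse logit: P(y = 1 | x) for intercept b0 and slope b1
logit <- function(x, b0, b1) {
  1/(1 + exp(-b0 - b1*x))
}
# Log-likelihood with the parameters packed into one vector: par[1] = b0, par[2] = b1
Loglikel <- function(par, y, x) {
  p <- logit(x, par[1], par[2])
  sum(y*log(p) + (1 - y)*log(1 - p))
}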
We are now ready to use a nonlinear solver (for example nlm) to obtain the parameter estimates:
nlm_fit <- nlm(Loglikel, p = c(-2,0.001), x=df$balance, y=df$default)
This gives
> nlm_fit
...
$estimate
[1] -2.0002960 -0.2666521
...
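Since nlm minimizes whatever function it is given, a common pattern is to hand it the negative log-likelihood. The following is a minimal sketch of that pattern on simulated data rather than on the Default data set; the sample size, the true coefficients (-1 and 2), and the name negloglik are assumptions made for illustration only.

```r
# Simulated data; n and the true coefficients (-1, 2) are illustrative assumptions.
set.seed(1)
n <- 2000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 2 * x))

# Negative log-likelihood of the logistic model, par = c(b0, b1).
# Written as log(1 + exp(eta)) - y * eta, which avoids taking log(0)
# when a fitted probability rounds to exactly 0 or 1.
negloglik <- function(par, y, x) {
  eta <- par[1] + par[2] * x
  sum(log1p(exp(eta)) - y * eta)
}

# nlm() minimizes its first argument; extra named arguments are passed on to it.
fit <- nlm(negloglik, p = c(0, 0), y = y, x = x)
fit$estimate  # should land near the simulated coefficients
```

With a predictor on a much larger scale than the intercept (such as balance), rescaling the predictor before optimizing may help the solver converge; that is a general suggestion, not something verified on this data set.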
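nlm uses a Newton-type algorithm to minimize the function it is given, so for maximum likelihood it should be passed the negative log-likelihood (Loglikel above returns the log-likelihood itself). glm, by contrast, uses iteratively reweighted least squares (IRLS). Because the two take different numerical routes, the outputs of glm and nlm need not coincide exactly, although both target the same maximum likelihood estimate when nlm is given the negative log-likelihood: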
glm_fit <- glm(default ~ balance, family = binomial(link="logit"), data = df)
> glm_fit
Call: glm(formula = default ~ balance, family = binomial(link = "logit"),
data = df)
Coefficients:
(Intercept) balance
-1.7004224 0.0009409
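The question's original tool, mle from stats4, can be checked against glm in the same spirit. The sketch below uses simulated data (the sample size and true coefficients are assumptions) and a hypothetical function nll that returns the negative log-likelihood, which is what mle expects as its first argument; the data are picked up from the enclosing environment rather than passed through fixed.

```r
library(stats4)

# Simulated data; n and the true coefficients (-1, 2) are illustrative assumptions.
set.seed(1)
n <- 2000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 2 * x))

# Negative log-likelihood with one named argument per parameter.
# x and y are found in the enclosing (global) environment.
nll <- function(b0 = 0, b1 = 0) {
  eta <- b0 + b1 * x
  sum(log1p(exp(eta)) - y * eta)
}

mle_fit <- mle(nll, start = list(b0 = 0, b1 = 0))
glm_fit <- glm(y ~ x, family = binomial(link = "logit"))

# Side-by-side comparison of the two sets of estimates.
cbind(mle = coef(mle_fit), glm = coef(glm_fit))
```

On well-behaved data like this, the two columns should agree to several decimal places, since both procedures are looking for the same maximizer.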
See this link for a good summary of what goes on inside glm.
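To make the IRLS idea concrete, here is a minimal hand-written sketch for the logistic model on simulated data. It is an illustration under stated assumptions (simulated data, a fixed iteration cap of 25, a simple coefficient-change stopping rule), not a reproduction of glm's actual code, which handles starting values, weights, offsets, and convergence checks differently.

```r
# Simulated data; n and the true coefficients (-1, 2) are illustrative assumptions.
set.seed(1)
n <- 2000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 2 * x))

X <- cbind(1, x)   # design matrix with an intercept column
beta <- c(0, 0)    # starting values

for (iter in 1:25) {
  eta <- drop(X %*% beta)      # linear predictor
  mu  <- plogis(eta)           # fitted probabilities
  w   <- mu * (1 - mu)         # IRLS weights for the logit link
  z   <- eta + (y - mu) / w    # working response
  # Weighted least squares step: solve (X' W X) beta = X' W z
  beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}

beta
coef(glm(y ~ x, family = binomial))  # for comparison
```

Each pass is a weighted least squares fit of the working response on X, which for the logit link coincides with a Newton-Raphson step on the log-likelihood.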