Question

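I want to run a simple logistic regression in Python (1 dependent variable, 1 independent variable). All the documentation I have seen on logistic regression in Python is about using it to build a predictive model. I would like to use it more from the statistics side. How do I find the odds ratio, p-value, and confidence interval of a simple logistic regression in Python?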

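from sklearn.linear_model import LogisticRegression

X = df[[predictor]]  # double brackets: sklearn expects a 2-D feature array
y = df[binary_outcome]

model = LogisticRegression()
model.fit(X, y)

print(...)  # placeholder for the model statistics I want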

The ideal output would be the odds ratio, p-value, and confidence interval.

Answer 1

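I assume you are using LogisticRegression() from sklearn. You cannot get p-values or confidence intervals from it. You can use statsmodels instead. Also note that statsmodels without formulas behaves a bit differently from sklearn (see the comment by @Josef), so you need to add an intercept with sm.add_constant():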

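import numpy as np
import statsmodels.api as sm

# Simulated data: a random binary outcome and one continuous predictor
y = np.random.choice([0,1],50)
x = np.random.normal(0,1,50)

# add_constant() adds the intercept column, which GLM does not include by default
model = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial())
results = model.fit()
results.summary()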

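                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                      y   No. Observations:                   50
Model:                            GLM   Df Residuals:                       48
Model Family:                Binomial   Df Model:                            1
Link Function:                  logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -33.125
Date:                Sat, 09 Jan 2021   Deviance:                       66.250
Time:                        16:21:51   Pearson chi2:                     50.1
No. Iterations:                     4
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.0908      0.309     -0.294      0.769      -0.696       0.514
x1             0.5975      0.361      1.653      0.098      -0.111       1.306
==============================================================================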

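The coefficients are in log-odds, so you can convert them to odds ratios by exponentiating them. The [0.025 0.975] columns are the 95% confidence interval for the log-odds, and exponentiating those bounds gives the interval for the odds ratio. See the help page for more info.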

Beta coefficients and p-values with l Logistic Regression in Python

1 answer: