I would like to know how to get odds ratios from a fitted logistic regression model in Python statsmodels.
>>> import statsmodels.api as sm
>>> import numpy as np
>>> X = np.random.normal(0, 1, (100, 3))
>>> y = np.random.choice([0, 1], 100)
>>> res = sm.Logit(y, X).fit()
Optimization terminated successfully.
Current function value: 0.683158
Iterations 4
>>> res.summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
                           Logit Regression Results
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:                          Logit   Df Residuals:                       97
Method:                           MLE   Df Model:                            2
Date:                Sun, 05 Jun 2016   Pseudo R-squ.:                0.009835
Time:                        23:25:06   Log-Likelihood:                -68.316
converged:                       True   LL-Null:                       -68.994
                                        LLR p-value:                    0.5073
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0033      0.181     -0.018      0.985        -0.359     0.352
x2             0.0565      0.213      0.265      0.791        -0.362     0.475
x3             0.2985      0.216      1.380      0.168        -0.125     0.723
==============================================================================
"""
>>>
Answer 0 (score: 1)
You can get the odds ratios by exponentiating the fitted coefficients:
np.exp(res.params)
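Exponentiating works because a logit model is linear in the log-odds. As a short sketch (the symbols p, x_j and \beta_j are notation introduced here, not taken from the question):

```latex
\log\frac{p}{1-p} = \sum_j \beta_j x_j
\quad\Longrightarrow\quad
\operatorname{odds}(x) = \frac{p}{1-p} = \exp\Bigl(\sum_j \beta_j x_j\Bigr)

\frac{\operatorname{odds}(x_1,\dots,x_j+1,\dots)}{\operatorname{odds}(x_1,\dots,x_j,\dots)} = e^{\beta_j}
```

So exp(beta_j) is the factor by which the odds change for a one-unit increase in x_j with the other regressors held fixed. For x3 in the summary above, exp(0.2985) is roughly 1.35.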
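To also get the confidence intervals (source):
import pandas as pd

# np.asarray keeps this working whether the model was fit on
# NumPy arrays (as in the question) or on pandas objects
conf = pd.DataFrame(np.asarray(res.conf_int()), columns=['2.5%', '97.5%'])
conf['OR'] = np.asarray(res.params)
print(np.exp(conf))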
Disclaimer: I just compiled the comments on your question into an answer.
Answer 1 (score: 1)
Not sure about statsmodels, but in sklearn you can do:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# x is assumed to be a pandas DataFrame of features and y a binary target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
logisticRegr = LogisticRegression()
logisticRegr.fit(x_train, y_train)
# coef_ has shape (1, n_features) for a binary target, so take row 0
df = pd.DataFrame({'odds_ratio': np.exp(logisticRegr.coef_[0]), 'variable': x.columns.tolist()})
df = df.sort_values('odds_ratio', ascending=False)
df
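One caveat when comparing with statsmodels: scikit-learn's LogisticRegression applies an L2 penalty by default (C=1.0) and fits an intercept, so its exponentiated coefficients generally will not match an unpenalized statsmodels fit. It also does not report standard errors or confidence intervals. The sketch below weakens the penalty with a very large C; the synthetic data, the seed, and the choice C=1e9 are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic data, for illustration only (seed and column names are assumptions)
rng = np.random.default_rng(0)
x = pd.DataFrame(rng.normal(0, 1, (200, 3)), columns=["x1", "x2", "x3"])
y = rng.integers(0, 2, 200)

# A very large C makes the default L2 penalty negligible,
# which approximates an unpenalized maximum-likelihood fit
model = LogisticRegression(C=1e9, max_iter=1000)
model.fit(x, y)

# coef_ has shape (1, n_features) for a binary target, so take row 0
odds = pd.Series(np.exp(model.coef_[0]), index=x.columns, name="odds_ratio")
odds = odds.sort_values(ascending=False)
print(odds)
```

A Series indexed by variable name avoids the list-of-lists step in the snippet above and keeps each odds ratio attached to its feature after sorting.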