Question

我正在研究分类问题，需要逻辑回归方程的系数。我可以在R中找到系数，但是我需要在python中提交项目。我找不到用于在python中学习逻辑回归系数的代码。如何在python中获取系数值？

Answer 1

您将像这样使用它：

from statsmodels.discrete.discrete_model import Logit
from statsmodels.tools import add_constant

x = [...] # Obesrvations
y = [...] # Response variable

x = add_constant(x)
print(Logit(y, x).fit().summary())

Answer 2

sklearn.linear_model.LogisticRegression适合您。参见以下示例：

group(0)

Answer 3

路飞，请记住始终分享您的代码和尝试，以便我们知道您的尝试并为您提供帮助。无论如何，我认为您正在寻找这个：

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]]) #Your x values, for a 2 variable model.
#y = 1 * x_0 + 2 * x_1 + 3 #This is the "true" model
y = np.dot(X, np.array([1, 2])) + 3 #Generating the true y-values
reg = LogisticRegression().fit(X, y) #Fitting the model given your X and y values.
reg.coef_ #Prints an array of all regressor values (b1 and b2, or as many bs as your model has)
reg.intercept_  #Prints value for intercept/b0 
reg.predict(np.array([[3, 5]])) #Predicts an array of y-values with the fitted model given the inputs

Answer 4

statsmodels 库将为您提供系数结果的细分，以及相关的p值以确定其重要性。

使用example的x1和y1变量：

x1_train, x1_test, y1_train, y1_test = train_test_split(x1, y1, random_state=0)

logreg = LogisticRegression().fit(x1_train,y1_train)
logreg

print("Training set score: {:.3f}".format(logreg.score(x1_train,y1_train)))
print("Test set score: {:.3f}".format(logreg.score(x1_test,y1_test)))

import statsmodels.api as sm
logit_model=sm.Logit(y1,x1)
result=logit_model.fit()
print(result.summary())

示例结果：

Optimization terminated successfully.
         Current function value: 0.596755
         Iterations 7
                           Logit Regression Results                           
==============================================================================
Dep. Variable:             IsCanceled   No. Observations:                20000
Model:                          Logit   Df Residuals:                    19996
Method:                           MLE   Df Model:                            3
Date:                Sat, 17 Aug 2019   Pseudo R-squ.:                  0.1391
Time:                        23:58:55   Log-Likelihood:                -11935.
converged:                       True   LL-Null:                       -13863.
                                        LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -2.1417      0.050    -43.216      0.000      -2.239      -2.045
x1             0.0055      0.000     32.013      0.000       0.005       0.006
x2             0.0236      0.001     36.465      0.000       0.022       0.025
x3             2.1137      0.104     20.400      0.000       1.911       2.317
==============================================================================

Answer 5

假设您的X是Pandas DataFrame，而clf是您的Logistic回归模型，则可以通过以下代码行获取特征的名称及其值：

pd.DataFrame(zip(X_train.columns, np.transpose(clf.coef_)), columns=['features', 'coef'])

Answer 6

最后的更正：

pd.DataFrame(zip(X_train.columns, np.transpose(clf.coef_.tolist()[0])), columns=['features', 'coef'])

在python中查找逻辑回归的系数

6 个答案: