Coefficients in sklearn.linear_model.LogisticRegression

Asked: 2018-03-30 10:53:31

Tags: python machine-learning scikit-learn regression logistic-regression

I am watching MIT's open course on Python and data science, 6.0002. Lecture 13 covers logistic regression. Here is the code:

import sklearn.linear_model

def buildModel(examples, toPrint = True):
    featureVecs, labels = [],[]
    for e in examples:
        featureVecs.append(e.getFeatures())
        labels.append(e.getLabel())
    LogisticRegression = sklearn.linear_model.LogisticRegression
    model = LogisticRegression().fit(featureVecs, labels)
    if toPrint:
        print('model.classes_ =', model.classes_)
        print('model.coef_ =', model.coef_)
        for i in range(len(model.coef_)):
            print('For label', model.classes_[1])
            for j in range(len(model.coef_[0])):
                print('   ', Passenger.featureNames[j], '=',
                      model.coef_[0][j])
    return model

model = buildModel(trainingSet, True)

Here is the output:

model.classes_ = ['Died' 'Survived']
model.coef_ = [[ 1.63216725  0.4504459  -0.52476792 -0.03218074 -2.29930577]]
For label Survived
    C1 = 1.63216724529
    C2 = 0.450445901238
    C3 = -0.524767915254
    age = -0.0321807370827
    male gender = -2.29930576947

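I have two questions:

  1. Why does model.coef_ have only one element while model.classes_ has two? Shouldn't their lengths match?

  2. Why are the values in model.coef_ the coefficients for the label 'Survived' and not for the label 'Died'? I mean, if the elements of model.classes_ and model.coef_ are in the same order, the numbers in coef_ should belong to 'Died', right? Because 'Died' comes first.

Thanks!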
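For reference, here is a minimal sketch on made-up data (not the course's Titanic set) that checks both points empirically. The arrays X, y and y3 are hypothetical. The sketch assumes the default LogisticRegression settings and relies on the scikit-learn documentation's statement that coef_ has shape (1, n_features) when the problem is binary. In that case the single row appears to hold the weights for the log-odds of classes_[1], and the other class gets the complementary probability.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, larger values go with 'Survived'.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array(['Died', 'Died', 'Died', 'Survived', 'Survived', 'Survived'])

model = LogisticRegression().fit(X, y)

print('classes_   =', model.classes_)     # labels in sorted order
print('coef_.shape =', model.coef_.shape)  # a single row for a two-class problem

# decision_function is the linear score X @ coef_.T + intercept_.
log_odds = model.decision_function(X)
manual = X @ model.coef_[0] + model.intercept_[0]

# Columns of predict_proba follow classes_, so column 1 belongs to classes_[1].
p_second = model.predict_proba(X)[:, 1]
sigmoid = 1.0 / (1.0 + np.exp(-log_odds))

print('score matches manual formula:', np.allclose(log_odds, manual))
print('P(classes_[1]) is sigmoid of score:', np.allclose(p_second, sigmoid))

# For contrast, a hypothetical three-class target yields one row per class.
y3 = np.array(['a', 'a', 'b', 'b', 'c', 'c'])
model3 = LogisticRegression().fit(X, y3)
print('three-class coef_.shape =', model3.coef_.shape)
```

If this holds, a positive weight raises the probability of classes_[1] ('Survived' here) and lowers that of classes_[0], so a second row of weights for 'Died' would just be the negation of the first.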

0 answers:

No answers yet.