I'm watching MIT's open course on Python and data science, 6.0002. Lecture 13 covers logistic regression. Here is the code:

    import sklearn.linear_model

    # Passenger and trainingSet come from the course code (not shown here).
    def buildModel(examples, toPrint = True):
        # Collect one feature vector and one label per example
        featureVecs, labels = [],[]
        for e in examples:
            featureVecs.append(e.getFeatures())
            labels.append(e.getLabel())
        LogisticRegression = sklearn.linear_model.LogisticRegression
        model = LogisticRegression().fit(featureVecs, labels)
        if toPrint:
            print('model.classes_ =', model.classes_)
            print('model.coef_ =', model.coef_)
            # Print each coefficient next to its feature name
            for i in range(len(model.coef_)):
                print('For label', model.classes_[1])
                for j in range(len(model.coef_[0])):
                    print(' ', Passenger.featureNames[j], '=',
                          model.coef_[0][j])
        return model

    model = buildModel(trainingSet, True)

Here is the output:

    model.classes_ = ['Died' 'Survived']
    model.coef_ = [[ 1.63216725 0.4504459 -0.52476792 -0.03218074 -2.29930577]]
    For label Survived
      C1 = 1.63216724529
      C2 = 0.450445901238
      C3 = -0.524767915254
      age = -0.0321807370827
      male gender = -2.29930576947

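I have two questions:
Why does model.coef_ have only one element while model.classes_ has two? Shouldn't their lengths match?
Why are the values in model.coef_ the coefficients for the label 'Survived' rather than for the label 'Died'? I mean, if the elements of model.classes_ and model.coef_ are in the same order, the numbers in coef_ should be for the label 'Died', right? Because 'Died' comes first.
Thanks!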
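
For what it's worth, the behavior can be reproduced without the Titanic data. The sketch below is only an illustration: the two features, the random seed, and the labeling rule are all made up and not part of the course code. It fits a two-class model with the same string labels, then checks which class the single row of coef_ lines up with by comparing the sigmoid of the linear score against predict_proba.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data (an assumption for illustration): 2 features, string labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Hypothetical rule: a larger first feature makes 'Survived' more likely.
y = np.where(X[:, 0] + 0.5 * rng.normal(size=200) > 0, 'Survived', 'Died')

model = LogisticRegression().fit(X, y)

print('classes_   :', model.classes_)      # labels in sorted order
print('coef_ shape:', model.coef_.shape)   # a single row for two classes

# Sigmoid of the linear score built from the single coef_ row.
score = X @ model.coef_[0] + model.intercept_[0]
p_from_coef = 1.0 / (1.0 + np.exp(-score))

# Compare against the per-class probabilities reported by the model.
proba = model.predict_proba(X)
matches_second = np.allclose(p_from_coef, proba[:, 1])
matches_first = np.allclose(p_from_coef, proba[:, 0])
print('matches P(classes_[1]):', matches_second)
print('matches P(classes_[0]):', matches_first)
```

If this check behaves the way I expect, the single row describes the second entry of classes_, and the probability of the first entry is simply one minus that, which would explain why a two-class model only stores one row.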