Why is sklearn's logistic regression so different from this handwritten one?

Date: 2019-04-14 09:51:49

Tags: python scikit-learn logistic-regression

I am trying to compare the sklearn function with this logistic regression class:

import numpy as np

class LogisticRegression(object):
    def __init__(self, eta=0.1, n_iter=50):
        self.eta = eta        # learning rate
        self.n_iter = n_iter  # number of gradient steps

    def fit(self, X, y):
        X = np.insert(X, 0, 1, axis=1)  # prepend a bias column of ones
        self.w = np.ones(X.shape[1])
        m = X.shape[0]

        for _ in range(self.n_iter):
            output = X.dot(self.w)
            errors = y - self._sigmoid(output)
            self.w += self.eta / m * errors.dot(X)  # batch gradient step
        return self

    def predict(self, X):
        output = np.insert(X, 0, 1, axis=1).dot(self.w)
        return (np.floor(self._sigmoid(output) + .5)).astype(int)  # round to 0/1

    def score(self, X, y):
        return sum(self.predict(X) == y) / len(y)

    def _sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

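As a quick, self-contained sanity check, the class above learns a good boundary when the features are on a small scale. The two-cluster synthetic data below is my own construction, not from the question:

```python
import numpy as np

class LogisticRegression(object):
    def __init__(self, eta=0.1, n_iter=50):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        X = np.insert(X, 0, 1, axis=1)  # bias column
        self.w = np.ones(X.shape[1])
        m = X.shape[0]
        for _ in range(self.n_iter):
            errors = y - self._sigmoid(X.dot(self.w))
            self.w += self.eta / m * errors.dot(X)
        return self

    def predict(self, X):
        output = np.insert(X, 0, 1, axis=1).dot(self.w)
        return (np.floor(self._sigmoid(output) + .5)).astype(int)

    def score(self, X, y):
        return sum(self.predict(X) == y) / len(y)

    def _sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

# Two well-separated Gaussian clusters, all values of magnitude ~2-3
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(100, 2) - 2,   # class 0 around (-2, -2)
               rng.randn(100, 2) + 2])  # class 1 around (+2, +2)
y = np.array([0] * 100 + [1] * 100)

clf = LogisticRegression(eta=0.5, n_iter=200).fit(X, y)
print(clf.score(X, y))  # near-perfect accuracy on this easy data
```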
I have some data:

X = [[1.00000000e+01, 1.00000000e+00, 1.99900000e+03, 1.40000000e+01, 1.42500000e+01,
      1.40000000e+01, 1.41250000e+01, 9.90424100e+00, 3.16240000e+06], ...]
y = [1.0, 1.0, 0.0, ...]

Then I try to use this class:

import collections

log1 = LogisticRegression().fit(X, y)
print(log1.predict(X))
print(collections.Counter(log1.predict(X)))

The result is [0 0 0 ..., 0 0 0], Counter({0: 4899}). Only zeros, which seems strange to me.

When I try sklearn's logistic regression:

from sklearn.linear_model import LogisticRegression

log2 = LogisticRegression()
log2.fit(X,y)
print(log2.predict(X))

I get a different result: [1. 1. 1. ..., 0. 1. 0.], Counter({1.0: 2653, 0.0: 2246}).
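One difference worth noting (my own observation, not stated in the question): scikit-learn's solver is far less sensitive to feature scale and applies L2 regularization by default, whereas the handwritten gradient step can saturate the sigmoid when raw values run as large as 3.16e6. A minimal sketch of standardizing the columns first, done with plain NumPy (equivalent in effect to sklearn's StandardScaler; the sample values below are illustrative, not the question's real data):

```python
import numpy as np

# Illustrative raw rows shaped like the question's data: the huge last
# column (~1e6) dwarfs every other feature in a raw dot product.
X_raw = np.array([[10.0, 1.0, 1999.0, 3.1624e6],
                  [12.0, 0.0, 2001.0, 2.1000e6],
                  [ 9.0, 1.0, 1995.0, 4.0000e6]])

# Column-wise standardization: zero mean, unit variance per feature.
X_scaled = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

print(X_scaled.mean(axis=0))  # each entry is ~0
print(X_scaled.std(axis=0))   # each entry is ~1
```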

Where is the problem?

0 Answers:

There are no answers yet.