Implementing logistic regression - why doesn't this converge?

Asked: 2016-11-09 22:51:21

Tags: python numpy logistic-regression

I'm adapting an existing logistic regression implementation, and I can't figure out what I'm doing wrong.

Here is my implementation:

from scipy.optimize import fmin_bfgs
import numpy as np
import pandas as pd
# With help from http://stackoverflow.com/questions/13794754/logistic-regression-using-scipy
# as well as https://bryantravissmith.com/2015/12/29/implementing-logistic-regression-from-scratch-part-2-python-code/

def sigma(features, weights):
    """returns sigma(<w,x>)"""
    return 1 / (1 + np.exp(-features.dot(weights)))


def log_likelihood(weights, features, labels):
    """calculates -ln p(t|w)"""
    s = sigma(features, weights)
    #s += 1e-24  # pseudocount to prevent logs of 0
    t = labels * np.log(s + 1e-24)
    t2 = (1 - labels) * (np.log((1 - s) + 1e-24))
    ll = (t + t2).sum()
    print(-ll)
    return -ll


def gradient_log_likelihood(weights, features, labels):
    """calculates the gradient (Jacobian) of the log likelihood"""
    error = labels - sigma(features, weights)
    grad = (error * features).sum(axis=0)
    return grad.reshape(grad.shape[0], 1)

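For reference, log_likelihood is meant to compute the negative log likelihood of the labels (writing it out in LaTeX notation, with sigma as the sigmoid):

-\ln p(\mathbf{t} \mid \mathbf{w}) = -\sum_{i=1}^{n} \Big[\, t_i \ln \sigma(\mathbf{w}^\top \mathbf{x}_i) + (1 - t_i) \ln\big(1 - \sigma(\mathbf{w}^\top \mathbf{x}_i)\big) \Big]

and gradient_log_likelihood computes the gradient of the (positive) log likelihood:

\nabla_{\mathbf{w}} \ln p(\mathbf{t} \mid \mathbf{w}) = \sum_{i=1}^{n} \big(t_i - \sigma(\mathbf{w}^\top \mathbf{x}_i)\big)\,\mathbf{x}_i
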
Here is a sample dataset:

labels = np.array([0, 1, 1]).reshape(3, 1)
df = pd.DataFrame.from_dict({'a': [1,2,3], 'b': [2,3,4], 'c': [6,7,8]})

n, m = df.shape
weights = np.zeros(m + 1).reshape(m + 1, 1)  # zero vector of starting weights

# add the intercept column
features = np.ones((n, m + 1))  # make matrix with all 1's
features[:,1:] = df  # replace the 1's in all columns after column 0 with actual data

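Sanity-checking the shapes on this setup, everything appears to line up (a quick check using the objects defined above):

print(labels.shape)    # (3, 1)
print(features.shape)  # (3, 4)
print(weights.shape)   # (4, 1)
print(sigma(features, weights).shape)                             # (3, 1)
print(gradient_log_likelihood(weights, features, labels).shape)  # (4, 1)
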
If I run these functions individually on the starting weight vector, they work. But as soon as I try to optimize, I get a shape error:

optimized = fmin_bfgs(log_likelihood, x0=weights, args=(features, labels), gtol=1e-4, fprime=gradient_log_likelihood)

ValueError                                Traceback (most recent call last)
<ipython-input-26-34c3cde48ac4> in <module>()
----> 1 optimized = fmin_bfgs(log_likelihood, x0=weights, args=(features, labels), gtol=1e-4, fprime=gradient_log_likelihood)

/Users/ifiddes/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in fmin_bfgs(f, x0, fprime, args, gtol, norm, epsilon, maxiter, full_output, disp, retall, callback)
    791             'return_all': retall}
    792
--> 793     res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
    794
    795     if full_output:

/Users/ifiddes/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
    845     else:
    846         grad_calls, myfprime = wrap_function(fprime, args)
--> 847     gfk = myfprime(x0)
    848     k = 0
    849     N = len(x0)

/Users/ifiddes/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in function_wrapper(*wrapper_args)
    287     def function_wrapper(*wrapper_args):
    288         ncalls[0] += 1
--> 289         return function(*(wrapper_args + args))
    290
    291     return ncalls, function_wrapper

<ipython-input-3-9678bc972b41> in gradient_log_likelihood(weights, features, labels)
      2         """calculates the gradient (Jacobian) of the log likelihood"""
      3         error = labels - sigma(features, weights)
----> 4         grad = (error * features).sum(axis=0)
      5         return grad.reshape(grad.shape[0], 1)
      6

ValueError: operands could not be broadcast together with shapes (3,3) (3,4)

1 Answer:

Answer 0 (score: 0)

The problem is in this line:

error = labels - sigma(features, weights)

labels - sigma(features, weights) gets turned from a 3 x 1 vector into a 3 x 3 matrix.

Note that if you print error inside gradient_log_likelihood and call

gradient_log_likelihood(weights, features, labels)

on its own, you get the output:

[[-0.5]
 [ 0.5]
 [ 0.5]]

but if you run the optimization, you get

[[-0.5 -0.5 -0.5]
 [ 0.5  0.5  0.5]
 [ 0.5  0.5  0.5]]

in addition to the ValueError. This is because labels - sigma(features, weights) changed shape.
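The shape change itself is ordinary NumPy broadcasting: if sigma(features, weights) comes back as a flat length-3 array instead of a (3, 1) column, the subtraction silently broadcasts. A minimal demonstration of just that effect:

import numpy as np

labels = np.array([0, 1, 1]).reshape(3, 1)  # column vector, shape (3, 1)
s = np.array([0.5, 0.5, 0.5])               # flat sigmoid output, shape (3,)

error = labels - s    # (3, 1) - (3,) broadcasts to (3, 3)
print(error.shape)    # (3, 3)
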

You can investigate why, but if you just want it fixed, you can pull out the first column: error = (labels - sigma(features, weights)).T[0].reshape(3, 1) gives you the same result when you call

gradient_log_likelihood(weights, features, labels)

directly, and it also works inside the optimization.
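
The underlying reason is that fmin_bfgs flattens x0 to a 1-D array before passing it back to your objective and gradient, so inside the optimizer weights has shape (4,), sigma(features, weights) has shape (3,), and labels - sigma(features, weights) broadcasts (3, 1) - (3,) into (3, 3). A cleaner fix is to keep everything 1-D. Here's a minimal sketch of that rewrite (I've renamed the functions to make the signs explicit; note fprime must be the gradient of the value actually being minimized, i.e. of -ln p(t|w), so the sign is flipped relative to the original gradient_log_likelihood):

import numpy as np
import pandas as pd
from scipy.optimize import fmin_bfgs

def sigma(features, weights):
    """sigma(<w,x>) with a 1-D weight vector; returns a 1-D array."""
    return 1 / (1 + np.exp(-features.dot(weights)))

def neg_log_likelihood(weights, features, labels):
    """-ln p(t|w); labels is 1-D, so no accidental broadcasting."""
    s = sigma(features, weights)
    ll = (labels * np.log(s + 1e-24) + (1 - labels) * np.log(1 - s + 1e-24)).sum()
    return -ll

def neg_gradient(weights, features, labels):
    """Gradient of -ln p(t|w); the sign matches the objective."""
    error = labels - sigma(features, weights)              # shape (n,)
    return -(error[:, np.newaxis] * features).sum(axis=0)  # shape (m + 1,)

labels = np.array([0, 1, 1])  # 1-D, not (3, 1)
df = pd.DataFrame.from_dict({'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [6, 7, 8]})
n, m = df.shape

features = np.ones((n, m + 1))  # intercept column of ones in column 0
features[:, 1:] = df

weights = np.zeros(m + 1)  # 1-D start, same shape fmin_bfgs passes back
optimized = fmin_bfgs(neg_log_likelihood, x0=weights, args=(features, labels),
                      gtol=1e-4, fprime=neg_gradient)

One caveat on the question's title: this three-point dataset is linearly separable, so even with the shapes fixed the unregularized likelihood keeps improving as the weights grow without bound, and fmin_bfgs may exit with a precision warning rather than a textbook convergence message. That's a property of the toy data, not of the code.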