Question

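I use the following formula as my hypothesis:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$

and the following formula as the cost function for one sample:

$$\mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$$

So the objective function I am trying to minimize is:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(h_\theta(x^{(i)}), y^{(i)})$$

and its gradient is:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$$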

The CSV file has the following format:

Y0,X1,X2,X3,...
Y1,X1,X2,X3,...
Y2,X1,X2,X3,...

where y is either 1 or 0 (for classification). The training code is as follows:

import numpy as np
import scipy as sp
from scipy.optimize import fmin_bfgs
import pylab as pl



data = np.genfromtxt('../data/small_train.txt', delimiter=',')
y = data[:,0]
#add 1 as the first column of x, the constant term
x = np.append(np.ones((len(y), 1)), data[:,1:], axis = 1)

#sigmoid hypothesis
def h(theta, x):
    return 1.0/(1+np.exp(-np.dot(theta, x)))

#cost function
def cost(theta, x, y):
    tot = 0
    for i in range(len(y)):
        tot += y[i]*np.log(h(theta, x[i])) + (1-y[i])*(1-np.log(h(theta, x[i])))
    return -tot / len(y)

#gradient

def deviation(theta, x, y):
    def f(theta, x, y, j):
        tot = 0.0
        for i in range(len(y)):
            tot += (h(theta, x[i]) - y[i]) * x[i][j]
        return tot / len(y)
    ret = []
    for j in range(len(x[0])):
        ret.append(f(theta, x, y, j))
    return np.array(ret)


init_theta = np.zeros(len(x[0]))
ret = fmin_bfgs(cost, init_theta, fprime = deviation, args=(x,y))
print(ret)

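I ran the code on a small dataset, but my implementation seems to be wrong. Could you help me? One more question: as you know, fmin_bfgs does not strictly require the fprime argument. What is the difference between providing it and not providing it?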

Answer 1

I would like to correct something in the code above.

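I think the cost function should be as follows (the correction is in the second log term, which becomes np.log(1 - h(theta, x[i])) instead of 1 - np.log(h(theta, x[i]))):

#cost function
def cost(theta, x, y):
    tot = 0
    for i in range(len(y)):
        tot += y[i]*np.log(h(theta, x[i])) + (1-y[i])*np.log(1-h(theta, x[i]))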
    return -tot / len(y)
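
One way to confirm that the corrected cost agrees with the gradient from the question is to compare the analytic gradient against a finite-difference estimate using scipy.optimize.check_grad. The sketch below is only an illustration: the data is synthetic (the original small_train.txt is not available), the loops are replaced by vectorized NumPy expressions, and h is written as np.dot(x, theta) so that it accepts a whole matrix of rows. Those choices are assumptions of this example, not part of the original code.

```python
import numpy as np
from scipy.optimize import check_grad

def h(theta, x):
    # Sigmoid of the linear score; x may be a single row or a matrix of rows.
    return 1.0 / (1.0 + np.exp(-np.dot(x, theta)))

def cost(theta, x, y):
    # Corrected cost: the second term is log(1 - h), not 1 - log(h).
    p = h(theta, x)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def deviation(theta, x, y):
    # Gradient of the cost, same formula as the loops in the question.
    return np.dot(x.T, h(theta, x) - y) / len(y)

# Synthetic stand-in for the CSV data (hypothetical values).
rng = np.random.RandomState(0)
features = rng.randn(50, 2)
x = np.append(np.ones((50, 1)), features, axis=1)  # constant term first
true_theta = np.array([0.5, 1.0, -1.0])            # made up for the example
y = (rng.rand(50) < h(true_theta, x)).astype(float)

# check_grad returns the norm of (analytic gradient - numerical gradient).
theta0 = np.array([0.1, -0.2, 0.3])
grad_error = check_grad(cost, deviation, theta0, x, y)
print(grad_error)
```

A value close to zero suggests that cost and deviation are consistent with each other. Running the same check with the original 1 - np.log(h) term should give a clearly larger discrepancy.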

Please let me know if this works better. Thanks a lot!
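
On the second part of the question: when fprime is omitted, fmin_bfgs approximates the gradient by finite differences, which costs extra evaluations of the cost function and is slightly less accurate than an exact gradient. The sketch below compares the two calls, assuming synthetic data and vectorized versions of the corrected cost and gradient. The data, true_theta, and the variable names are made up for this illustration.

```python
import numpy as np
from scipy.optimize import fmin_bfgs

def h(theta, x):
    # Sigmoid of the linear score; x is a matrix with one sample per row.
    return 1.0 / (1.0 + np.exp(-np.dot(x, theta)))

def cost(theta, x, y):
    # Corrected logistic-regression cost, averaged over the samples.
    p = h(theta, x)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def deviation(theta, x, y):
    # Analytic gradient of the cost.
    return np.dot(x.T, h(theta, x) - y) / len(y)

# Synthetic, non-separable data (hypothetical stand-in for the CSV file).
rng = np.random.RandomState(0)
features = rng.randn(200, 2)
x = np.append(np.ones((200, 1)), features, axis=1)
true_theta = np.array([0.5, 1.0, -1.0])
y = (rng.rand(200) < h(true_theta, x)).astype(float)

init_theta = np.zeros(x.shape[1])

# full_output=True returns (xopt, fopt, gopt, Bopt, func_calls, grad_calls, warnflag).
with_grad = fmin_bfgs(cost, init_theta, fprime=deviation, args=(x, y),
                      full_output=True, disp=False)
without_grad = fmin_bfgs(cost, init_theta, args=(x, y),
                         full_output=True, disp=False)

theta_with, calls_with = with_grad[0], with_grad[4]
theta_without, calls_without = without_grad[0], without_grad[4]

print("theta with fprime:   ", theta_with)
print("theta without fprime:", theta_without)
print("cost evaluations:", calls_with, "vs", calls_without)
```

Both runs should end up at nearly the same theta. The run without fprime typically needs more cost evaluations, and the gap grows with the number of parameters, so supplying an exact gradient mostly matters for speed and numerical precision.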

Python SciPy logistic regression using fmin_bfgs
