How to use fmin_cg in scipy.optimize

Date: 2016-08-08 19:33:16

Tags: python optimization machine-learning scipy gradient

I have been trying to use fmin_cg to minimize the cost function for logistic regression.

from scipy.optimize import fmin_cg

xopt = fmin_cg(costFn, fprime=grad, x0=initial_theta,
               args=(X, y, m), maxiter=400, disp=True, full_output=True)

That is how I call fmin_cg.

This is my costFn:

def costFn(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 0
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J.flatten()

This is my grad:

def grad(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg = 1 / m * (X.T.dot(h-y))
    return gg.flatten()

It throws this error:

/Users/sugethakch/miniconda2/lib/python2.7/site-packages/scipy/optimize/linesearch.pyc in phi(s)
     85     def phi(s):
     86         fc[0] += 1
---> 87         return f(xk + s*pk, *args)
     88 
     89     def derphi(s):

ValueError: operands could not be broadcast together with shapes (3,) (300,) 

I know it has something to do with my dimensions, but I can't figure it out. I'm a newbie, so I might be making an obvious mistake.

I have read this link:

fmin_cg: Desired error not necessarily achieved due to precision loss

However, it doesn't seem to work for me.

Any help?

Update: the sizes of X, y, m, theta:

(100, 3) ----> X

(100, 1) ----> y

100 ----> m

(3, 1) ----> theta

This is how I initialize X, y, m:

import numpy as np
import pandas as pd

data = pd.read_csv('ex2data1.txt', sep=",", header=None)
data.columns = ['x1', 'x2', 'y']                                                       
x1 = data.iloc[:, 0].values[:, None]                                                     
x2 = data.iloc[:, 1].values[:, None]                                                    
y = data.iloc[:, 2].values[:, None]
# join x1 and x2 to make one array of X
X = np.concatenate((x1, x2), axis=1)
m, n = X.shape
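
Note that the snippet above only joins the two feature columns, which gives X a shape of (100, 2), while the sizes listed earlier show X as (100, 3) and theta as (3, 1). The intercept column and initial_theta are not shown in the question, so the following is only a hypothetical reconstruction of that missing setup:

# hypothetical: prepend a column of ones (the intercept term) so X becomes (100, 3)
X = np.concatenate((np.ones((x1.shape[0], 1)), x1, x2), axis=1)
m, n = X.shape                      # m = 100, n = 3
initial_theta = np.zeros((n, 1))    # (3, 1), matching the shapes listed above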

ex2data1.txt:

34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
.....

If it helps, I am trying to re-implement in Python one of the homework assignments from Andrew Ng's ML course on Coursera.

2 Answers:

Answer 0 (score: 1):

In the end, I figured out what the problem in my original program was.

My 'y' was (100, 1), and fmin_cg expects it to be (100,). Once I flattened my 'y', it no longer threw the initial error.
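
The flatten itself is a one-line change before calling fmin_cg (a minimal sketch of the fix just described):

y = y.flatten()  # (100, 1) -> (100,); keeps h - y one-dimensional in costFn and grad

However, even with the shapes fixed, the optimization still did not work: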

 Warning: Desired error not necessarily achieved due to precision loss.
     Current function value: 0.693147
     Iterations: 0
     Function evaluations: 43
     Gradient evaluations: 41

This is the same value I get without any optimization (0.693147 ≈ ln 2, the cost when every prediction is 0.5).

The way I got the optimization to work was to use the 'Nelder-Mead' method. I followed this answer: scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"

import scipy.optimize as op

Result = op.minimize(fun = costFn,
                x0 = initial_theta,
                args = (X, y, m),
                method = 'Nelder-Mead',
                options={'disp': True})#,
                #jac = grad)

This method does not need the Jacobian. I got the result I wanted:

Optimization terminated successfully.
     Current function value: 0.203498
     Iterations: 157
     Function evaluations: 287
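
The optimized parameters can then be read off the returned OptimizeResult object (a small usage sketch; Result.x and Result.fun are standard scipy.optimize attributes):

theta_opt = Result.x     # optimized parameters, shape (3,)
final_cost = Result.fun  # 0.203498, matching the output above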

Answer 1 (score: 0):

Well, since I don't know exactly how you initialize m, X, y, and theta, I had to make some assumptions. Hopefully my answer is relevant:

import numpy as np
from scipy.optimize import fmin_cg
from scipy.special import expit

def costFn(theta, X, y, m):
    # expit is the same as sigmoid, but faster
    h = expit(X.dot(theta))

    # instead of 1/m, I take the mean
    J =  np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J #should be a scalar


def grad(theta, X, y, m):
    h = expit(X.dot(theta))
    J =  np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg =  (X.T.dot(h-y))    
    return gg.flatten()

# initialize matrices
X = np.random.randn(100,3)
y = np.random.randn(100,) #this apparently needs to be a 1-d vector
m = np.ones((3,)) # not using m, used np.mean for a weighted sum (see ali_m's comment)
theta = np.ones((3,1))

xopt = fmin_cg(costFn, fprime=grad, x0=theta, args=(X, y, m), maxiter=400, disp=True, full_output=True )

The code runs, but I don't know your problem well enough to say whether this is what you are looking for. Hopefully it helps you understand the problem better. One way to check your answer is to call fmin_cg with fprime=None and see how the results compare.
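
For example, that check might look like this (a sketch; with fprime=None, fmin_cg approximates the gradient numerically instead of using grad):

# let fmin_cg estimate the gradient by finite differences and compare the two results
xopt_check = fmin_cg(costFn, fprime=None, x0=theta, args=(X, y, m),
                     maxiter=400, disp=True, full_output=True)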