How to use fmin_cg in scipy.optimize

Date: 2016-08-08 19:33:16

Tags: python optimization machine-learning scipy gradient

I have been trying to use fmin_cg to minimize the cost function for logistic regression.

from scipy.optimize import fmin_cg

xopt = fmin_cg(costFn, fprime=grad, x0=initial_theta,
               args=(X, y, m), maxiter=400, disp=True, full_output=True)

That is how I call fmin_cg.

This is my costFn:

def costFn(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 0
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J.flatten()

This is my grad:

def grad(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg = 1 / m * (X.T.dot(h-y))
    return gg.flatten()

It throws this error:

/Users/sugethakch/miniconda2/lib/python2.7/site-packages/scipy/optimize/linesearch.pyc in phi(s)
     85     def phi(s):
     86         fc[0] += 1
---> 87         return f(xk + s*pk, *args)
     88 
     89     def derphi(s):

ValueError: operands could not be broadcast together with shapes (3,) (300,) 

I know it has something to do with my dimensions, but I can't figure it out. I'm a newbie, so I might be making an obvious mistake.

I have read this link:

fmin_cg: Desired error not necessarily achieved due to precision loss

However, it doesn't seem to work for me.

Any help?

Update: the sizes of X, y, m, theta:

(100, 3) ----> X

(100, 1) ----> y

100 ----> m

(3, 1) ----> theta

This is how I initialize X, y, m:

import numpy as np
import pandas as pd

data = pd.read_csv('ex2data1.txt', sep=",", header=None)
data.columns = ['x1', 'x2', 'y']                                                       
x1 = data.iloc[:, 0].values[:, None]                                                     
x2 = data.iloc[:, 1].values[:, None]                                                    
y = data.iloc[:, 2].values[:, None]
# join x1 and x2 to make one array of X
X = np.concatenate((x1, x2), axis=1)
m, n = X.shape
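
Note that the snippet above only joins the two feature columns, which gives X a shape of (100, 2), while the sizes listed earlier show X as (100, 3) and theta as (3, 1). The intercept column and initial_theta are not shown in the question, so the following is only a hypothetical reconstruction of that missing setup:

# hypothetical: prepend a column of ones (the intercept term) so X becomes (100, 3)
X = np.concatenate((np.ones((x1.shape[0], 1)), x1, x2), axis=1)
m, n = X.shape                      # m = 100, n = 3
initial_theta = np.zeros((n, 1))    # (3, 1), matching the shapes listed above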

ex2data1.txt:

34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
.....

If it helps, I am trying to re-implement in Python one of the homework assignments from Andrew Ng's ML course on Coursera.

2 Answers:

Answer 0 (score: 1):

In the end, I figured out what the problem in my original program was.

My 'y' was (100, 1), and fmin_cg expects it to be (100,). Once I flattened my 'y', it no longer threw the initial error.
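
The flatten itself is a one-line change before calling fmin_cg (a minimal sketch of the fix just described):

y = y.flatten()  # (100, 1) -> (100,); keeps h - y one-dimensional in costFn and grad

However, even with the shapes fixed, the optimization still did not work: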

 Warning: Desired error not necessarily achieved due to precision loss.
     Current function value: 0.693147
     Iterations: 0
     Function evaluations: 43
     Gradient evaluations: 41

This is the same value I get without any optimization (0.693147 ≈ ln 2, the cost when every prediction is 0.5).

The way I got the optimization to work was to use the 'Nelder-Mead' method. I followed this answer: scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"

import scipy.optimize as op

Result = op.minimize(fun = costFn,
                x0 = initial_theta,
                args = (X, y, m),
                method = 'Nelder-Mead',
                options={'disp': True})#,
                #jac = grad)

This method does not need the Jacobian. I got the result I wanted:

Optimization terminated successfully.
     Current function value: 0.203498
     Iterations: 157
     Function evaluations: 287
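
The optimized parameters can then be read off the returned OptimizeResult object (a small usage sketch; Result.x and Result.fun are standard scipy.optimize attributes):

theta_opt = Result.x     # optimized parameters, shape (3,)
final_cost = Result.fun  # 0.203498, matching the output above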

Answer 1 (score: 0):

Well, since I don't know exactly how you initialize m, X, y, and theta, I had to make some assumptions. Hopefully my answer is relevant:

import numpy as np
from scipy.optimize import fmin_cg
from scipy.special import expit

def costFn(theta, X, y, m):
    # expit is the same as sigmoid, but faster
    h = expit(X.dot(theta))

    # instead of 1/m, I take the mean
    J =  np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J #should be a scalar


def grad(theta, X, y, m):
    h = expit(X.dot(theta))
    J =  np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg =  (X.T.dot(h-y))    
    return gg.flatten()

# initialize matrices
X = np.random.randn(100,3)
y = np.random.randn(100,) #this apparently needs to be a 1-d vector
m = np.ones((3,)) # not using m, used np.mean for a weighted sum (see ali_m's comment)
theta = np.ones((3,1))

xopt = fmin_cg(costFn, fprime=grad, x0=theta, args=(X, y, m), maxiter=400, disp=True, full_output=True )

The code runs, but I don't know your problem well enough to say whether this is what you are looking for. Hopefully it helps you understand the problem better. One way to check your answer is to call fmin_cg with fprime=None and see how the results compare.
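
For example, that check might look like this (a sketch; with fprime=None, fmin_cg approximates the gradient numerically instead of using grad):

# let fmin_cg estimate the gradient by finite differences and compare the two results
xopt_check = fmin_cg(costFn, fprime=None, x0=theta, args=(X, y, m),
                     maxiter=400, disp=True, full_output=True)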