Error when computing a gradient in Python?

Time: 2018-01-26 12:00:29

Tags: python arrays numpy

I am following the course at http://cs231n.github.io/optimization-1/. In the section "Computing the gradient numerically", they provide a code snippet that should compute the gradient given a function and an array. I tried running it with my own function and a numpy array as input, and I get a ValueError.

I understand the error occurs because a sequence cannot be assigned to grad[ix]. I also tried a column array and got the same error.

Here is the error:

ValueError                                Traceback (most recent call last)
<ipython-input-18-31c1f1d6169c> in <module>()
      2     return a
      3 
----> 4 eval_numerical_gradient(f,np.array([1,2,3,4,5]))

<ipython-input-12-d6bea4220895> in eval_numerical_gradient(f, x)
     28     print(x[ix])
     29     # compute the partial derivative
---> 30     grad[ix] = (fxh - fx) / h # the slope
     31     it.iternext() # step to next dimension
     32 

ValueError: setting an array element with a sequence.
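
The failing assignment can be reproduced in isolation. The following is a minimal sketch, assuming a float column array like the one in the sample input; the names slope and caught are illustrative only. Because f returns the whole array, (fxh - fx) / h is itself an array, and NumPy refuses to store a multi-element array in the single slot grad[ix].

```python
import numpy as np

def f(a):
    return a  # returns an array, not a single number

x = np.array([[1.0], [2.0], [3.0]])
h = 0.00001

fx = f(x)
slope = (f(x) - fx) / h   # array-valued, same shape as x
grad = np.zeros(x.shape)

try:
    grad[0, 0] = slope    # one slot on the left, three values on the right
    caught = False
except ValueError as err:
    caught = True
    print(err)

print(slope.shape)
```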

My question: are the numpy arrays I pass in (row and column) wrong? Can someone explain why this happens?

Code and sample input:

import numpy as np

def eval_numerical_gradient(f, x):
  """ 
  a naive implementation of numerical gradient of f at x 
  - f should be a function that takes a single argument
  - x is the point (numpy array) to evaluate the gradient at
  """ 

  fx = f(x) # evaluate function value at original point
  print(x)
  print(fx)
  grad = np.zeros(x.shape)
  h = 0.00001

  # iterate over all indexes in x
  it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
  while not it.finished:
    print(it)
    # evaluate function at x+h
    ix = it.multi_index
    print(ix)
    old_value = x[ix]
    print(old_value)
    x[ix] = old_value + h # increment by h
    print(x)
    fxh = f(x) # evaluate f(x + h)
    print(fxh)
    x[ix] = old_value # restore to previous value (very important!)
    print(x[ix])
    # compute the partial derivative
    grad[ix] = (fxh - fx) / h # the slope
    it.iternext() # step to next dimension

  return grad

def f(a):
    return a

eval_numerical_gradient(f,np.array([[1],[2],[3]]))

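1 answer:

Answer 0 (score: 2)

I suggest the following fixes to eval_numerical_gradient(f, x):

  • Line 25 (numbered as in the traceback): replace fxh = f(x) with fxh = f(x[ix])
  • Line 30: replace grad[ix] = (fxh - fx) / h with grad[ix] = (fxh - fx[ix]) / h

and pass the input matrix with floating-point entries, e.g.,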

eval_numerical_gradient(f,np.array([[1],[2],[3]], dtype=float))
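
The two changes can be put together as follows. This is only a sketch under some assumptions: the name eval_elementwise_slope is hypothetical, the debug prints are dropped, and f(x) is copied into a float array so that fx cannot alias x (the copy is an addition, not part of the suggestion above). Note that with these changes the function computes the slope of f at each entry separately, which fits a function such as f(a) = a that acts elementwise.

```python
import numpy as np

def eval_elementwise_slope(f, x, h=0.00001):
    """Forward-difference slope of an elementwise f at each entry of x.

    Hypothetical name; the loop is the one from the question with the
    two suggested changes applied.
    """
    fx = np.array(f(x), dtype=float)  # copy, so writes to x cannot alias fx
    grad = np.zeros(x.shape)

    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index
        old_value = x[ix]
        x[ix] = old_value + h          # increment one entry by h
        fxh = f(x[ix])                 # change 1: evaluate f on that entry only
        x[ix] = old_value              # restore the entry
        grad[ix] = (fxh - fx[ix]) / h  # change 2: compare with the matching entry
        it.iternext()
    return grad

def f(a):
    return a

float_grad = eval_elementwise_slope(f, np.array([[1], [2], [3]], dtype=float))
int_grad = eval_elementwise_slope(f, np.array([[1], [2], [3]]))

print(float_grad)  # each entry close to 1
print(int_grad)    # all zeros, see the note below
```

The integer input shows why float entries matter: in an integer array, x[ix] = old_value + h is truncated back to the original integer, so fxh equals fx[ix] and every slope comes out as 0. If f returns a single number, as the loss functions in the course do, the original function should work without either change, provided the input array is float.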