Question

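I have a 3x2000 numpy array in x_data and a 1x2000 numpy array in y_data, which I pass to this function, regress, to get a regression line. It works fine. The problem is that I'm trying to do some backtesting, and to test 1000 situations I have to regress 1000 times, which takes about 5 minutes to run.

I tried standardizing the variables, but that didn't seem to make it any faster.

I also briefly tried fmin_powell and fmin_bfgs, which seemed to break it.

Any ideas? Thanks!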

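import numpy as np
import scipy.optimize

def regress(x_data, y_data, fg_spread, fg_line):

    # Initial guess: one coefficient per row of x_data, all set to 0.11
    theta = np.matrix(np.ones((1,x_data.shape[0]))*.11)
    # Logistic hypothesis; theta*x relies on np.matrix multiplication
    hyp = lambda theta, x: 1 / (1 + np.exp(-(theta*x)))
    # Logistic cost summed over all samples
    cost_hyp = lambda theta, x, y: ((np.multiply(-y,np.log10(hyp(theta,x)))) - \
                            (np.multiply((1-y),(np.log10(1-hyp(theta, x)))))).sum()

    # Minimize the cost with fmin (Nelder-Mead simplex)
    theta = scipy.optimize.fmin(cost_hyp, theta, args=(x_data,y_data), xtol=.00001, disp=0)

    # Predicted probability for the new point [1, fg_spread, fg_line]
    return hyp(np.matrix(theta),np.matrix([1,fg_spread, fg_line]).reshape(3,1))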

Answer 1

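Use numexpr to make your hyp and cost_hyp computations evaluate faster. The fmin family of functions evaluates these functions many times with different inputs, so any speedup in them carries over directly to the minimization.

For example, with numexpr you would replace: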
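To get a feel for how often fmin calls the cost function, here is a minimal sketch that asks fmin for its full output and prints the evaluation count. The data is synthetic and the names `make_data` and `cost` are made up for illustration. It also uses plain 1-D/2-D arrays with `@` and the natural log rather than np.matrix and log10, so it is a stand-in for the question's setup, not a copy of it.

```python
import numpy as np
import scipy.optimize

def make_data(n=200, seed=0):
    """Synthetic stand-in data: x is 3 x n with a bias row, y holds 0/1 labels."""
    rng = np.random.default_rng(seed)
    x = np.vstack([np.ones(n), rng.normal(size=(2, n))])
    true_theta = np.array([0.2, 1.0, -0.5])  # arbitrary, for illustration only
    p = 1.0 / (1.0 + np.exp(-(true_theta @ x)))
    y = (rng.random(n) < p).astype(float)
    return x, y

def cost(theta, x, y):
    """Logistic cost (natural log; log10 differs only by a constant factor)."""
    h = 1.0 / (1.0 + np.exp(-(theta @ x)))
    h = np.clip(h, 1e-12, 1 - 1e-12)  # keep log() away from 0
    return -(y * np.log(h) + (1 - y) * np.log(1 - h)).sum()

x, y = make_data()
theta0 = np.full(3, 0.11)

# full_output=True makes fmin return (xopt, fopt, iterations, funcalls, warnflag)
theta_opt, f_opt, n_iter, n_calls, warnflag = scipy.optimize.fmin(
    cost, theta0, args=(x, y), xtol=1e-5, disp=0, full_output=True
)
print("iterations:", n_iter, "cost evaluations:", n_calls)
```

Multiplying the printed evaluation count by the 1000 regressions in the backtest gives a rough idea of how many times the cost function runs in total.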

hyp = lambda theta, x: 1 / (1 + np.exp(-(theta*x)))

with:

hyp = lambda theta, x: numexpr.evaluate("1 / (1 + exp(-(theta*x)))")

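Numexpr is designed to work with numpy arrays. Note that numexpr evaluates `*` elementwise, so if theta*x relies on np.matrix multiplication, compute that product with numpy first and pass only the result into the numexpr expression.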
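As a rough sketch of how this could look for both functions, the version below does the matrix product with numpy and hands only the elementwise part to numexpr. It assumes plain ndarrays (theta of shape (3,), x of shape (3, n), y of shape (n,)) rather than np.matrix, and the fallback to numpy when numexpr is not installed is an addition for convenience, not part of the original answer.

```python
import numpy as np

try:
    import numexpr
except ImportError:  # assumption: fall back to plain numpy if numexpr is missing
    numexpr = None

def hyp(theta, x):
    # The matrix product is done by numpy; numexpr only sees the 1-D result z.
    z = np.dot(theta, x)
    if numexpr is None:
        return 1.0 / (1.0 + np.exp(-z))
    return numexpr.evaluate("1 / (1 + exp(-z))")

def cost_hyp(theta, x, y):
    h = hyp(theta, x)
    if numexpr is None:
        return float((-y * np.log10(h) - (1 - y) * np.log10(1 - h)).sum())
    # numexpr picks up y and h from the local variables of this function
    return float(numexpr.evaluate("sum(-y * log10(h) - (1 - y) * log10(1 - h))"))
```

These can be passed to scipy.optimize.fmin in place of the lambdas. Whether this is actually faster for arrays of only 2000 columns is worth timing, since numexpr has some per-call overhead.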

Python fmin is too slow
