Question

I am trying to obtain the function expected_W, or H, which is the result of an integration:

$H(p, \theta_0, \theta_1) = \int_{-\infty}^\infty \int_{-\infty}^\infty w(p, \theta, \epsilon, \beta) f(\beta | \theta) q(\epsilon) \; d \beta \; d \epsilon$

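where:

theta is a vector with two elements: theta_0 and theta_1
f(beta | theta) is a normal density for beta with mean theta_0 and variance theta_1
q(epsilon) is a normal density for epsilon with mean zero and variance sigma_epsilon (set to 1 by default).
w(p, theta, eps, beta) is a function that I take as an input, so I cannot predict exactly what it looks like. It will probably be non-linear, but not especially nasty.

This is how I implemented the problem. I'm sure the wrapper functions I wrote are messy, so I'm happy to receive any help on that too.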

from __future__ import division
from scipy import integrate
from scipy.stats import norm
import math
import numpy as np


def exp_w(w_B, sigma_eps = 1, **kwargs):
    '''
    Integrates the w_B function

    Input:
    + w_B : the function to be integrated. 
    + sigma_eps : variance of the epsilon term. Set to 1 by default
    '''

    #The integrand function gives everything under the integral:
    # w(B(p, \theta, \epsilon, \beta)) f(\beta | \theta ) q(\epsilon)
    def integrand(eps, beta, p, theta_0, theta_1, sigma_eps=sigma_eps):
        q_e = norm.pdf(eps, loc=0, scale=math.sqrt(sigma_eps))
        f_beta = norm.pdf(beta, loc=theta_0, scale=math.sqrt(theta_1))

        return w_B(p = p, 
                   theta_0 = theta_0, theta_1 = theta_1,
                   eps = eps, beta=beta)* q_e *f_beta

    #limits of integration. Using limited support for now.
    eps_inf = lambda beta : -10 # otherwise: -np.inf
    eps_sup = lambda beta : 10  # otherwise: np.inf
    beta_inf = -10
    beta_sup = 10

    def integrated_f(p, theta_0, theta_1):
        return integrate.dblquad(integrand, beta_inf, beta_sup,
            eps_inf, eps_sup,
            args = (p, theta_0, theta_1))
    # this integrated_f is the H referenced at the top of the question
    return integrated_f

I tested this function with a simple w function for which I know the analytic solution (this won't usually be the case).

def test_exp_w():
    def w_B(p, theta_0, theta_1, eps, beta):
        return 3*(p*eps + p*(theta_0 + theta_1) - beta)

    # Function that I get
    integrated = exp_w(w_B, sigma_eps = 1)

    # Function that I should get
    def exp_result(p, theta_0, theta_1):
        return 3*p*(theta_0 + theta_1) - 3*theta_0

    args = np.random.rand(3)
    d_args = {'p' : args[0], 'theta_0' : args[1], 'theta_1' : args[2]}

    if not (np.allclose(
    integrated(**d_args)[0], exp_result(**d_args)) ):
        raise Exception("Integration procedure isn't working!")

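So my implementation seems to be working, but it is very slow for my purposes. I need to repeat this process tens or hundreds of thousands of times (this is one step in a value function iteration; I can give more information if people think it is relevant).

With scipy version 0.14.0 and numpy version 1.8.1, this integral takes 15 seconds to compute.

Does anyone have suggestions on how to go about this? To start with, it would probably help to get a bounded domain of integration, but I haven't figured out how to do that, or whether the Gaussian quadrature in SciPy handles it in a good way (does it use Gauss-Hermite?).

Thanks for your time.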

---- Edit: adding profiling times -----

The %lprun results show that most of the time is spent in _distn_infrastructure.py:1529(pdf) and _continuous_distns.py:97(_norm_pdf), each of which is called 83244 times.

Answer 1

The time it takes to integrate your function sounds very long if the function is not a nasty one.

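The first thing I suggest is profiling to see where the time is spent. Is it spent in dblquad or elsewhere? How many calls are made to w_B during the integration? If the time is spent in dblquad and the number of calls is very high, could you use looser tolerances in the integration?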
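A minimal sketch of that check, assuming the linear test function from the question and fixed example arguments (p = 0.5, theta_0 = 0.3, theta_1 = 0.8, sigma_eps = 1). The `call_counter` dictionary, the `normal_pdf` helper and `integrate_and_count` are hypothetical names introduced here for illustration. The hand-written density stands in for `norm.pdf` only to keep per-call overhead low, since the profile in the question points at the pdf calls. `epsabs` and `epsrel` are the tolerance arguments of `dblquad`.

```python
import math
from scipy import integrate

# Hypothetical counter used only to see how often w_B is evaluated.
call_counter = {"w_B": 0}

def w_B(p, theta_0, theta_1, eps, beta):
    call_counter["w_B"] += 1
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)

def normal_pdf(x, mean, variance):
    # Closed-form normal density for scalar inputs.
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)

def integrand(eps, beta, p, theta_0, theta_1):
    # sigma_eps is fixed at 1 here to keep the sketch short.
    return (w_B(p, theta_0, theta_1, eps, beta)
            * normal_pdf(eps, 0.0, 1.0)
            * normal_pdf(beta, theta_0, theta_1))

def integrate_and_count(epsabs, epsrel, args=(0.5, 0.3, 0.8)):
    # Reset the counter, integrate, and report (value, number of w_B calls).
    call_counter["w_B"] = 0
    value, _ = integrate.dblquad(integrand, -10, 10,
                                 lambda beta: -10, lambda beta: 10,
                                 args=args, epsabs=epsabs, epsrel=epsrel)
    return value, call_counter["w_B"]

default_value, default_calls = integrate_and_count(1.49e-8, 1.49e-8)
loose_value, loose_calls = integrate_and_count(1e-4, 1e-4)
print(default_calls, loose_calls)
```

Comparing the two call counts and the two values shows how much accuracy is traded for speed; whether the looser result is acceptable depends on what the outer iteration needs.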

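Multiplying by the Gaussians should actually let you tighten the integration limits considerably, because most of the mass of a Gaussian lies within a small range. You might want to try computing reasonable, tighter bounds. You have already limited the region to -10..10; is there any significant change in performance between -100..100, -10..10 and -1..1?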
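One way to pick such bounds, sketched under the assumption that w does not grow fast enough in the tails to outweigh the Gaussian decay: cover each variable by its mean plus or minus a few standard deviations. The helper names `gaussian_bounds` and `tail_mass`, the choice of 6 standard deviations, and the example parameter values are illustrative, not taken from the question.

```python
import math

def gaussian_bounds(mean, variance, n_std=6.0):
    """Return (low, high) covering mean +/- n_std standard deviations."""
    std = math.sqrt(variance)
    return mean - n_std * std, mean + n_std * std

def tail_mass(n_std):
    """Probability mass of a normal distribution outside +/- n_std std devs."""
    return math.erfc(n_std / math.sqrt(2.0))

# Example values only; in the question these come from theta and sigma_eps.
theta_0, theta_1, sigma_eps = 0.3, 0.8, 1.0
beta_lo, beta_hi = gaussian_bounds(theta_0, theta_1)
eps_lo, eps_hi = gaussian_bounds(0.0, sigma_eps)
print(beta_lo, beta_hi, eps_lo, eps_hi, tail_mass(6.0))
```

These limits could then replace the fixed -10 and 10 passed to dblquad, so that the beta range follows theta_0 and theta_1 instead of being hard-coded.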

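If you know your function is relatively smooth, there is a Mickey Mouse version of the integration:

Determine reasonable upper and lower limits for both axes (from the Gaussians)
Work out a reasonable grid density (e.g. 100 points in each direction)
Compute w_B at each point (this will be much faster if a vectorized version of w_B is available)
Sum the values, weighted by the two Gaussian densities and the grid cell area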
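The steps above can be sketched as follows, assuming w_B accepts NumPy arrays for eps and beta (the linear test function from the question does). The function names `normal_pdf` and `grid_expectation`, the 6 standard deviation limits and the 100-point grid are illustrative choices.

```python
import numpy as np

def normal_pdf(x, mean, variance):
    # Vectorized closed-form normal density.
    return np.exp(-0.5 * (x - mean) ** 2 / variance) / np.sqrt(2.0 * np.pi * variance)

def grid_expectation(w_B, p, theta_0, theta_1, sigma_eps=1.0,
                     n_points=100, n_std=6.0):
    # 1. Limits taken from the Gaussians: mean +/- n_std standard deviations.
    eps = np.linspace(-n_std * np.sqrt(sigma_eps), n_std * np.sqrt(sigma_eps), n_points)
    beta = np.linspace(theta_0 - n_std * np.sqrt(theta_1),
                       theta_0 + n_std * np.sqrt(theta_1), n_points)
    # 2. Regular grid; E varies along axis 0, B along axis 1.
    E, B = np.meshgrid(eps, beta, indexing="ij")
    # 3. Evaluate w_B and the density weights on the whole grid at once.
    values = w_B(p=p, theta_0=theta_0, theta_1=theta_1, eps=E, beta=B)
    weights = normal_pdf(E, 0.0, sigma_eps) * normal_pdf(B, theta_0, theta_1)
    # 4. Sum, scaled by the area of one grid cell.
    cell_area = (eps[1] - eps[0]) * (beta[1] - beta[0])
    return np.sum(values * weights) * cell_area

def w_B(p, theta_0, theta_1, eps, beta):
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)

result = grid_expectation(w_B, p=0.5, theta_0=0.3, theta_1=0.8)
print(result)
```

For this test function the analytic value is 3*p*(theta_0 + theta_1) - 3*theta_0, so the result can be checked directly; for a less smooth w the grid density would need to be tested.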

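This is very low-tech but also very fast. Whether it gives you results that are good enough for the outer iteration is an interesting question. It just might.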
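On the Gauss-Hermite point raised in the question: as far as I know, dblquad relies on adaptive QUADPACK routines rather than Gauss-Hermite quadrature. A Gauss-Hermite rule can be built by hand from `numpy.polynomial.hermite.hermgauss`, which returns nodes and weights for the weight function exp(-x^2). The sketch below is a different technique from the grid sum, shown under the assumptions that w_B is smooth and accepts arrays; `gauss_hermite_expectation` and the choice of 20 nodes are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_expectation(w_B, p, theta_0, theta_1, sigma_eps=1.0, n_nodes=20):
    # Nodes and weights for integrals of the form int g(x) exp(-x^2) dx.
    nodes, weights = hermgauss(n_nodes)
    # Change of variables x -> mean + sqrt(2 * variance) * x for each normal.
    eps = np.sqrt(2.0 * sigma_eps) * nodes
    beta = theta_0 + np.sqrt(2.0 * theta_1) * nodes
    E, B = np.meshgrid(eps, beta, indexing="ij")
    # Product rule in two dimensions; each dimension contributes 1/sqrt(pi).
    W = np.outer(weights, weights) / np.pi
    values = w_B(p=p, theta_0=theta_0, theta_1=theta_1, eps=E, beta=B)
    return np.sum(W * values)

def w_B(p, theta_0, theta_1, eps, beta):
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)

gh_result = gauss_hermite_expectation(w_B, p=0.5, theta_0=0.3, theta_1=0.8)
print(gh_result)
```

This needs only n_nodes squared evaluations of w_B and no explicit integration limits, but its accuracy depends on w being well approximated by a polynomial under the Gaussian weight.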

Slow scipy double quadrature integration
