Question

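I am trying to compute the Kullback-Leibler divergence between two probability distributions. To do that, I need to evaluate the integral of f(x) * log(f(x) / g(x)) over x > 0.

Here is a simplified version of my code, which currently does not work: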

from scipy.integrate import quad
import scipy.stats  # needed for scipy.stats.weibull_min below
import numpy as np

def f(x):
    return sum([ps[idx]*lambdas[idx]*np.exp(- lambdas[idx] * x) for idx in range(len(ps))])
def g(x):
    return scipy.stats.weibull_min.pdf(x, c=c)
c = 0.9
ps = [1]
lambdas = [1]
eps = 0.001  # weibull_min is only defined for x > 0
print(quad(lambda x: f(x) * np.log(f(x) / g(x)), eps, np.inf)) # Output should be greater than 0

This gives:

(nan, nan)
/home/user/.local/lib/python3.5/site-packages/ipykernel_launcher.py:11: RuntimeWarning: divide by zero encountered in log
  # This is added back by InteractiveShellApp.init_path()
/home/user/.local/lib/python3.5/site-packages/ipykernel_launcher.py:11: RuntimeWarning: invalid value encountered in double_scalars
  # This is added back by InteractiveShellApp.init_path()
/home/user/.local/lib/python3.5/site-packages/ipykernel_launcher.py:11: IntegrationWarning: The occurrence of roundoff error is detected, which prevents 
  the requested tolerance from being achieved.  The error may be 
  underestimated.
  # This is added back by InteractiveShellApp.init_path()

Why does it fail, and how can I make it work?

Answer 1

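The problem is that f(x)/g(x) tends to zero as x grows, which causes numerical errors: once f(x) underflows to 0, log(f(x)/g(x)) evaluates to -inf and the product 0 * (-inf) is nan. Since the integrand as a whole tends to zero rapidly, you can simply integrate over a finite range (for example [0.001, 30]) and still estimate the integral accurately: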

from scipy.stats import weibull_min
from scipy.integrate import quad
import numpy as np

c = 0.9
ps = [1]
lambdas = [1]

def f(x):
    # Mixture of exponential densities with weights ps and rates lambdas
    return sum(p * lam * np.exp(-lam * x) for p, lam in zip(ps, lambdas))

def g(x):
    # Weibull density with shape parameter c
    return weibull_min.pdf(x, c=c)

# Integrate over a finite range; the integrand is negligible beyond x = 30
print(quad(lambda x: f(x) * np.log(f(x) / g(x)), 0.001, 30))
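
To see where the nan comes from, it may help to evaluate the pieces of the integrand at a single large x. The sketch below uses x = 1000, a value I picked for illustration (it is not from the question), with the same c = 0.9 and f(x) = exp(-x):

```python
import numpy as np
from scipy.stats import weibull_min

c = 0.9
x = 1000.0  # illustrative point far out in the tail (an assumption)

# Silence the warnings so the values themselves are easy to inspect
with np.errstate(divide="ignore", invalid="ignore"):
    fx = np.exp(-x)                # underflows to exactly 0.0
    gx = weibull_min.pdf(x, c=c)   # tiny, but still a positive double
    log_ratio = np.log(fx / gx)    # log(0) gives -inf
    integrand = fx * log_ratio     # 0 * (-inf) gives nan

print(fx, gx, log_ratio, integrand)
```

When quad integrates up to infinity it samples points this far out, so a single nan is enough to spoil the whole result.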

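I have not done a numerical analysis of the precision, but compared with the Mathematica result it agrees to 9 decimal places. Here is the test code in Mathematica (simplified for your parameters):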

f[x_] := Exp[-x];
c = 0.9;
g[x_] := c*x^(c - 1)*Exp[-x^c];
SetPrecision[Integrate[f[x]*Log[f[x]/g[x]], {x, 0.001, \[Infinity]}],20]

Mathematica result: 0.010089328699390866240
SciPy result: 0.01008932870010536
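
If you would rather keep the upper limit at infinity, one possible alternative (a sketch, not something I have analysed carefully) is to work with log-densities, so that log(f(x)/g(x)) is computed as log f(x) - log g(x) and never as the log of an underflowed ratio. The helper names log_f, log_g and kl_integrand are my own, and the use of scipy.special.logsumexp for the mixture is an assumption about how you might want to generalise to several components:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import logsumexp
from scipy.stats import weibull_min

c = 0.9
ps = np.array([1.0])
lambdas = np.array([1.0])

def log_f(x):
    # log of sum_i ps[i] * lambdas[i] * exp(-lambdas[i] * x), computed stably
    return logsumexp(-lambdas * x, b=ps * lambdas)

def log_g(x):
    # log-density of the Weibull distribution with shape c
    return weibull_min.logpdf(x, c=c)

def kl_integrand(x):
    lf = log_f(x)
    # exp(lf) may underflow to 0, but the log difference stays finite,
    # so the product is 0 instead of nan
    return np.exp(lf) * (lf - log_g(x))

kl, abs_err = quad(kl_integrand, 0.001, np.inf)
print(kl, abs_err)
```

With the parameters from the question this should land close to the values quoted above, but I would still check it against the finite-range result before relying on it.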
