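I am trying to compute the Kullback-Leibler divergence between two probability distributions. To do that, I need to evaluate the integral of f(x) * log(f(x) / g(x)).
Here is a simplified version of my code, which currently does not work: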
from scipy.integrate import quad
import scipy.stats
import numpy as np
def f(x):
    # Mixture of exponential densities with weights ps and rates lambdas
    return sum([ps[idx]*lambdas[idx]*np.exp(- lambdas[idx] * x) for idx in range(len(ps))])
def g(x):
    return scipy.stats.weibull_min.pdf(x, c=c)
c = 0.9
ps = [1]
lambdas = [1]
eps = 0.001 # weibull_min is only defined for x > 0
print(quad(lambda x: f(x) * np.log(f(x) / g(x)), eps, np.inf)) # Output should be greater than 0
This gives:
(nan, nan)
/home/user/.local/lib/python3.5/site-packages/ipykernel_launcher.py:11: RuntimeWarning: divide by zero encountered in log
# This is added back by InteractiveShellApp.init_path()
/home/user/.local/lib/python3.5/site-packages/ipykernel_launcher.py:11: RuntimeWarning: invalid value encountered in double_scalars
# This is added back by InteractiveShellApp.init_path()
/home/user/.local/lib/python3.5/site-packages/ipykernel_launcher.py:11: IntegrationWarning: The occurrence of roundoff error is detected, which prevents
the requested tolerance from being achieved. The error may be
underestimated.
# This is added back by InteractiveShellApp.init_path()
Why does it not work, and how can I make it work?
Answer 0 (score: 1)
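The problem is that f(x)/g(x) tends to zero as x grows, which leads to numerical errors: once f(x) underflows to 0, np.log(f(x) / g(x)) evaluates to -inf, and the product f(x) * (-inf) is nan. Since the whole integrand tends to zero quickly, you can simply integrate over a finite range (e.g. [0.001, 30]) and still get an accurate estimate of the integral: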
from scipy.stats import weibull_min
from scipy.integrate import quad
import numpy as np
c = 0.9
ps = [1]
lambdas = [1]
def f(x):
    # Mixture of exponential densities with weights ps and rates lambdas
    return sum([ps[idx]*lambdas[idx]*np.exp(- lambdas[idx] * x) for idx in range(len(ps))])
def g(x):
    return weibull_min.pdf(x, c=c)
# Integrate over a finite range, since the integrand is negligible beyond the upper limit
print(quad(lambda x: f(x) * np.log(f(x) / g(x)), 0.001, 30))
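The failure in the tail can be reproduced by evaluating the pieces of the integrand at a single large x. This is only an illustrative sketch: the point x = 800 is my own choice, and f is written out as exp(-x), which is what the mixture reduces to for ps = [1] and lambdas = [1].

```python
import numpy as np
from scipy.stats import weibull_min

c = 0.9
x = 800.0  # illustrative point in the tail (an assumption, not from the question)

# Silence the warnings here so the values can be inspected directly
with np.errstate(divide="ignore", invalid="ignore"):
    fx = np.exp(-x)                # f(x) for ps = [1], lambdas = [1]; underflows to 0.0
    gx = weibull_min.pdf(x, c=c)   # tiny, but still representable as a float
    log_ratio = np.log(fx / gx)    # log(0) gives -inf ("divide by zero encountered in log")
    integrand = fx * log_ratio     # 0 * -inf gives nan ("invalid value encountered")

print(fx, gx, log_ratio, integrand)
```

As soon as quad samples a point like this, the nan propagates into the result, which matches the two RuntimeWarnings in the question.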
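I did not do a numerical analysis of the precision, but compared with the result from Mathematica, the finite-range result is accurate to 9 decimal places. Here is the test code in Mathematica (simplified for your parameters):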
f[x_] := Exp[-x];
c = 0.9;
g[x_] := c*x^(c - 1)*Exp[-x^c];
SetPrecision[Integrate[f[x]*Log[f[x]/g[x]], {x, 0.001, \[Infinity]}],20]
Mathematica result: 0.010089328699390866240
Python result: 0.01008932870010536
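If you would rather keep the infinite upper limit, one option is to work with log-densities, so that log(f(x) / g(x)) is computed as log f(x) - log g(x) and never passes through a ratio that underflows. The following is a minimal sketch of that idea, not a tested recipe: the helper names log_f, log_g and integrand are my own, and I am assuming scipy.special.logsumexp and weibull_min.logpdf are acceptable substitutes for the direct formulas.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import logsumexp
from scipy.stats import weibull_min

c = 0.9
ps = np.array([1.0])
lambdas = np.array([1.0])
eps = 0.001  # same lower limit as in the question

def log_f(x):
    # log( sum_i ps[i] * lambdas[i] * exp(-lambdas[i] * x) ), computed stably
    return logsumexp(-lambdas * x, b=ps * lambdas)

def log_g(x):
    # log of the Weibull density, finite for every finite x > 0
    return weibull_min.logpdf(x, c=c)

def integrand(x):
    lf = log_f(x)
    # exp(lf) may underflow to 0, but (lf - log_g(x)) stays finite,
    # so the product is 0 instead of nan
    return np.exp(lf) * (lf - log_g(x))

kl, abserr = quad(integrand, eps, np.inf)
print(kl, abserr)
```

With the parameters above, I would expect this to land close to the Mathematica value quoted earlier, since both use the same limits.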