Subclassing rv_continuous

Time: 2018-06-19 15:14:49

Tags: python-3.x

import scipy.stats as st
import numpy as np  # generic math functions


# https://scicomp.stackexchange.com/q/1658
class LorentzGen(st.rv_continuous):
    """Lorentz distribution"""

    def _pdf(self, x):
        gamma = 0.27
        return 2 * gamma / (np.pi * (gamma ** 2 + x ** 2))


transverse_fields = LorentzGen(a=0)
gaussian_gen = st.norm()

L = 2

list_of_temps = np.linspace(1, 10, 40)

for T, temp in enumerate(list_of_temps):
    print(f"Run {T}")
    for t in range(5000):
        if t % 500 == 0:
            print(f"Trial {t}")
        h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # OverflowError: (34, 'Result too large')
        # h_y = [[-gaussian_gen.rvs(), xx] for xx in range(L)]  # Works

In the code above, I implemented my own probability distribution (actually a half-Lorentzian, x ∈ [0, ∞)), adapted from an answer on scicomp.SE. I call it transverse_fields.
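
For reference, the density coded in _pdf and the closed forms that follow from it by direct integration can be written out as below, with γ = 0.27 as in the code. The symbols F and F^{-1} (CDF and quantile function) are notation introduced here, not taken from the original post.

```latex
\begin{aligned}
f(x) &= \frac{2\gamma}{\pi\,(\gamma^{2} + x^{2})}, \qquad x \ge 0,\\
F(x) &= \int_{0}^{x} f(t)\,dt = \frac{2}{\pi}\arctan\!\left(\frac{x}{\gamma}\right),\\
F^{-1}(q) &= \gamma \tan\!\left(\frac{\pi q}{2}\right), \qquad 0 \le q < 1.
\end{aligned}
```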

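I need to generate a bunch of values from transverse_fields and use them inside nested for loops. The problem is that beyond a certain number of iterations (here around "Run 1, Trial ~3500"), I get a flood of errors: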

C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py:385: IntegrationWarning: The integral is probably divergent, or slowly convergent.
  warnings.warn(msg, IntegrationWarning)
C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py:2831: RuntimeWarning: overflow encountered in ? (vectorized)
  outputs = ufunc(*inputs)
Traceback (most recent call last):
  File "C:/<redacted>/stackoverflow.py", line 26, in <module>
    h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # Result Overflow
  File "C:/<redacted>/stackoverflow.py", line 26, in <listcomp>
    h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # Result Overflow
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 954, in rvs
    vals = self._rvs(*args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 889, in _rvs
    Y = self._ppf(U, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 902, in _ppf
    return self._ppfvec(q, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2755, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2831, in _vectorize_call
    outputs = ufunc(*inputs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1587, in _ppf_single
    while self._ppf_to_solve(right, q, *args) < 0.:
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1569, in _ppf_to_solve
    return self.cdf(*(x, )+args)-q
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1745, in cdf
    place(output, cond, self._cdf(*goodargs))
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1621, in _cdf
    return self._cdfvec(x, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2755, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2831, in _vectorize_call
    outputs = ufunc(*inputs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1618, in _cdf_single
    return integrate.quad(self._pdf, self.a, x, args=args)[0]
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 341, in quad
    points)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 448, in _quad
    return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
  File "C:/<redacted>/stackoverflow.py", line 11, in _pdf
    return 2 * gamma / (np.pi * (gamma ** 2 + x ** 2))
OverflowError: (34, 'Result too large')

Process finished with exit code 1

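Note that the error does not occur if I reduce the number of trials (here t) to something smaller, such as 50, nor if list_of_temps is shorter, e.g. np.linspace(1, 10, 4). With the original np.linspace(1, 10, 40), however, the error already appears during Run 1.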

With the original settings, there is also no overflow error when I use the standard Gaussian distribution from scipy.stats.

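Similar issues on SO that I have seen attribute this to the for loop running over too large a range. But I don't see how that applies here. In any case, I don't really understand how to use decimal as suggested in the answers to the linked questions.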

How can I fix this?

I am running Python 3.6.5 with Anaconda on 64-bit Windows 10.

1 Answer:

Answer 0: (score: 0)

There seems to be something wrong with the way I declared the probability distribution with transverse_fields / LorentzGen.

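My solution was to use the built-in cauchy distribution from scipy.stats and change its scale. Also, since I want a half-Lorentzian, I take the absolute value with np.abs(...) when drawing random numbers from transverse_fields.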

import scipy.stats as st
import numpy as np  # generic math functions

transverse_fields = st.cauchy(scale=0.27)
L = 2

list_of_temps = np.linspace(1, 10, 40)

for T, temp in enumerate(list_of_temps):
    print(f"Run {T}")
    for t in range(5000):
        if t % 500 == 0:
            print(f"Trial {t}")
        h_x = [[-np.abs(transverse_fields.rvs()), xx] for xx in range(L)]  # Now works
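
As a possible variation on the workaround above, the L values can be drawn in one call with size=L instead of one rvs() call per site. scipy.stats also provides halfcauchy, whose density with scale=γ equals the half-Lorentzian 2γ / (π(γ² + x²)) on x ≥ 0. This is a minimal sketch: the names GAMMA, h_x_folded, h_x_half and pdf_matches are illustrative and do not appear in the original post.

```python
import numpy as np
import scipy.stats as st

GAMMA = 0.27  # half-width used in the question (name chosen here)
L = 2

# Option 1: fold a Cauchy sample onto [0, inf) with np.abs,
# drawing all L values in a single call.
cauchy = st.cauchy(scale=GAMMA)
folded = np.abs(cauchy.rvs(size=L, random_state=0))
h_x_folded = [[-value, site] for site, value in enumerate(folded)]

# Option 2: sample the half-Cauchy distribution directly.
half = st.halfcauchy(scale=GAMMA)
h_x_half = [[-value, site] for site, value in enumerate(half.rvs(size=L, random_state=0))]

# Sanity check: halfcauchy(scale=GAMMA) has the same density as the
# hand-written _pdf from the question.
x = np.linspace(0.0, 5.0, 11)
expected_pdf = 2 * GAMMA / (np.pi * (GAMMA ** 2 + x ** 2))
pdf_matches = np.allclose(half.pdf(x), expected_pdf)
```

The random_state=0 argument is only there to make the sketch reproducible. Drop it in a real simulation so that each trial gets fresh draws.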

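This is satisfactory enough for me for now, but I would still appreciate it if someone could explain why my approach of subclassing rv_continuous gave me the error above.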
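
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: because LorentzGen defines only _pdf, rvs() falls back to a generic numerical inverse (_ppf_single) of a CDF that is itself computed with integrate.quad. The loop `while self._ppf_to_solve(right, q, *args) < 0.` keeps enlarging the upper bracket. If the numerical integral over a very wide interval comes out below q, as the IntegrationWarning hints, the bracket can grow until x ** 2 on a plain Python float raises OverflowError. The sketch below first reproduces that last step in isolation. It then shows a subclass that supplies closed-form _cdf and _ppf, so that sampling needs no numerical integration or root bracketing. The names GAMMA, question_pdf and HalfLorentzGen are hypothetical and chosen for this illustration.

```python
import numpy as np
import scipy.stats as st

GAMMA = 0.27  # half-width hard-coded in the question's _pdf


def question_pdf(x):
    """Same expression as the _pdf in the question."""
    return 2 * GAMMA / (np.pi * (GAMMA ** 2 + x ** 2))


# With a plain Python float, x ** 2 raises OverflowError once the result
# no longer fits in a double, instead of quietly returning inf.
try:
    question_pdf(1e200)
    overflowed = False
except OverflowError:
    overflowed = True


class HalfLorentzGen(st.rv_continuous):
    """Half-Lorentzian on [0, inf) with closed-form CDF and quantile function."""

    def _pdf(self, x):
        return 2 * GAMMA / (np.pi * (GAMMA ** 2 + x ** 2))

    def _cdf(self, x):
        # Integral of _pdf from 0 to x.
        return 2 / np.pi * np.arctan(x / GAMMA)

    def _ppf(self, q):
        # Inverse of _cdf; the default _rvs applies this to uniform draws.
        return GAMMA * np.tan(np.pi * q / 2)


half_lorentz = HalfLorentzGen(a=0, name="half_lorentz")
samples = half_lorentz.rvs(size=200_000, random_state=0)
```

Under this reading, the built-in distributions would avoid the problem because they ship their own closed-form implementations of these methods, which would be consistent with st.norm and st.cauchy working in the original loop.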