如何在python中计算正态分布百分点函数

时间:2016-12-27 03:08:41

标签: python numpy math scipy statistics

如何在不使用Scipy的情况下执行相当于scipy.stats.norm.ppf的操作。我有python的Math模块内置erf但我似乎无法重新创建该函数。

PS:我不能只使用scipy,因为Heroku不允许你安装它,并且使用备用buildpacks会破坏300Mb的最大段塞大小限制。

2 个答案:

答案 0 :(得分:6)

使用erf来实现norm.ppf并不是一种简单的方法,因为norm.ppferf相关。相反,这里是scipy代码的纯Python实现。您应该会发现函数ndtri返回与norm.ppf完全相同的值:

import math

s2pi = 2.50662827463100050242E0

P0 = [
    -5.99633501014107895267E1,
    9.80010754185999661536E1,
    -5.66762857469070293439E1,
    1.39312609387279679503E1,
    -1.23916583867381258016E0,
]

Q0 = [
    1,
    1.95448858338141759834E0,
    4.67627912898881538453E0,
    8.63602421390890590575E1,
    -2.25462687854119370527E2,
    2.00260212380060660359E2,
    -8.20372256168333339912E1,
    1.59056225126211695515E1,
    -1.18331621121330003142E0,
]

P1 = [
    4.05544892305962419923E0,
    3.15251094599893866154E1,
    5.71628192246421288162E1,
    4.40805073893200834700E1,
    1.46849561928858024014E1,
    2.18663306850790267539E0,
    -1.40256079171354495875E-1,
    -3.50424626827848203418E-2,
    -8.57456785154685413611E-4,
]

Q1 = [
    1,
    1.57799883256466749731E1,
    4.53907635128879210584E1,
    4.13172038254672030440E1,
    1.50425385692907503408E1,
    2.50464946208309415979E0,
    -1.42182922854787788574E-1,
    -3.80806407691578277194E-2,
    -9.33259480895457427372E-4,
]

P2 = [
    3.23774891776946035970E0,
    6.91522889068984211695E0,
    3.93881025292474443415E0,
    1.33303460815807542389E0,
    2.01485389549179081538E-1,
    1.23716634817820021358E-2,
    3.01581553508235416007E-4,
    2.65806974686737550832E-6,
    6.23974539184983293730E-9,
]

Q2 = [
    1,
    6.02427039364742014255E0,
    3.67983563856160859403E0,
    1.37702099489081330271E0,
    2.16236993594496635890E-1,
    1.34204006088543189037E-2,
    3.28014464682127739104E-4,
    2.89247864745380683936E-6,
    6.79019408009981274425E-9,
]

def ndtri(y0):
    if y0 <= 0 or y0 >= 1:
        raise ValueError("ndtri(x) needs 0 < x < 1")
    negate = True
    y = y0
    if y > 1.0 - 0.13533528323661269189:
        y = 1.0 - y
        negate = False

    if y > 0.13533528323661269189:
        y = y - 0.5
        y2 = y * y
        x = y + y * (y2 * polevl(y2, P0) / polevl(y2, Q0))
        x = x * s2pi
        return x

    x = math.sqrt(-2.0 * math.log(y))
    x0 = x - math.log(x) / x

    z = 1.0 / x
    if x < 8.0:
        x1 = z * polevl(z, P1) / polevl(z, Q1)
    else:
        x1 = z * polevl(z, P2) / polevl(z, Q2)
    x = x0 - x1
    if negate:
        x = -x
    return x

def polevl(x, coef):
    accum = 0
    for c in coef:
        accum = x * accum + c
    return accum

答案 1 :(得分:4)

函数ppf是y =(1 + erf(x / sqrt(2))/ 2的倒数。所以我们需要求解x的这个等式,给定y在0和1之间。这是一个执行此操作的代码通过bisection method。我导入了SciPy函数来说明结果是一样的。

from math import erf, sqrt
from scipy.stats import norm         # only for comparison
y = 0.123  

z = 2*y-1
a = 0
while erf(a) > z or erf(a+1) < z:    # looking for initial bracket of size 1
    if erf(a) > z:
        a -= 1
    else:
        a += 1
b = a+1                              # found a bracket, proceed to refine it
while b-a > 1e-15:                   # 1e-15 ought to be enough precision 
   c = (a+b)/2.0                     # bisection method
   if erf(c) > z:
       b = c
   else:
       a = c

print sqrt(2)*(a+b)/2.0              # this is the answer 
print norm.ppf(y)                    # SciPy for comparison

留给你做:

  • 初步约束检查(y必须介于0和1之间)
  • 如果需要其他均值/方差,则缩放和移位;代码用于标准正态分布(均值0,方差1)。