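Suppose I have an F value and its associated degrees of freedom, df1 and df2. How can I programmatically compute the p-value associated with these numbers in Python?
Note: I do not accept solutions that use scipy or statsmodels.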
Answer 0 (score: 4)
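The CDF of the F distribution (and hence the p-value) can be computed with the regularized incomplete beta function I(x; a, b); see, e.g., MathWorld. Using the code for I(x; a, b) from this blog, which relies only on the math module and takes its arguments in the order (a, b, x), the p-value is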
1 - incompbeta(.5*df1, .5*df2, float(df1)*F/(df1*F+df2))
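For reference, the identity behind this expression can be written out as below. This is the standard relation between the F distribution and the regularized incomplete beta function; the symbols d_1 and d_2 stand for df1 and df2, and X denotes an F-distributed random variable (notation chosen here for illustration).

```latex
\begin{aligned}
x &= \frac{d_1 F}{d_1 F + d_2}, \\
P(X \le F) &= I\!\left(x;\ \tfrac{d_1}{2},\ \tfrac{d_2}{2}\right), \\
p = P(X > F) &= 1 - I\!\left(x;\ \tfrac{d_1}{2},\ \tfrac{d_2}{2}\right)
  = I\!\left(1 - x;\ \tfrac{d_2}{2},\ \tfrac{d_1}{2}\right),
  \qquad 1 - x = \frac{d_2}{d_1 F + d_2}.
\end{aligned}
```

The last form follows from the symmetry I(x; a, b) = 1 - I(1 - x; b, a). It avoids subtracting a number close to 1, which may matter when the p-value is very small.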
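Here are the results for some sample values; they agree with scipy.stats.f.sf (referred to as st.f.sf below) to about nine significant digits: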
In [57]: F, df1, df2 = 5, 20, 18
In [58]: 1 - incompbeta(.5*df1, .5*df2, float(df1)*F/(df1*F+df2))
Out[58]: 0.0005812207389501722
In [59]: st.f.sf(F, df1, df2)
Out[59]: 0.00058122073922042188
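Since the question rules out scipy, it may be useful to have a cross-check that also needs only the math module. The sketch below is not the method used in this answer: it integrates the F density numerically with composite Simpson's rule and subtracts from 1. The names f_pdf and f_sf_simpson and the default n are illustrative choices, and the sketch assumes df1 > 2 so that the density is zero at x = 0.

```python
import math

def f_pdf(x, df1, df2):
    """Density of the F distribution, evaluated in log space for stability."""
    if x <= 0.0:
        return 0.0  # correct at x = 0 only when df1 > 2, which this sketch assumes
    a, b = 0.5 * df1, 0.5 * df2
    log_pdf = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
               + a * math.log(df1 / df2)
               + (a - 1.0) * math.log(x)
               - (a + b) * math.log1p(df1 * x / df2))
    return math.exp(log_pdf)

def f_sf_simpson(F, df1, df2, n=20000):
    """Upper-tail p-value as 1 minus the integral of the density over [0, F]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = F / n
    terms = [f_pdf(0.0, df1, df2), f_pdf(F, df1, df2)]
    # interior points: weight 4 at odd indices, weight 2 at even indices
    terms += [(4.0 if i % 2 else 2.0) * f_pdf(i * h, df1, df2)
              for i in range(1, n)]
    return 1.0 - math.fsum(terms) * h / 3.0

print(f_sf_simpson(5, 20, 18))
```

For F, df1, df2 = 5, 20, 18 this should land close to the 0.00058122 shown above. Because it computes 1 minus a number near 1, it loses relative accuracy for very small p-values, so treat it as a sanity check rather than a replacement.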
In case the blog disappears, here is the code:
import math

def incompbeta(a, b, x):
    """Evaluate the regularized incomplete beta function I(x; a, b).

    Requires a, b > 0 and 0 <= x <= 1. Relies on contfractbeta(a, b, x, ITMAX=200).
    (Code translated from: Numerical Recipes in C.)
    """
    if x == 0:
        return 0.0
    elif x == 1:
        return 1.0
    else:
        # log of x^a * (1-x)^b / B(a, b), kept in log space to avoid overflow
        lbeta = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1 - x))
        # Use the continued fraction directly for small x; otherwise apply
        # the symmetry I(x; a, b) = 1 - I(1-x; b, a). The float literals
        # guard against integer division under Python 2.
        if x < (a + 1.0) / (a + b + 2.0):
            return math.exp(lbeta) * contfractbeta(a, b, x) / a
        else:
            return 1 - math.exp(lbeta) * contfractbeta(b, a, 1 - x) / b
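
def contfractbeta(a, b, x, ITMAX=200):
    """Evaluate the continued fraction used by incompbeta() for the incomplete beta function.

    (Code translated from: Numerical Recipes in C.)
    """
    EPS = 3.0e-7  # relative convergence tolerance
    bm = az = am = 1.0
    qab = a + b
    qap = a + 1.0
    qam = a - 1.0
    bz = 1.0 - qab * x / qap
    for i in range(ITMAX + 1):
        em = float(i + 1)
        tem = em + em
        # even step of the recurrence
        d = em * (b - em) * x / ((qam + tem) * (a + tem))
        ap = az + d * am
        bp = bz + d * bm
        # odd step of the recurrence
        d = -(a + em) * (qab + em) * x / ((qap + tem) * (a + tem))
        app = ap + d * az
        bpp = bp + d * bz
        # save the old answer, then renormalize to prevent overflow
        aold = az
        am = ap / bpp
        bm = bp / bpp
        az = app / bpp
        bz = 1.0
        if abs(az - aold) < EPS * abs(az):
            return az
    # Raise rather than print and implicitly return None, which would only
    # surface later as a TypeError in incompbeta().
    raise RuntimeError('a or b too large or given ITMAX too small for computing incomplete beta function.')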
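To tie the pieces together, here is a minimal sketch of a wrapper that turns an F statistic into a p-value. The name f_pvalue and the incbeta parameter are hypothetical and not part of the blog code; any callable with the (a, b, x) argument order of incompbeta should fit. It uses the symmetry I(x; a, b) = 1 - I(1 - x; b, a) so that the upper tail is computed directly instead of as 1 minus the CDF.

```python
def f_pvalue(F, df1, df2, incbeta):
    """Upper-tail p-value of an F statistic.

    incbeta(a, b, x) is assumed to return the regularized incomplete
    beta function I(x; a, b), e.g. the incompbeta function above.
    """
    if df1 <= 0 or df2 <= 0:
        raise ValueError("degrees of freedom must be positive")
    if F <= 0:
        return 1.0  # the F distribution has no mass below zero
    # 1 - I(x; a, b) == I(1 - x; b, a), with 1 - x = df2 / (df2 + df1*F)
    return incbeta(0.5 * df2, 0.5 * df1, float(df2) / (df2 + df1 * F))

# Sanity check that needs no beta implementation: for df1 = df2 = 2 we have
# a = b = 1 and I(x; 1, 1) = x, so the p-value reduces to 1 / (1 + F).
print(f_pvalue(3.0, 2, 2, lambda a, b, x: x))  # 0.25
```

With the functions from this answer in scope, the call would be f_pvalue(5, 20, 18, incompbeta).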