One-sample t-test from scratch in Python - getting an incorrect p-value

Date: 2017-07-03 19:34:32

Tags: python numpy scipy statistics

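I am trying to recreate a two-tailed one-sample t-test from scratch in Python to deepen my understanding. My code seems to work for some data samples, but I found an example where the output does not match scipy.stats.ttest_1samp, and I am trying to figure out why.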

The t-statistics match, but I get a different p-value. Is there something wrong with how I use t.cdf that leads to the wrong p-value?

My code:

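import numpy as np
import scipy.stats as scs

sample = [10.81261135, 9.68035252, 9.87293556, 10.06308861,
          9.57381722, 10.00922156, 10.90522431, 9.70843104,
          10.16614481, 10.09447189, 10.51260742, 10.17503686,
          10.38718472, 10.52334431, 9.55808306, 10.24290938,
          10.6048062, 10.27535938, 9.6329808, 9.67338239]
mu = 7.128061097  # hypothesized population mean

def one_sample_ttest(sample, mu):
    sam_mean = np.mean(sample)
    sam_std = np.std(sample, ddof=1)  # sample standard deviation
    n = len(sample)
    df = n - 1  # degrees of freedom
    t = (sam_mean - mu) / (sam_std / (n ** (1 / 2.)))
    p = (scs.t.cdf(t, df)) * 2  # the line in question
    return (t, p)

print(one_sample_ttest(sample, mu))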

My result:

(32.369715406889142, 2.0)

Result from scipy.stats.ttest_1samp:

Ttest_1sampResult(statistic=32.369715406889142, pvalue=4.3828444145707213e-18)

1 answer:

Answer 0 (score: 1)

Replace

p = (scs.t.cdf(t,df))*2

with

p = (scs.t.sf(abs(t),df))*2

or, equivalently,

p = min(scs.t.cdf(t,df), scs.t.sf(t, df))*2

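t.sf(x, df) is the survival function (i.e. 1 - t.cdf(x, df)). Because t is large and positive here, t.cdf(t, df) is essentially 1, so doubling it gives 2.0. A two-tailed p-value needs the area in the tails beyond |t| instead, which is what 2 * t.sf(abs(t), df) computes.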
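
For completeness, here is a minimal sketch of the whole test with that fix applied, checked against scipy.stats.ttest_1samp on the sample from the question. The function name one_sample_ttest_fixed is just a hypothetical name chosen for illustration; everything else uses the same NumPy and SciPy calls as the question.

```python
import numpy as np
import scipy.stats as scs


def one_sample_ttest_fixed(sample, mu):
    """Two-tailed one-sample t-test (hypothetical helper); returns (t, p)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    df = n - 1  # degrees of freedom
    # Standard error uses the sample standard deviation (ddof=1).
    t = (sample.mean() - mu) / (sample.std(ddof=1) / np.sqrt(n))
    # Two-tailed p-value: twice the upper-tail area beyond |t|.
    p = 2 * scs.t.sf(abs(t), df)
    return t, p


sample = [10.81261135, 9.68035252, 9.87293556, 10.06308861,
          9.57381722, 10.00922156, 10.90522431, 9.70843104,
          10.16614481, 10.09447189, 10.51260742, 10.17503686,
          10.38718472, 10.52334431, 9.55808306, 10.24290938,
          10.6048062, 10.27535938, 9.6329808, 9.67338239]
mu = 7.128061097

t, p = one_sample_ttest_fixed(sample, mu)
ref = scs.ttest_1samp(sample, mu)

print(t, p)                       # hand-rolled result
print(ref.statistic, ref.pvalue)  # SciPy reference
```

Using abs(t) also keeps the result correct when the sample mean falls below mu and t is negative, which the original cdf-based line only handled by accident.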