Confidence interval for a t-test (difference between means) in Python

Time: 2015-08-02 04:06:58

Tags: python statistics hypothesis-test

I am looking for a quick way to get the t-test confidence interval for the difference between means in Python. Something similar to this in R:

X1 <- rnorm(n = 10, mean = 50, sd = 10)
X2 <- rnorm(n = 200, mean = 35, sd = 14)
# the scenario is similar to my data

t_res <- t.test(X1, X2, alternative = 'two.sided', var.equal = FALSE)    
t_res

Output:

    Welch Two Sample t-test

data:  X1 and X2
t = 1.6585, df = 10.036, p-value = 0.1281
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.539749 17.355816
sample estimates:
mean of x mean of y 
 43.20514  35.79711 
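
Given the importance of confidence intervals in hypothesis testing (and how much criticism the practice of reporting only p-values has recently received), it is strange that I cannot find anything similar in either statsmodels or scipy.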

1 answer:

Answer 0 (score: 21)

Here is how to use StatsModels' CompareMeans to calculate the confidence interval for the difference between means:

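import numpy as np
import statsmodels.stats.api as sms

# Two small samples with different sizes and variances
X1, X2 = np.arange(10, 21), np.arange(20, 26.5, .5)

# Wrap each sample in DescrStatsW, then compare their means
cm = sms.CompareMeans(sms.DescrStatsW(X1), sms.DescrStatsW(X2))
# usevar='unequal' requests the Welch (unequal-variance) interval
print(cm.tconfint_diff(usevar='unequal'))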

Output:

(-10.414599391793885, -5.5854006082061138)

which matches R:

> X1 <- seq(10,20)
> X2 <- seq(20,26,.5)
> t.test(X1, X2)

    Welch Two Sample t-test

data:  X1 and X2
t = -7.0391, df = 15.58, p-value = 3.247e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.414599  -5.585401
sample estimates:
mean of x mean of y 
       15        23
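
For anyone who wants to see what goes into that interval, here is a minimal sketch that computes the Welch interval by hand with NumPy and SciPy. The helper name welch_ci is hypothetical (it is not part of SciPy or statsmodels). It assumes the usual Welch-Satterthwaite approximation for the degrees of freedom, which is what R's t.test uses when var.equal = FALSE.

```python
import numpy as np
from scipy import stats


def welch_ci(x, y, confidence=0.95):
    """Hypothetical helper: CI for mean(x) - mean(y), unequal variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared standard error of each sample mean (unbiased variance / n).
    vx = x.var(ddof=1) / x.size
    vy = y.var(ddof=1) / y.size
    se = np.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    # Two-sided critical value of Student's t distribution.
    t_crit = stats.t.ppf(0.5 + confidence / 2, df)
    diff = x.mean() - y.mean()
    return diff - t_crit * se, diff + t_crit * se, df


# Same data as in the answer above.
X1 = np.arange(10, 21)
X2 = np.arange(20, 26.5, 0.5)
low, high, df = welch_ci(X1, X2)
print(low, high, df)
```

On the data above this should reproduce the interval and degrees of freedom that R reports (about -10.4146 to -5.5854, with df about 15.58), up to floating-point rounding.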