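I am working through an exercise on computing a confidence interval for the difference between two variables, and I'm confused. My result agrees with two statistics websites (here and here), but not with R or SPSS. Can anyone help explain the discrepancy? Here is my code: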
import numpy as np
from scipy import stats

def SSE(x):
    """Returns the sum of squared deviations of x from its mean"""
    return sum((i - np.mean(x))**2 for i in x)

def confint_diff(x, y, level=.95):
    alpha = 1 - level
    # Degrees of freedom for two independent samples with pooled variance
    dof = len(x) + len(y) - 2
    if len(x) == len(y):
        n = len(x)
        mean_variance = (np.var(x, ddof=1) + np.var(y, ddof=1)) / 2
        stderr_diffs = np.sqrt((2 * mean_variance) / n)  # stats.sem(x-y) ?
    else:
        mean_variance = (SSE(x) + SSE(y)) / dof
        stderr_diffs = np.sqrt((2 * mean_variance) / stats.hmean((len(x), len(y))))  # stats.sem(x-y) ?
    # The critical t value for the specified confidence level given the dof
    t_cl = stats.t.ppf(level + alpha / 2, dof)
    return (np.mean(x - y) - t_cl * stderr_diffs, np.mean(x - y) + t_cl * stderr_diffs)
a, b = np.array([2,4,6,8]), np.array([1,5,4,3])
print(confint_diff(a, b))
(-2.037447534048241, 5.5374475340482405)
In R:
> a <- c(2,4,6,8)
> b <- c(1,5,4,3)
> t.test(a-b)
One Sample t-test
data: a - b
t = 1.4, df = 3, p-value = 0.256
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-2.228058 5.728058
sample estimates:
mean of x
1.75
SPSS shows the same result as R. Can anyone suggest why the values differ?
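One possible lead, offered as a sketch rather than a confirmed explanation: the R output is labelled "One Sample t-test" with df = 3, so `t.test(a-b)` works on the four paired differences, whereas `confint_diff` uses `len(x) + len(y) - 2` (6 here) degrees of freedom and a pooled variance, as for two independent samples. Assuming the paired interpretation is what R and SPSS compute, the hypothetical helper `confint_paired` below (not part of the code above) should give roughly the same interval as R, about (-2.228, 5.728):

```python
import numpy as np
from scipy import stats


def confint_paired(x, y, level=0.95):
    """CI for the mean of the paired differences x - y.

    Hypothetical helper for illustration; assumes x and y are
    paired observations of equal length.
    """
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    dof = len(d) - 1                  # one-sample t on the differences
    stderr = stats.sem(d)             # sample std (ddof=1) / sqrt(n)
    t_cl = stats.t.ppf(1 - (1 - level) / 2, dof)
    mean_d = d.mean()
    return (float(mean_d - t_cl * stderr), float(mean_d + t_cl * stderr))


a, b = np.array([2, 4, 6, 8]), np.array([1, 5, 4, 3])
print(confint_paired(a, b))
```

If that is the cause, the two answers differ both in the standard error (standard deviation of the differences versus pooled variance) and in the critical t value (3 versus 6 degrees of freedom).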