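I need to write my own test in R that is based on the difference between the means of two random variables X and Y whose distributions are unknown.
I was given the following code: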
mean.test <- function(x, y, B=10000,
                      alternative=c("two.sided","less","greater"))
{
  p.value <- 0
  alternative <- match.arg(alternative)
  s <- replicate(B, (mean(sample(c(x,y), B, replace=TRUE)) -
                     mean(sample(c(x,y), B, replace=TRUE)))) # random samples of test statistics
  t <- mean(x) - mean(y) # test statistic t
  p.value <- 2 * (1 - pnorm(mean(s))) # try to calculate p value
  data.name <- deparse(substitute(c(x,y)))
  names(t) <- "difference in means"
  zero <- 0
  names(zero) <- "difference in means"
  return(structure(list(statistic = t, p.value = p.value,
                        method = "mean test", data.name = data.name,
                        observed = c(x,y), alternative = alternative,
                        null.value = zero),
                   class = "htest"))
}
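Here t is the test statistic, the difference between the sample means of X and Y. I was given the expected output for a few function calls, but I have never been able to reproduce it.
For example: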
set.seed(0)
mean.test(rnorm(100,50,4),rnorm(100,51,5),alternative="less")
should output:
mean test
data: c(rnorm(100, 50, 4), rnorm(100, 51, 5))
difference in means = -2.0224, p-value = 0.0011
alternative hypothesis: true difference in means is less than 0
but it outputs:
mean test
data: c(rnorm(100, 50, 4), rnorm(100, 51, 5))
difference in means = -0.68157, p-value = 1
alternative hypothesis: true difference in means is less than 0
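I am sure that I am calculating the p-value the wrong way. The difference in means is also wrong in this example, although it is correct for the other examples in the exercise. I am really confused about how to compute the p-value. How should I calculate it?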
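For reference, here is a minimal sketch of one common way to get a p-value from resampled statistics. It is not the exercise's reference solution: the name boot_mean_test, the choice of a pooled bootstrap, and the method label are all assumptions made for illustration. Compared with the code above it changes three things: each resample has the size of the original group (length(x) and length(y)) rather than B, the p-value is the proportion of resampled statistics at least as extreme as the observed t rather than pnorm(mean(s)), and the alternative argument selects which tail is counted.

```r
# Hypothetical helper; the name and design are assumptions, not the reference solution.
boot_mean_test <- function(x, y, B = 10000,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  # Record how the arguments were written, for the printed "data:" line.
  data_name <- paste(paste(deparse(substitute(x)), collapse = ""), "and",
                     paste(deparse(substitute(y)), collapse = ""))
  # Evaluate the lazily evaluated arguments before any resampling happens.
  force(x); force(y)

  t_obs <- mean(x) - mean(y)          # observed test statistic
  pooled <- c(x, y)                   # under H0 both groups share one distribution
  n_x <- length(x); n_y <- length(y); n <- length(pooled)

  # Null distribution: redraw both groups from the pooled data, B times,
  # keeping the original group sizes.
  t_null <- replicate(B, {
    mean(pooled[sample.int(n, n_x, replace = TRUE)]) -
      mean(pooled[sample.int(n, n_y, replace = TRUE)])
  })

  # p-value: share of simulated statistics at least as extreme as t_obs.
  p_value <- switch(alternative,
    less      = mean(t_null <= t_obs),
    greater   = mean(t_null >= t_obs),
    two.sided = mean(abs(t_null) >= abs(t_obs)))

  names(t_obs) <- "difference in means"
  null_value <- c("difference in means" = 0)
  structure(list(statistic = t_obs, p.value = p_value,
                 method = "mean test (pooled bootstrap sketch)",
                 data.name = data_name, alternative = alternative,
                 null.value = null_value),
            class = "htest")
}

set.seed(0)
boot_mean_test(rnorm(100, 50, 4), rnorm(100, 51, 5), alternative = "less")
```

The exact numbers depend on the random number stream and on the resampling scheme, so this sketch is not guaranteed to reproduce the expected statistic or p-value from the exercise. Sampling without replacement (a permutation test) would be an equally reasonable variant if that is what the exercise intends.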