Minimum p-value from replicated correlation tests

Date: 2012-11-16 11:40:39

Tags: r correlation

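I am trying to build a data set of p-values from replicated correlation tests run on several subsets of my data. For each subset I need to sample the data, run the correlation test, and repeat this at least 10000 times.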

The data look like this (apologies for the long-winded code, it is the only way I know):

genus<-c(1,2,3,4,5,6)
Treatment<-c(rep("A",3),rep("B",3))
Hit<-c(10,15,2,3,6,7)
val<-c(2,4,5,6,2,3,4,7,8,9,6,7)
target<-c(rep("target1",6),rep("target2",6))
Data<-data.frame(Genus=rep(genus,2),Treatment=rep(Treatment,2),Hit=rep(Hit,2),target,Response=val)
Data

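So far I have come up with this code. It is meant to generate a series of correlations between Response and Hit, resampling Response each time, with the data subset by the variable target: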

library(plyr)  # provides ddply() and .()

res1=ddply(Data, .(Data$target), function(sub_data) 
{
  cor_result = cor.test(sub_data$Hit, sample(sub_data$Response,6), method="k")  
  perm=replicate(2, cor_result)
  return(data.frame(perm))
}
)
res1

From this I get:

res1
   Data$target                                            X1
1      target1                                             0
2      target1                                          NULL
3      target1                                             1
4      target1                                             0
5      target1                                             0
6      target1                                     two.sided
7      target1                Kendall's rank correlation tau
8      target1 sub_data$Hit and sample(sub_data$Response, 6)
9      target2                                      1.147638
10     target2                                          NULL
11     target2                                      0.251118
12     target2                                     0.4140393
13     target2                                             0
14     target2                                     two.sided
15     target2                Kendall's rank correlation tau
16     target2 sub_data$Hit and sample(sub_data$Response, 6)
                                              X2
1                                              0
2                                           NULL
3                                              1
4                                              0
5                                              0
6                                      two.sided
7                 Kendall's rank correlation tau
8  sub_data$Hit and sample(sub_data$Response, 6)
9                                       1.147638
10                                          NULL
11                                      0.251118
12                                     0.4140393
13                                             0
14                                     two.sided
15                Kendall's rank correlation tau
16 sub_data$Hit and sample(sub_data$Response, 6)

But what I want is only the p-value from each repetition of the correlation test on the resampled data, giving something like this:

Data$target   p.value    Test
1 target1     0.1259712   1
2 target2     0.1259712   1
5 target1     1.0000000   2
6 target2     1.0000000   2
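
One way to get a table of that shape, as a minimal sketch rather than a definitive answer. The key point is that `replicate()` has to wrap the `cor.test()` call itself, so that `sample()` is redrawn on every repetition; replicating a stored result only copies it. The sketch uses base R instead of plyr, the names `n_perm` and `perm_p` are made up for illustration, the grouping column is called `target` rather than `Data$target`, and `exact = FALSE` is an added assumption (the example data contain ties, so an exact Kendall p-value is not available anyway).

```r
# Example data from the question, restated so the block runs on its own.
Hit <- c(10, 15, 2, 3, 6, 7)
val <- c(2, 4, 5, 6, 2, 3, 4, 7, 8, 9, 6, 7)
target <- c(rep("target1", 6), rep("target2", 6))
Data <- data.frame(Hit = rep(Hit, 2), target, Response = val)

n_perm <- 2   # hypothetical name; number of repetitions per target
set.seed(1)   # only so the sketch is reproducible

# One Kendall test on a single target's rows with Response shuffled;
# returns just the p-value instead of the whole htest object.
perm_p <- function(sub_data) {
  cor.test(sub_data$Hit, sample(sub_data$Response),
           method = "kendall", exact = FALSE)$p.value
}

by_target <- split(Data, Data$target)

# For each target, replicate() re-evaluates perm_p() n_perm times,
# so every repetition uses a fresh shuffle of Response.
res <- do.call(rbind, lapply(names(by_target), function(tg) {
  data.frame(target  = tg,
             p.value = replicate(n_perm, perm_p(by_target[[tg]])),
             Test    = seq_len(n_perm))
}))

# Order by repetition so the layout matches the table in the question.
res <- res[order(res$Test), ]
res
```

Setting `n_perm` to a larger value gives one row per target per repetition.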

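The final idea is to do this for a set of 23000 targets and to permute the test at least 1000 times, so that I can collect the minimum p-value from each permutation and use that set to obtain a corrected p-value.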
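
To make the goal concrete, here is a rough sketch of one common way to turn minimum p-values into corrected ones, the single-step minP adjustment. Whether this is exactly the correction intended here is an assumption. The sketch also assumes that every target has the same samples in the same row order, so that one shuffle can be shared across all targets within a permutation. The names `kendall_p`, `n_perm`, `min_p` and `p_adj` are made up for illustration.

```r
# Example data from the question, restated so the block runs on its own.
Hit <- c(10, 15, 2, 3, 6, 7)
val <- c(2, 4, 5, 6, 2, 3, 4, 7, 8, 9, 6, 7)
target <- c(rep("target1", 6), rep("target2", 6))
Data <- data.frame(Hit = rep(Hit, 2), target, Response = val)

n_perm <- 200  # hypothetical; the question asks for at least 1000
set.seed(1)    # only so the sketch is reproducible

# Kendall test p-value; exact = FALSE because the example data have ties.
kendall_p <- function(x, y) {
  cor.test(x, y, method = "kendall", exact = FALSE)$p.value
}

by_target <- split(Data, Data$target)

# Observed (unpermuted) p-value for each target.
p_obs <- sapply(by_target, function(d) kendall_p(d$Hit, d$Response))

# For each permutation: draw one shuffle of the row order, apply it to the
# Response of every target, and keep the smallest p-value across targets.
n_obs <- nrow(by_target[[1]])
min_p <- replicate(n_perm, {
  idx <- sample(n_obs)
  min(sapply(by_target, function(d) kendall_p(d$Hit, d$Response[idx])))
})

# Adjusted p-value: share of permutations whose minimum p-value is at least
# as small as the observed one; the +1 counts the observed data as well.
p_adj <- sapply(p_obs, function(p) (1 + sum(min_p <= p)) / (n_perm + 1))

data.frame(target = names(p_obs), p.value = p_obs, p.adjusted = p_adj)
```

With 23000 targets this nested loop calls `cor.test()` once per target per permutation, so it would likely need to be vectorised or parallelised; the sketch does not address that.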

Thanks a lot!

0 answers:

No answers yet