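I am trying to generate a data set of p-values from replicated correlation tests run on multiple subsets of my data. I need to sample within each subset, run a correlation test on the sampled data, and repeat this at least 10,000 times.
The data looks like this (apologies for the amount of code, it is the only way I know to show it):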
genus<-c(1,2,3,4,5,6)
Treatment<-c(rep("A",3),rep("B",3))
Hit<-c(10,15,2,3,6,7)
val<-c(2,4,5,6,2,3,4,7,8,9,6,7)
target<-c(rep("target1",6),rep("target2",6))
Data<-data.frame(Genus=rep(genus,2),Treatment=rep(Treatment,2),Hit=rep(Hit,2),target,Response=val)
Data
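Now I came up with this code to generate a series of correlations between the variables Response and Hit, sampling Response and subsetting the data by the variable target: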
library(plyr)
res1=ddply(Data, .(Data$target), function(sub_data)
{
cor_result = cor.test(sub_data$Hit, sample(sub_data$Response,6), method="k")
perm=replicate(2, cor_result)
return(data.frame(perm))
}
)
res1
From this I get:
res1
   Data$target X1
1  target1     0
2  target1     NULL
3  target1     1
4  target1     0
5  target1     0
6  target1     two.sided
7  target1     Kendall's rank correlation tau
8  target1     sub_data$Hit and sample(sub_data$Response, 6)
9  target2     1.147638
10 target2     NULL
11 target2     0.251118
12 target2     0.4140393
13 target2     0
14 target2     two.sided
15 target2     Kendall's rank correlation tau
16 target2     sub_data$Hit and sample(sub_data$Response, 6)
   X2
1  0
2  NULL
3  1
4  0
5  0
6  two.sided
7  Kendall's rank correlation tau
8  sub_data$Hit and sample(sub_data$Response, 6)
9  1.147638
10 NULL
11 0.251118
12 0.4140393
13 0
14 two.sided
15 Kendall's rank correlation tau
16 sub_data$Hit and sample(sub_data$Response, 6)
But I would like to keep only the p-value from each repetition of the correlation test on the sampled data, and get something like this:
  Data$target p.value   Test
1 target1     0.1259712 1
2 target2     0.1259712 1
5 target1     1.0000000 2
6 target2     1.0000000 2
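A minimal sketch of the layout described above, written in base R rather than plyr so that it runs on its own. The names `perm_pvalues`, `n_perm`, and `res` are hypothetical. It assumes the sampling has to happen inside `replicate()`, so that every repetition draws a fresh permutation of Response, and that only the `p.value` element of the `cor.test()` result is needed:

```r
# Toy data, rebuilt here so the sketch is self-contained
Data <- data.frame(
  Genus     = rep(1:6, 2),
  Treatment = rep(rep(c("A", "B"), each = 3), 2),
  Hit       = rep(c(10, 15, 2, 3, 6, 7), 2),
  target    = rep(c("target1", "target2"), each = 6),
  Response  = c(2, 4, 5, 6, 2, 3, 4, 7, 8, 9, 6, 7)
)

n_perm <- 2   # hypothetical name; raise to 1000 or 10000 for a real run
set.seed(1)   # only to make the sketch reproducible

# Returns n_perm p-values for one target. sample() is called inside
# replicate(), so Response is re-shuffled on every repetition.
# suppressWarnings() hides the "cannot compute exact p-value with ties" warning.
perm_pvalues <- function(sub_data, n_perm) {
  replicate(n_perm, suppressWarnings(
    cor.test(sub_data$Hit, sample(sub_data$Response), method = "kendall")$p.value
  ))
}

# One block of rows per target, then stacked into a single data frame
res <- do.call(rbind, lapply(split(Data, Data$target), function(sub_data) {
  data.frame(target  = sub_data$target[1],
             p.value = perm_pvalues(sub_data, n_perm),
             Test    = seq_len(n_perm))
}))
res <- res[order(res$Test, res$target), ]   # group rows by repetition
rownames(res) <- NULL
res
```

The result has one row per target and repetition, with columns target, p.value, and Test. The actual p-values depend on the random seed.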
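The final idea is to do this for a set of 23,000 targets and to permute the test at least 1,000 times, so that I can pick the set of minimum p-values and obtain a corrected p-value.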
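One possible reading of "picking the minimum p-values" is a single-step min-P adjustment: in each permutation round, take the smallest p-value across all targets, then compare each observed p-value with that distribution of minima. The sketch below makes that assumption and uses only the two toy targets. The names `kendall_p`, `resp`, `min_p`, and `p_adj` are hypothetical, and the `+ 1` in the numerator and denominator is a common convention to avoid adjusted p-values of exactly zero, not something stated in the question:

```r
# Toy data from above: same Hit for both targets, one Response vector each
hit  <- c(10, 15, 2, 3, 6, 7)
resp <- rbind(target1 = c(2, 4, 5, 6, 2, 3),
              target2 = c(4, 7, 8, 9, 6, 7))

n_perm <- 200   # hypothetical; the question asks for at least 1000
set.seed(1)     # only to make the sketch reproducible

# Kendall p-value for one pair of vectors (tie warnings suppressed)
kendall_p <- function(x, y) {
  suppressWarnings(cor.test(x, y, method = "kendall")$p.value)
}

# Observed (unpermuted) p-value per target
p_obs <- apply(resp, 1, function(y) kendall_p(hit, y))

# In each round, apply the SAME shuffle to every target and keep the minimum p
min_p <- replicate(n_perm, {
  idx <- sample(length(hit))
  min(apply(resp, 1, function(y) kendall_p(hit, y[idx])))
})

# Adjusted p-value: share of permutation minima at or below the observed p
p_adj <- sapply(p_obs, function(p) (1 + sum(min_p <= p)) / (1 + n_perm))

data.frame(target = rownames(resp), p.value = p_obs, p.adjusted = p_adj)
```

Using one shuffle per round for all targets keeps whatever dependence exists between targets. With 23,000 targets the inner `apply()` would dominate the run time, so this is meant only to show the bookkeeping.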
Thanks a lot!