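I am trying to convert two nested for loops into an apply call, in the hope that this will speed up my computation. I know that using apply does not guarantee a faster result, but I would like to try it anyway (partly as a learning exercise to get familiar with apply).
What I want to do is:
Compute the Pearson correlation coefficient, together with its p-value, for every pair of rows taken from two matrices.
Both matrices are about 3000 x 100.
Right now my code looks like this, and it has been running for days...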
cnt <- 1
res_row1 <- c()
res_row2 <- c()
res_corr <- c()
res_pval <- c()
for (i in seq_len(nrow(m1))) {
  for (j in seq_len(nrow(m2))) {
    # Test result stored as `ct` so it does not mask base::c()
    ct <- cor.test(as.matrix(m1[i, ]), as.matrix(m2[j, ]))
    # Need both row names in the output files
    res_row1[cnt] <- rownames(m1)[i]
    res_row2[cnt] <- rownames(m2)[j]
    # Storing the results for output
    res_corr[cnt] <- ct$estimate
    res_pval[cnt] <- ct$p.value
    cnt <- cnt + 1
  }
  # Progress report after each row of m1
  comp <- i / nrow(m1) * 100
  cat(sprintf("Row number of file 1 = %d | %f percent complete \n", i, comp))
}
results <- cbind(res_row1, res_row2, res_corr, res_pval)
Can you help me?
Answer 0 (score: 1)
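Have a look at the manual page for cor:
If 'x' and 'y' are matrices then the covariances (or correlations) between the columns of 'x' and the columns of 'y' are computed.
Since you want row-by-row correlations, transpose both matrices first. So I would try: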
cor(t(m1), t(m2))
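As a quick check of what that call returns, here is a minimal sketch on small made-up matrices. The names `m1` and `m2` follow the question; the data, the sizes, and the row names are invented purely for illustration. Entry `[i, j]` of the result is the correlation between row `i` of `m1` and row `j` of `m2`, and the row names carry over as dimnames.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible

# Toy stand-ins for the real 3000 x 100 matrices (assumed data).
m1 <- matrix(rnorm(5 * 10), nrow = 5, dimnames = list(paste0("a", 1:5), NULL))
m2 <- matrix(rnorm(4 * 10), nrow = 4, dimnames = list(paste0("b", 1:4), NULL))

# Transpose so that rows become columns; cor() works column against column.
r <- cor(t(m1), t(m2))

dim(r)       # one row per row of m1, one column per row of m2
rownames(r)  # row names of m1
colnames(r)  # row names of m2

# Spot check one pair against cor.test, as used in the question's loop.
all.equal(r["a2", "b3"], unname(cor.test(m1[2, ], m2[3, ])$estimate))
```

This replaces all of the `cor.test(...)$estimate` calls in the double loop with a single matrix operation.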
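For the p-values, try a nested (double) apply. The session below first checks that the nested apply with cor reproduces cor(t(x), t(y)), then swaps in cor.test(...)$p.value: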
R > x <- matrix(rnorm(12), 4, 3)
R > y <- matrix(rnorm(12), 4, 3)
R > cor(t(x), t(y))
        [,1]    [,2]    [,3]     [,4]
[1,]  0.9364  0.8474 -0.7131  0.67342
[2,] -0.9539 -0.9946  0.9936 -0.07541
[3,]  0.8013  0.9046 -0.9752 -0.25822
[4,]  0.3767  0.5541 -0.7205 -0.72040
R > t(apply(x, 1, function(a) apply(y, 1, function(b) cor(b, a))))
        [,1]    [,2]    [,3]     [,4]
[1,]  0.9364  0.8474 -0.7131  0.67342
[2,] -0.9539 -0.9946  0.9936 -0.07541
[3,]  0.8013  0.9046 -0.9752 -0.25822
[4,]  0.3767  0.5541 -0.7205 -0.72040
R > t(apply(x, 1, function(a) apply(y, 1, function(b) cor.test(b, a)$p.value)))
       [,1]    [,2]    [,3]   [,4]
[1,] 0.2283 0.35628 0.49461 0.5297
[2,] 0.1940 0.06602 0.07231 0.9519
[3,] 0.4083 0.28034 0.14201 0.8337
[4,] 0.7541 0.62615 0.48782 0.4879
R > cor.test(x[1,], y[1,])$p.value
[1] 0.2283
R > cor.test(x[1,], y[2,])$p.value
[1] 0.3563
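The nested apply still calls cor.test once per pair of rows, so with roughly 3000 x 3000 pairs it is likely to remain slow. One possible alternative, sketched here under the assumptions that the data contain no missing values, that both matrices have row names, and that the default two-sided Pearson test is what is wanted: derive the p-values directly from the correlation matrix. For Pearson's correlation, cor.test uses the statistic t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, so the same p-values can be computed for the whole matrix at once. The helper name `cor_with_pvalues` and the toy data are made up for illustration.

```r
# Hypothetical helper (not from any package): correlations and two-sided
# p-values for every pair (row of a, row of b). Assumes no NA values and
# that both inputs have row names.
cor_with_pvalues <- function(a, b) {
  n <- ncol(a)                            # observations per row
  r <- cor(t(a), t(b))                    # nrow(a) x nrow(b) correlations
  tstat <- r * sqrt((n - 2) / (1 - r^2))  # t statistic for each pair
  p <- 2 * pt(-abs(tstat), df = n - 2)    # two-sided p-value
  # Long format, one line per pair, similar to `results` in the question.
  data.frame(
    row1 = rownames(r)[as.vector(row(r))],
    row2 = colnames(r)[as.vector(col(r))],
    corr = as.vector(r),
    pval = as.vector(p)
  )
}

set.seed(42)  # arbitrary seed for reproducibility
x <- matrix(rnorm(12), 4, 3, dimnames = list(paste0("x", 1:4), NULL))
y <- matrix(rnorm(12), 4, 3, dimnames = list(paste0("y", 1:4), NULL))
res <- cor_with_pvalues(x, y)

# Compare one pair against cor.test; this should agree up to rounding.
all.equal(res$pval[res$row1 == "x1" & res$row2 == "y2"],
          cor.test(x[1, ], y[2, ])$p.value)
```

Note that the pairs come out in column-major order (all rows of the first matrix for the first row of the second, and so on), which differs from the loop order in the question but contains the same pairs.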