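Getting all possible correlations between two data sets

Time: 2019-01-30 01:57:08

Tags: r correlation

I am trying to get information about each pairwise correlation using corr.test.

I have two data sets, df1 and df2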

df1<- structure(list(col1A = c(1.64, 0.03, 0, 4.202, 2.981, 0.055, 
0, 0.002, 0.005, 0, 0.002, 0.649, 2.55, 2.762, 6.402), col2A = c(2.635, 
0.019, 0, 5.638, 3.542, 0.793, 0.259, 0, 0.046, 0.004, 0.017, 
0.971, 3.81, 3.104, 5.849), col3A = c(0.91, 0.037, 0, 5.757, 
3.916, 0.022, 0, 0, 0.003, 0, 0.262, 0.136, 2.874, 3.466, 5.003
), col4A = c(1.027, 0.021, 0, 4.697, 2.832, 0.038, 0.032, 0.001, 
0.003, 0, 0, 0.317, 2.743, 3.187, 6.455)), class = "data.frame", row.names = c(NA, 
-15L))

The second data set is as follows

 df2<-structure(list(col1 = c(2.172, 0, 0, 4.353, 4.581, 0.001, 0.027, 
0, 0.002, 0, 0, 0.087, 2.129, 4.317, 5.849), col2 = c(2.093, 
0, 0, 4.235, 3.166, 0, 0, 0.006, 0.01, 0, 0, 0.475, 0, 2.62, 
5.364), col3 = c(3.322, 0, 0, 4.332, 4.018, 0.049, 0.169, 0.004, 
0.02, 0, 0.032, 1.354, 2.944, 4.323, 5.44), col4 = c(0.928, 0.018, 
0, 3.943, 3.723, 0.02, 0, 0, 0, 0, 0.075, 0.136, 3.982, 3.875, 
5.83)), row.names = c("A", "AA", "AAA", "Aab", "buy", "yuyn", 
"gff", "fgd", "kil", "lilk", "hhk", "lolo", "fdd", "vgfh", "nghg"
), class = "data.frame")

I want to get all possible correlations between the two and extract all p-values and adjusted p-values

I use

library(psych)
corr.test(df1,df2, use = "pairwise",method="pearson",adjust="holm",alpha=.05,ci=TRUE,minlength=5)

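It does not give me any p-values. I also cannot control the number of permutations used to calculate the adjusted p-values.

I was thinking of using the following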

x <- df1[, 1]
y <- df2[, 2]
corr_init <- cor(x, y) # original correlation
N <- 1000 # number of permutations
count <- 0 # counts permuted correlations greater than corr_init
for (i in 1:N) {
  # shuffle y with base R; permute() would need an extra package such as gtools
  y_perm <- sample(y)
  if (cor(y_perm, x) > corr_init) count <- count + 1
}
p <- count / N # final (one-sided) p-value

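But this handles only one pair of columns at a time, so I would still need to extract every column and test it...

Is there a better way to compute all correlations between the two data sets, obtaining the r values, the p-values, and p-values adjusted using a specified number of random permutations?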

1 Answer:

Answer 0: (score: 0)

This can be done with the Hmisc package

library(Hmisc)

df1_cor_matrix <- rcorr(as.matrix(df1), type = "pearson")
df2_cor_matrix <- rcorr(as.matrix(df2), type = "pearson")

You can then extract the coefficients with:

df1_coef <- df1_cor_matrix$r
df2_coef <- df2_cor_matrix$r

You can extract the p-values with:

df1_p_values <- df1_cor_matrix$P
df2_p_values <- df2_cor_matrix$P
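Note that the rcorr calls above correlate the columns within each data frame. For correlations between the columns of df1 and the columns of df2, a minimal base-R sketch could look like the following. The helper name cross_cor and the simulated data frames a and b are hypothetical stand-ins, not part of any package; rows are paired by position, so both inputs are assumed to have the same number of rows in the same order.

```r
# Hypothetical helper (not from any package): correlate every column of `a`
# with every column of `b` and collect coefficients and raw p-values.
cross_cor <- function(a, b, method = "pearson") {
  a <- as.data.frame(a)
  b <- as.data.frame(b)
  r <- cor(a, b, method = method)   # ncol(a) x ncol(b) matrix of coefficients
  p <- r                            # reuse the shape and dimnames for p-values
  for (i in seq_len(ncol(a))) {
    for (j in seq_len(ncol(b))) {
      p[i, j] <- cor.test(a[[i]], b[[j]], method = method)$p.value
    }
  }
  list(r = r, p = p)
}

# Simulated stand-in data; replace `a` and `b` with df1 and df2.
set.seed(1)
a <- data.frame(a1 = rnorm(15), a2 = rnorm(15))
b <- data.frame(b1 = a$a1 + rnorm(15, sd = 0.1), b2 = rnorm(15), b3 = rnorm(15))

res <- cross_cor(a, b)
res$r   # 2 x 3 matrix of correlation coefficients
res$p   # 2 x 3 matrix of unadjusted p-values
```

Calling cross_cor(df1, df2) on the question's data would return two 4 x 4 matrices, with df1 columns as rows and df2 columns as columns.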

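You can get adjusted p-values with the rcorr.adjust function. It comes from the RcmdrMisc package rather than Hmisc, takes the data itself rather than an rcorr object, and adjusts the p-values with Holm's method:

library(RcmdrMisc)

rcorr.adjust(as.matrix(df1), type = "pearson")
rcorr.adjust(as.matrix(df2), type = "pearson")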
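The question also asks for p-values based on a chosen number of permutations. A minimal base-R sketch of one way to do this is shown here, assuming a two-sided test and a Holm adjustment through p.adjust. The helper perm_p, the table name tab, and the simulated data frames a and b are hypothetical; substitute df1 and df2 for a and b.

```r
# Hypothetical helper: two-sided permutation p-value for one pair of vectors.
perm_p <- function(x, y, n_perm = 1000) {
  r_obs <- cor(x, y)
  # Shuffle y while keeping x fixed, n_perm times.
  r_perm <- replicate(n_perm, cor(x, sample(y)))
  # Add 1 to numerator and denominator so the estimate is never exactly 0.
  (sum(abs(r_perm) >= abs(r_obs)) + 1) / (n_perm + 1)
}

# Simulated stand-in data; replace `a` and `b` with df1 and df2.
set.seed(42)
a <- data.frame(a1 = rnorm(15), a2 = rnorm(15))
b <- data.frame(b1 = a$a1 + rnorm(15, sd = 0.1), b2 = rnorm(15))

# One row per (column of a, column of b) pair.
tab <- expand.grid(a_col = names(a), b_col = names(b),
                   stringsAsFactors = FALSE)
tab$r <- mapply(function(i, j) cor(a[[i]], b[[j]]),
                tab$a_col, tab$b_col, USE.NAMES = FALSE)
tab$p_perm <- mapply(function(i, j) perm_p(a[[i]], b[[j]], n_perm = 2000),
                     tab$a_col, tab$b_col, USE.NAMES = FALSE)
# Adjust across all pairs tested.
tab$p_holm <- p.adjust(tab$p_perm, method = "holm")
tab
```

The long format keeps each pair's r, permutation p-value, and adjusted p-value on one row. The smallest p-value this approach can report is 1 / (n_perm + 1), so n_perm limits the resolution.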