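I have a matrix file that is essentially a Spearman correlation matrix between genes across various cell types. I am trying to find which genes or groups of genes have correlation values greater than, say, 0.6, if I set that as my threshold. How can I do this? I am posting a subset of my data. The full matrix is 502 x 502.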
ACTL6B ACTR5 ACTR6
ACTL6B 1 0.6 -0.4
ACTR5 0.4 1 -0.3
ACTR6 -0.4 -0.3 1
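I do not want the correlation of a gene with itself, which is always 1. I want the other comparisons. For example, ACTL6B and ACTR5 have a correlation of 0.6. I want to keep those values together with the gene names.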
Answer 0 (score: 3)
Here is an example:
mat <- cor(longley) # example 7 x 7 correlation matrix
# Find indices of correlations greater than 0.6
idx <- which(mat > 0.6 & lower.tri(mat), arr.ind = TRUE)
# names of the resulting variables
cbind(rownames(idx), colnames(mat)[idx[, 2]])
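Because of lower.tri, all values on the diagonal and in the upper triangle of the matrix are ignored, so self-correlations are dropped and each pair is reported only once.
Result: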
[,1] [,2]
[1,] "GNP" "GNP.deflator"
[2,] "Unemployed" "GNP.deflator"
[3,] "Population" "GNP.deflator"
[4,] "Year" "GNP.deflator"
[5,] "Employed" "GNP.deflator"
[6,] "Unemployed" "GNP"
[7,] "Population" "GNP"
[8,] "Year" "GNP"
[9,] "Employed" "GNP"
[10,] "Population" "Unemployed"
[11,] "Year" "Unemployed"
[12,] "Year" "Population"
[13,] "Employed" "Population"
[14,] "Employed" "Year"
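The example above returns only the variable names, while the question also asks to keep the correlation values. Below is a minimal sketch of one way to do that, using a toy 3 x 3 matrix modeled on the genes in the question. The matrix values are illustrative assumptions: the sample posted in the question is not symmetric (0.6 in one triangle, 0.4 in the other), so the ACTL6B/ACTR5 entry is set to 0.6 on both sides here. The names threshold and pairs are hypothetical, and >= is used instead of > so that a pair sitting exactly at 0.6 is kept.

```r
# Toy symmetric correlation matrix (illustrative values, not the real data)
genes <- c("ACTL6B", "ACTR5", "ACTR6")
mat <- matrix(c( 1.0,  0.6, -0.4,
                 0.6,  1.0, -0.3,
                -0.4, -0.3,  1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(genes, genes))

threshold <- 0.6  # assumed cutoff taken from the question

# lower.tri(mat) is TRUE only strictly below the diagonal, so
# self-correlations are skipped and each pair is visited once
idx <- which(mat >= threshold & lower.tri(mat), arr.ind = TRUE)

# Keep both gene names and the correlation value for each pair;
# mat[idx] uses the two-column index matrix to pull the matching cells
pairs <- data.frame(gene1 = rownames(mat)[idx[, "row"]],
                    gene2 = colnames(mat)[idx[, "col"]],
                    correlation = mat[idx],
                    stringsAsFactors = FALSE)
print(pairs)
```

With this toy input, pairs should contain a single row for ACTR5 and ACTL6B with a correlation of 0.6. The same steps should apply unchanged to the full 502 x 502 matrix once it is loaded as a numeric matrix with gene names as row and column names.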