Question

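I have two matrices, one of SNP genotypes and one of CNV genotypes, and I want to find the correlation between all SNP-CNV pairs. That is, for each row in file 1, I want the correlation with every row in file 2.

The files look like this:

File 1: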

snp1    2   0   2
snp2    2   1   2
snp3    2   1   2

File 2:

cnv1    2   1   2
cnv2    2   2   2
cnv3    2   2   1

So far I have been using R. This is the approach I have been trying:

snps <- read.table("snpsresults.txt", header=T, sep ="\t")
cnvs <- read.table("cnvs_genotypes2-9.txt", header=T, sep = "\t")
attach(snps)
attach(cnvs)

A <- as.matrix(t(snps))
B <- as.matrix(t(cnvs))

corr.matrix <- cor(A,B,use="pairwise.complete.obs", method="pearson")
write.table(corr.matrix, file="rs-values.txt")

The expected output would be a table of all the r values for each SNP.

snp1 -0.5 snp2 -0.5

Thanks for your help!

Answer 1

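It's not entirely clear what you are asking, so if you edit and clarify your question you will get better help. I assume you want to compute the correlations between SNPs and CNVs by row (cor() works on columns by default). In principle all you need is: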

cor(t(snp), t(cnv))

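This runs into trouble with your data because cnv2 is constant: to compute a correlation you first center each variable (fine) and then standardize it (not possible here, since the standard deviation is zero), so cor() returns NA for that row with a warning. So I have slightly changed your example data here: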

snp = matrix(c(2,2,2, 0,1,1, 2,2,2), 3, 3)
cnv = matrix(c(2,3,2, 1,2,2, 2,2,1), 3, 3)

Now you get:

> snp
     [,1] [,2] [,3]
[1,]    2    0    2
[2,]    2    1    2
[3,]    2    1    2
> cnv
     [,1] [,2] [,3]
[1,]    2    1    2
[2,]    3    2    2
[3,]    2    2    1
> cor(t(snp), t(cnv))
     [,1] [,2] [,3]
[1,]    1  0.5 -0.5
[2,]    1  0.5 -0.5
[3,]    1  0.5 -0.5

Note that the result is the same for every row, because after centering and standardizing, the SNP rows are all identical.
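
As a rough sketch of the points above, the following R snippet reuses the example matrices, checks that the standardized SNP rows really are identical, and flattens the correlation matrix into one line per SNP-CNV pair, which is closer to the table asked for in the question. The row names and the variable names z, all_same, pair_table and const_r are assumptions added for illustration; they are not part of the original data or code.

```r
# Example data from the answer, with row names added for readability
# (the row names are an illustrative assumption).
snp <- matrix(c(2, 2, 2,  0, 1, 1,  2, 2, 2), 3, 3,
              dimnames = list(paste0("snp", 1:3), NULL))
cnv <- matrix(c(2, 3, 2,  1, 2, 2,  2, 2, 1), 3, 3,
              dimnames = list(paste0("cnv", 1:3), NULL))

# scale() works on columns, so transpose first: each column of z is one
# SNP row after centering and dividing by its standard deviation.
z <- scale(t(snp))
all_same <- isTRUE(all.equal(z[, 1], z[, 2])) &&
  isTRUE(all.equal(z[, 2], z[, 3]))

# After t(), the original rows are columns, so cor() returns a
# SNP-by-CNV matrix of Pearson correlations.
r <- cor(t(snp), t(cnv))

# Flatten the matrix into one row per SNP-CNV pair.
pair_table <- data.frame(
  snp = rownames(r)[as.vector(row(r))],
  cnv = colnames(r)[as.vector(col(r))],
  r   = as.vector(r)
)
print(pair_table)

# With the question's original data, cnv2 = (2, 2, 2) has zero standard
# deviation, so cor() yields NA for it (and emits a warning).
const_r <- suppressWarnings(cor(c(2, 0, 2), c(2, 2, 2)))
```

If a file is wanted, pair_table could be passed to write.table(); rows with zero variance would need to be dropped or handled separately, since their correlations come out as NA.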
