How do I create a matrix from a specific mathematical equation?

Asked: 2017-03-10 02:17:13

Tags: r math matrix sparse-matrix matrix-multiplication

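I have a huge dataset. It consists of 371 genotypes (columns whose names start with gwas) and 105,000 markers. In R, I need to build a genotype-by-genotype matrix from a specific mathematical equation applied across all 105,000 markers. The data format is as follows: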

markers  gwas_100   gwas_101    gwas_102    gwas_103
S1_147748   NA  NA  NA  NA
S1_239131   0.67385 0.67385 0.67385 0.67385
S1_644966   0.61051 0.61051 0.61051 0.61051
S1_1625764  NA  0.71429 NA  0.71429
S1_1761929  0.69137 0.69137 0.69137 0.69137
S1_1778021  0.72372 0.72372 0.72372 0.72372
S1_1778059  0.72507 0.72507 0.72507 0.72507
S1_1778136  0.68733 0.68733 0.68733 0.68733
S1_1778289  0.69946 0.69946 0.69946 0.69946
S1_1780669  0.73046 0.73046 0.73046 0.73046
S1_1786636  0.71563 0.71563 0.71563 0.71563
S1_1786639  0.71833 0.71833 0.71833 0.71833
S1_1786640  0.71294 0.71294 0.71294 0.71294
S1_1786678  0.71429 0.71429 0.71429 0.71429
S1_1963487  0.72776 0.72776 0.72776 0.72776
S1_2036329  0.74259 0.74259 0.74259 0.74259
S1_2036386  0.74394 0.74394 0.74394 0.74394
S1_2037735  0.7628  0.7628  0.7628  0.7628
S1_2037760  0.7628  0.7628  0.7628  0.7628
S1_2037773  0.7628  0.7628  0.7628  0.7628
S1_2042132  0.58491 NA  NA  NA

The equation:

(gwas_100 & gwas_101) = sum(gwas_100) - sum(gwas_101), where
sum(gwas_100) = 0.67385 + 0.61051 + 0.69137 + ... + 0.58491
sum(gwas_101) = 0.67385 + 0.61051 + ... + 0.7628, therefore
(gwas_100 & gwas_101) = 13.49056 - 13.61994 = -0.12938
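
As a quick check of the arithmetic above, here is a minimal sketch in R. It assumes NA entries are simply skipped, and the values are typed in by hand from the sample rows shown; the names `shared`, `sum_100` and `sum_101` are only illustrative.

```r
# Values that gwas_100 and gwas_101 have in common in the sample
# (the 18 rows where neither column is NA), typed in by hand.
shared <- c(0.67385, 0.61051, 0.69137, 0.72372, 0.72507, 0.68733,
            0.69946, 0.73046, 0.71563, 0.71833, 0.71294, 0.71429,
            0.72776, 0.74259, 0.74394, 0.7628, 0.7628, 0.7628)

# gwas_100 additionally has 0.58491 (marker S1_2042132);
# gwas_101 additionally has 0.71429 (marker S1_1625764).
sum_100 <- sum(shared, 0.58491)
sum_101 <- sum(shared, 0.71429)

# Difference of the two column sums, approximately -0.12938.
sum_100 - sum_101
```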

I then need this value for every pair of genotypes, as a matrix covering all possible combinations of the 371 genotypes. For example:

          gwas_100 gwas_101 gwas_102 gwas_103
gwas_100             -0.12     0.14     0.05
gwas_101                       0.06     0.1
gwas_102                                0.07
gwas_103

Thanks in advance.

1 Answer:

Answer 0 (score: 1)

You can first use colSums to sum each column while ignoring NAs, then use outer to subtract the sums pairwise:

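# Sum each genotype column over all markers, skipping NA; data[-1] drops the markers column
sums <- colSums(data[-1], na.rm=TRUE)
# Pairwise differences: element [i, j] is sums[i] - sums[j]
outer(sums,sums,`-`)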
         gwas_100 gwas_101 gwas_102 gwas_103
gwas_100  0.00000 -0.12938  0.58491 -0.12938
gwas_101  0.12938  0.00000  0.71429  0.00000
gwas_102 -0.58491 -0.71429  0.00000 -0.71429
gwas_103  0.12938  0.00000  0.71429  0.00000
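
If only the upper triangle is wanted, as in the layout sketched in the question, one possible follow-up is to blank out the rest with lower.tri. The sketch below is self-contained and makes some assumptions: it rebuilds a four-row sample of the question's data with read.table, and the name `d` is just illustrative.

```r
# Small sample in the same layout as the question (first column = marker IDs).
data <- read.table(text = "
markers gwas_100 gwas_101 gwas_102 gwas_103
S1_147748 NA NA NA NA
S1_239131 0.67385 0.67385 0.67385 0.67385
S1_1625764 NA 0.71429 NA 0.71429
S1_2042132 0.58491 NA NA NA
", header = TRUE, stringsAsFactors = FALSE)

# Column sums over markers, skipping NA; data[-1] drops the marker column.
sums <- colSums(data[-1], na.rm = TRUE)

# d[i, j] = sums[i] - sums[j] for every pair of genotypes.
d <- outer(sums, sums, `-`)

# Keep only the upper triangle: set the diagonal and everything below it to NA.
d[lower.tri(d, diag = TRUE)] <- NA
d
```

Because the matrix is antisymmetric (d[j, i] = -d[i, j]), the upper triangle carries all the information. With 371 genotypes the result is only 371 x 371, so the cost is dominated by the single colSums pass over the 105,000 markers.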