
Question

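I'm sure this question has already been answered somewhere, but I don't think I've been using the right search terms.

Here's my problem. I have multiple matrices (I'll simplify to two here), where each row is a uniquely labeled individual (some of which are shared between matrices, and some of which are not), and the column headers are common to all matrices.

For example: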

first<-matrix(rbinom(20,1,.5),4,5)
first[,1]=c(122,145,186,199)
colnames(first)<-c("ID",901,902,903,904)
first
      ID 901 902 903 904
[1,] 122   1   0   0   0
[2,] 145   0   0   0   1
[3,] 186   0   0   1   1
[4,] 199   1   0   0   0

second<-matrix(rbinom(30,1,.5),6,5)
second[,1]=c(122,133,142,151,186,199)
colnames(second)<-c("ID",901,902,903,904)
second
      ID 901 902 903 904
[1,] 122   0   1   1   1
[2,] 133   0   0   0   1
[3,] 142   1   1   0   1
[4,] 151   0   1   0   0
[5,] 186   1   0   1   1
[6,] 199   1   0   0   0

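I want to add 'first' and 'second' together based on 'ID' and column name. This should produce a matrix with 7 rows (because there are 4 IDs in the 'first' matrix, and 3 new IDs plus 3 old IDs in the 'second' matrix: "122,133,142,145,151,186,199") and the same number of columns.

In this example, the result I want is: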

      ID 901 902 903 904
[1,] 122   1   1   1   1
[2,] 133   0   0   0   1
[3,] 142   1   1   0   1
[4,] 145   0   0   0   1
[5,] 151   0   1   0   0
[6,] 186   1   0   2   2
[7,] 199   2   0   0   0

Answer 1

Original answer

Building on @ryogi's approach, and assuming you use rownames and colnames to describe your matrices, I suggest the following:

res <- rbind(first,second)
res <- tapply(res, expand.grid(dimnames(res)), sum)

All rows with the same rownames will be added together.
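To make the tapply step concrete, here is a minimal sketch on two tiny matrices. The names a, b and summed, and the values in them, are made up for illustration and are not from the question.

```r
# Two small matrices identified by rownames; "122" appears in both.
a <- matrix(c(1, 0,
              0, 1), nrow = 2, byrow = TRUE,
            dimnames = list(c("122", "145"), c("901", "902")))
b <- matrix(c(1, 1,
              1, 0), nrow = 2, byrow = TRUE,
            dimnames = list(c("122", "133"), c("901", "902")))

# Stack the rows; the row name "122" now occurs twice.
res <- rbind(a, b)

# expand.grid() yields one (row name, column name) pair per cell, in the same
# column-major order as the matrix, so tapply() sums cells sharing both names.
summed <- tapply(res, expand.grid(dimnames(res)), sum)

summed["122", ]  # the two "122" rows added together
```

Looking entries up by name, as in the last line, avoids relying on the order in which the rows come back.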

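When using data frames

If your input is a data.frame, the above will not work, because a data.frame must not contain any duplicate row names. An alternative approach works for these as well: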

rowsum(rbind(first, second), c(rownames(first), rownames(second)))

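This approach works for matrices as well. Since it only takes a single line, you might consider it simpler. I guess it might be more efficient as well, since it is more specialized than tapply. You can adapt this solution to the data format from your question, where the identifiers are in a separate column: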

rowsum(rbind(first, second)[,-1], c(first[,1], second[,1]))

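Note that the result will still have named rows, rather than a column containing those names.

Funnily enough, I stumbled upon rowsum by accident while looking up rowSums for a comparatively complicated approach to the data.frame version of this problem. Lucky me.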
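As a sketch of how the rowsum call could reproduce the exact layout asked for in the question, the example below uses the values printed there and copies the row names back into an ID column. The variable names mats, stacked, sums and result are illustrative, and collecting the matrices in a list is just one assumed way to handle more than two of them.

```r
cols <- c("ID", "901", "902", "903", "904")

# The two matrices as printed in the question.
first <- matrix(c(122, 1, 0, 0, 0,
                  145, 0, 0, 0, 1,
                  186, 0, 0, 1, 1,
                  199, 1, 0, 0, 0),
                nrow = 4, byrow = TRUE, dimnames = list(NULL, cols))
second <- matrix(c(122, 0, 1, 1, 1,
                   133, 0, 0, 0, 1,
                   142, 1, 1, 0, 1,
                   151, 0, 1, 0, 0,
                   186, 1, 0, 1, 1,
                   199, 1, 0, 0, 0),
                 nrow = 6, byrow = TRUE, dimnames = list(NULL, cols))

# Any number of matrices can be collected in a list and stacked in one go.
mats <- list(first, second)
stacked <- do.call(rbind, mats)

# Sum the data columns per ID; rowsum() sorts the groups by default.
sums <- rowsum(stacked[, -1], group = stacked[, "ID"])

# Turn the row names back into a leading ID column.
result <- cbind(ID = as.numeric(rownames(sums)), sums)
rownames(result) <- NULL
result
```

Because rowsum sorts the groups by default, the rows come out in ascending ID order, matching the desired result shown in the question.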

Other tips

If you find the dimension names Var1 and Var2 in the result confusing, you can remove them using

names(dimnames(res)) <- NULL

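If your data really is in the format you describe, with the row names stored in the first data column, you can turn them into proper row names using the following commands: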

rownames(first) <- first[,1]
first <- first[,-1]

Answer 2

I set the problem up slightly differently:

first <- matrix(rbinom(16,1,.5),4,4)
rownames(first) <- c(122,145,186,199)
colnames(first) <- c(901,902,903,904)

second <- matrix(rbinom(24,1,.5),6,4)
rownames(second) <- c(122,133,142,151,186,199)
colnames(second) <- c(901,902,903,904)

The matrices are now identified by their rownames:

> first
    901 902 903 904
122   1   0   0   1
145   1   0   0   0
186   0   0   1   1
199   1   0   1   1
> second
    901 902 903 904
122   1   1   0   0
133   0   0   1   1
142   1   0   1   0
151   1   0   1   1
186   0   1   0   1
199   0   0   0   0

Now it is easy to perform set operations on the row names:

SumOnID <- function(A, B){
  rnA <- rownames(A)
  rnB <- rownames(B)

  ls.id <- list(ids = intersect(rnA, rnB), # shared row names
                idA = setdiff(rnA, rnB),   # only in A
                idB = setdiff(rnB, rnA))   # only in B

  do.call(rbind,
    lapply(names(ls.id), function(x){
      rn <- ls.id[[x]]  # index with the row names in this group, not the group label
      if (x == "ids") return(A[rn,, drop = FALSE] + B[rn,, drop = FALSE])
      if (x == "idA") return(A[rn,, drop = FALSE])
      if (x == "idB") return(B[rn,, drop = FALSE])
    }))
}

Let's try it out:

> SumOnID(first, second)
    901 902 903 904
122   2   1   0   1
186   0   1   1   2
199   1   0   1   1
145   1   0   0   0
133   0   0   1   1
142   1   0   1   0
151   1   0   1   1
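An alternative to splitting the row names into three groups is to allocate a zero matrix covering the union of the row names and add each input into it. The sketch below is not part of the answer: sum_by_rowname is a hypothetical helper, and it assumes that row names are unique within each matrix and that both matrices have the same columns in the same order. The inputs are the first and second printed above.

```r
# Hypothetical helper, not from the answer. Assumes unique row names within
# each matrix and identical column names in the same order.
sum_by_rowname <- function(A, B) {
  ids <- union(rownames(A), rownames(B))
  out <- matrix(0, nrow = length(ids), ncol = ncol(A),
                dimnames = list(ids, colnames(A)))
  # Character indexing selects the matching rows of the zero matrix.
  out[rownames(A), ] <- out[rownames(A), ] + A
  out[rownames(B), ] <- out[rownames(B), ] + B
  out
}

cols <- c("901", "902", "903", "904")
first <- matrix(c(1, 0, 0, 1,
                  1, 0, 0, 0,
                  0, 0, 1, 1,
                  1, 0, 1, 1),
                nrow = 4, byrow = TRUE,
                dimnames = list(c("122", "145", "186", "199"), cols))
second <- matrix(c(1, 1, 0, 0,
                   0, 0, 1, 1,
                   1, 0, 1, 0,
                   1, 0, 1, 1,
                   0, 1, 0, 1,
                   0, 0, 0, 0),
                 nrow = 6, byrow = TRUE,
                 dimnames = list(c("122", "133", "142", "151", "186", "199"), cols))

total <- sum_by_rowname(first, second)
total
```

The rows come back in union order (all row names of A first, then those found only in B), which differs from the grouping order that SumOnID produces.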

Answer 3

I looked for a solution that uses built-in functions instead of a 'for' loop, but without success. So here is my approach:

set.seed(1) # make it reproducible
first <- matrix(rbinom(20,1,.5),4,5)
first[ ,1] <- c(122, 145, 186, 199)
colnames(first) <- c("ID", 901, 902, 903, 904)

second <- matrix(rbinom(30, 1, .5), 6, 5)
second[ ,1] <- c(122, 133, 142, 151, 186, 199)
colnames(second) <- c("ID", 901, 902, 903, 904)

first

      ID 901 902 903 904
[1,] 122   0   1   1   1
[2,] 145   1   0   0   1
[3,] 186   1   0   1   0
[4,] 199   1   0   0   1

second
      ID 901 902 903 904
[1,] 122   0   0   1   1
[2,] 133   0   0   0   1
[3,] 142   1   1   1   0
[4,] 151   0   1   1   0
[5,] 186   0   1   1   1
[6,] 199   1   0   1   1

## stack them row-wise
mat <- rbind(first, second)

ind <- unique(mat[,"ID"])

result <- matrix(nrow = length(ind), ncol = 5)
result[,1] <- ind

for (i in seq_along(ind)) {
    result[i,-1] <- colSums(mat[mat[ ,"ID"] == ind[i], -1, drop = FALSE])
}
colnames(result) <- colnames(mat)

result
      ID 901 902 903 904
[1,] 122   0   1   2   2
[2,] 145   1   0   0   1
[3,] 186   1   1   2   1
[4,] 199   2   0   1   2
[5,] 133   0   0   0   1
[6,] 142   1   1   1   0
[7,] 151   0   1   1   0
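For readers who prefer to avoid the explicit loop, the same per-ID column sums can be written with vapply. This is only a sketch of one possible variant, not the answer's method; it rebuilds the matrices from the values printed above so that it runs on its own, and the names cols, ids and sums are illustrative.

```r
cols <- c("ID", "901", "902", "903", "904")

# The matrices as printed above (the set.seed(1) draws).
first <- matrix(c(122, 0, 1, 1, 1,
                  145, 1, 0, 0, 1,
                  186, 1, 0, 1, 0,
                  199, 1, 0, 0, 1),
                nrow = 4, byrow = TRUE, dimnames = list(NULL, cols))
second <- matrix(c(122, 0, 0, 1, 1,
                   133, 0, 0, 0, 1,
                   142, 1, 1, 1, 0,
                   151, 0, 1, 1, 0,
                   186, 0, 1, 1, 1,
                   199, 1, 0, 1, 1),
                 nrow = 6, byrow = TRUE, dimnames = list(NULL, cols))

mat <- rbind(first, second)
ids <- unique(mat[, "ID"])  # IDs in order of first appearance

# One column of sums per ID; vapply() checks that each has the expected length.
sums <- vapply(ids,
               function(id) colSums(mat[mat[, "ID"] == id, -1, drop = FALSE]),
               FUN.VALUE = numeric(ncol(mat) - 1))

# Transpose so that each ID becomes a row, then prepend the ID column.
result <- cbind(ids, t(sums))
colnames(result) <- colnames(mat)
result
```

Like the loop, this keeps the IDs in order of first appearance rather than sorting them.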

Adding matrices based on row and column names

3 answers:
