Sum the values of a 2-D table according to labels in R

Time: 2016-06-28 07:36:51

Tags: r

From: Sum the values according to labels in R

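I have been told that working with a 2-dimensional table is quite different from working with a 1-dimensional one. For example, given: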

       a    a,b    a,b,c    c
d      5     2       1      2
d,e    2     1       1      1

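We want to obtain the following, where each value is counted once for every row label and every column label that its headers contain (for example, d/a = 5 + 2 + 1 + 2 + 1 + 1 = 12):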

     a    b    c
d    12   5    5
e    4    2    2

How can this be done in R?

1 answer:

Answer 0: (score: 1)

It is a bit convoluted, but it should work:

# Rebuild the example table from the question as a numeric matrix
m <- as.matrix(data.frame('a'=c(5,2),'a,b'=c(2,1),
                          'a,b,c'=c(1,1),'c'=c(2,1),
                          check.names = FALSE,row.names=c('d','d,e')))
colNamesSplits <- strsplit(colnames(m),',')
rowNamesSplits <- strsplit(rownames(m),',')

colNms <- unique(unlist(colNamesSplits))
rowNms <- unique(unlist(rowNamesSplits))

# For every single label, the index of the original column/row it came from
colIdxs <- rep(seq_along(colNamesSplits), lengths(colNamesSplits))
rowIdxs <- rep(seq_along(rowNamesSplits), lengths(rowNamesSplits))
# For every single label, its position among the unique labels
colIdxsMapped <- match(unlist(colNamesSplits), colNms)
rowIdxsMapped <- match(unlist(rowNamesSplits), rowNms)

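# Create the fully expanded matrix: each row/column is repeated once per label
# in its header (drop = FALSE keeps a matrix even if only one row or column is selected)
expanded <- m[rowIdxs, colIdxs, drop = FALSE]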
rownames(expanded) <- rowNms[rowIdxsMapped]
colnames(expanded) <- colNms[colIdxsMapped]

# Sum the columns that share a label (split() orders the groups alphabetically)
expanded <- do.call(cbind,lapply(split(seq_len(ncol(expanded)),colnames(expanded)),
                 function(ii) rowSums(expanded[,ii,drop=FALSE])))
# Sum the rows that share a label
expanded <- do.call(rbind,lapply(split(seq_len(nrow(expanded)),rownames(expanded)),
                 function(ii) colSums(expanded[ii,,drop=FALSE])))

> expanded
   a b c
d 12 5 5
e  4 2 2
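
As an alternative sketch (not part of the original answer), the same expand-and-sum logic can be expressed as two matrix products with 0/1 indicator matrices. The helper name `label_indicator` is hypothetical and introduced only for illustration, and the code assumes the headers are comma-separated with no surrounding spaces.

```r
# Same example matrix as in the question
m <- matrix(c(5, 2, 1, 2,
              2, 1, 1, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("d", "d,e"), c("a", "a,b", "a,b,c", "c")))

# Hypothetical helper: builds a 0/1 matrix with one row per compound header
# and one column per single label; entry [i, j] is 1 if header i contains label j.
label_indicator <- function(headers) {
  parts  <- strsplit(headers, ",", fixed = TRUE)
  labels <- unique(unlist(parts))
  ind <- matrix(0, nrow = length(headers), ncol = length(labels),
                dimnames = list(headers, labels))
  for (i in seq_along(parts)) {
    ind[i, parts[[i]]] <- 1
  }
  ind
}

row_ind <- label_indicator(rownames(m))  # rows: "d", "d,e"; columns: d, e
col_ind <- label_indicator(colnames(m))  # rows: compound headers; columns: a, b, c

# Each cell of m is added to every (row label, column label) pair it belongs to
result <- t(row_ind) %*% m %*% col_ind
print(result)
```

In this version the labels appear in order of first occurrence rather than alphabetically, which gives the same layout for this example.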