Collapsing a data set by unique ID

Time: 2017-09-12 20:47:37

Tags: r row unique

I want to merge rows that share the same ID. The columns are binary (0/1), so I want to add them together within each ID.

Example before:

 id      nam1      nam2
  1      1         1
  1      0         0
  2      1         0
  2      0         1
  3      1         1
  3      1         0

Example after:

id      nam1      nam2
 1      1         1
 2      1         1
 3      2         1

Any ideas on how to do this?

2 answers:

Answer 0: (score: 4)

From @d.b's answer in the comments:

aggregate(.~id, df, sum)

Or using dplyr:

library(dplyr)

df %>%
  group_by(id) %>%
  summarize_all("sum")

Result:

# A tibble: 3 x 3
     id  nam1  nam2
  <int> <int> <int>
1     1     1     1
2     2     1     1
3     3     2     1

Data:

df = structure(list(id = c(1L, 1L, 2L, 2L, 3L, 3L), nam1 = c(1L, 0L, 
1L, 0L, 1L, 1L), nam2 = c(1L, 0L, 0L, 1L, 1L, 0L)), .Names = c("id", 
"nam1", "nam2"), row.names = c(NA, -6L), class = "data.frame")
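
For anyone who wants to run the base R approach end to end, here is a minimal self-contained sketch. It rebuilds the same sample data with `data.frame()` and spells out the formula: `. ~ id` means "every column other than `id`, grouped by `id`". The `rowsum()` call and the names `collapsed` and `by_rowsum` are additions for illustration, not part of the answer above; `rowsum()` keeps the ids as row names rather than as a column.

```r
# Sample data from the question (repeated ids, 0/1 indicator columns)
df <- data.frame(id   = c(1L, 1L, 2L, 2L, 3L, 3L),
                 nam1 = c(1L, 0L, 1L, 0L, 1L, 1L),
                 nam2 = c(1L, 0L, 0L, 1L, 1L, 0L))

# ". ~ id": sum every non-id column within each id
collapsed <- aggregate(. ~ id, data = df, FUN = sum)
print(collapsed)

# Alternative (an addition, not from the answer): rowsum() sums rows by group
# and stores the group labels as row names
by_rowsum <- rowsum(df[, c("nam1", "nam2")], group = df$id)
print(by_rowsum)
```

Both calls give one row per id. `aggregate()` is the more convenient of the two when `id` should remain an ordinary column in the result.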

Answer 1: (score: 2)

#Sample data:
df <- data.frame(id=c(1,1,2,2,3,3),
                 nam1=c(1,0,1,0,1,1), 
                 nam2=c(1,0,0,1,1,0))

library(data.table)
setDT(df)[, lapply(.SD, sum), by=.(id)]

id nam1 nam2
 1    1    1
 2    1    1
 3    2    1
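
In the data.table call, `.SD` stands for the non-grouping columns of each `id` group, and `lapply(.SD, sum)` sums each of them. To show what that does per group, here is a rough base R split-apply-combine sketch. It is only an illustration of the idea, not how data.table works internally, and the names `value_cols`, `pieces`, `sums` and `result` are made up for this example.

```r
# Same sample data as above
df <- data.frame(id   = c(1, 1, 2, 2, 3, 3),
                 nam1 = c(1, 0, 1, 0, 1, 1),
                 nam2 = c(1, 0, 0, 1, 1, 0))

# Columns to sum: everything except the grouping column
value_cols <- setdiff(names(df), "id")

# Split: one small data frame of value columns per id
pieces <- split(df[value_cols], df$id)

# Apply: column sums within each piece
sums <- lapply(pieces, colSums)

# Combine: stack the per-id sums and restore id as a regular column
result <- data.frame(id = as.numeric(names(sums)),
                     do.call(rbind, sums),
                     row.names = NULL)
print(result)
```

On larger data the data.table one-liner is the shorter and faster route; the sketch only makes the three steps visible.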