Question

I have a dataset in which one column (Cytoband) contains repeated values, while the other columns hold only 0 or 1.

df

          B151452 B160128 B160363     Cytoband
A1BG            1       0       0            A
A1BG-AS1        1       0       0            A
AURKC           0       1       0            B
C19orf18        0       0       0            B
CENPBD1P1       0       1       0            B
CHMP2A          0       1       1            B

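I want to merge the rows that share the same value in the Cytoband column. For each of the other columns, the merged row should hold 1 if a 1 appears at least once within that Cytoband, and 0 otherwise.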

          B151452 B160128 B160363     Cytoband
1               1       0       0            A
2               0       1       1            B

Answer 1

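We can use aggregate with max as the FUN. Because the values are only 0 or 1, the group maximum is 1 exactly when at least one row in that Cytoband has a 1.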

aggregate(.~ Cytoband, df, max)

-output

#   Cytoband B151452 B160128 B160363
#1        A       1       0       0
#2        B       0       1       1
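
As a cross-check on the aggregate call above, here is a minimal base R sketch that reaches the same result a different way: summing each column within a Cytoband with rowsum and flagging sums greater than 0. This is an alternative technique, not part of the original answer. The names value_cols, res_max, sums and res_any are illustrative, and the data frame is rebuilt inline so the block runs on its own.

```r
# Sample data matching the question (gene rows, 0/1 sample columns)
df <- data.frame(
  B151452 = c(1L, 1L, 0L, 0L, 0L, 0L),
  B160128 = c(0L, 0L, 1L, 0L, 1L, 1L),
  B160363 = c(0L, 0L, 0L, 0L, 0L, 1L),
  Cytoband = c("A", "A", "B", "B", "B", "B"),
  row.names = c("A1BG", "A1BG-AS1", "AURKC", "C19orf18", "CENPBD1P1", "CHMP2A")
)

# Approach from the answer: group maximum per Cytoband
res_max <- aggregate(. ~ Cytoband, df, max)

# Alternative sketch: sum within each Cytoband, then flag sums > 0.
# The unary + turns the logical matrix into 0/1 integers.
value_cols <- setdiff(names(df), "Cytoband")
sums <- rowsum(df[value_cols], df$Cytoband)
res_any <- +(sums > 0)

res_max
res_any
```

Note that res_any is an integer matrix with the Cytoband values as row names, whereas aggregate returns a data frame with Cytoband as a regular column. In both cases the original gene row names are dropped, since several genes collapse into one row.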

data

df <- structure(list(B151452 = c(1L, 1L, 0L, 0L, 0L, 0L), B160128 = c(0L, 
0L, 1L, 0L, 1L, 1L), B160363 = c(0L, 0L, 0L, 0L, 0L, 1L), Cytoband = c("A", 
"A", "B", "B", "B", "B")), class = "data.frame", row.names = c("A1BG", 
"A1BG-AS1", "AURKC", "C19orf18", "CENPBD1P1", "CHMP2A"))

Answer 2

You can check, for each Cytoband, whether any value in a column is 1.

library(dplyr)
# For each Cytoband, set a column to 1 if any of its values is 1, else 0
df %>% group_by(Cytoband) %>% summarise(across(everything(), ~ +any(. == 1)))

# Cytoband B151452 B160128 B160363
#  <chr>      <int>   <int>   <int>
#1 A              1       0       0
#2 B              0       1       1
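
To make the any-based summary above easier to follow, here is a small sketch of what happens to a single column within one group. The vector x is a made-up example, not data from the question. any() returns a logical value, and the unary + converts it to an integer so the result stays in 0/1 form.

```r
# Hypothetical values of one column within a single Cytoband group
x <- c(0L, 1L, 0L)

has_one <- any(x == 1)  # logical: is there at least one 1?
flag <- +has_one        # unary + converts TRUE/FALSE to 1L/0L

# A group with no 1s collapses to 0
flag_none <- +any(c(0L, 0L) == 1)
```

For 0/1 data this gives the same value as max(x), which is why both answers produce the same table.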
