group_by() and n() give different results (paste0)

Time: 2019-12-18 07:48:04

Tags: r dplyr

I want to attach a count tag to an ID. To do that, I group my data and add a count tag whenever a group has more than one row. The count comes from dplyr::n(). My code works if n() is used outside paste0() and does not work if it is used inside. What is the reason for the different results?

n() outside paste0() -> correct result

tibble::tibble(Group = c("A","A","B","C"),
               ID = c("ID_1", "ID_2", "ID_10", "ID_20")) %>% 
  dplyr::group_by(Group) %>% 
  dplyr::mutate(n = dplyr::n(),
                tag = ifelse(n > 1, 
                             paste0(ID, " #", dplyr::row_number()),
                             ID)) %>% 
  dplyr::ungroup()

# A tibble: 4 x 4
  Group ID        n tag    
  <chr> <chr> <int> <chr>  
1 A     ID_1      2 ID_1 #1
2 A     ID_2      2 ID_2 #2
3 B     ID_10     1 ID_10  
4 C     ID_20     1 ID_20  

n() inside paste0() -> wrong result (both tags are ID_1 #1)

tibble::tibble(Group = c("A","A","B","C"),
               ID = c("ID_1", "ID_2", "ID_10", "ID_20")) %>% 
  dplyr::group_by(Group) %>% 
  dplyr::mutate(tag = ifelse(dplyr::n() > 1, 
                             paste0(ID, " #", dplyr::row_number()),
                             ID)) %>% 
  dplyr::ungroup()

# A tibble: 4 x 3
  Group ID    tag    
  <chr> <chr> <chr>  
1 A     ID_1  ID_1 #1
2 A     ID_2  ID_1 #1
3 B     ID_10 ID_10  
4 C     ID_20 ID_20  

1 answer:

Answer 0: (score: 1)

Because the condition n() > 1 has length 1, and ifelse returns a vector of the same length as the condition it tests. You can use if / else here instead:

tibble::tibble(Group = c("A","A","B","C"),
               ID = c("ID_1", "ID_2", "ID_10", "ID_20")) %>% 
  dplyr::group_by(Group) %>% 
  # n() is a single value per group, so a plain if / else selects the whole branch
  dplyr::mutate(tag = if (dplyr::n() > 1) paste0(ID, " #", dplyr::row_number()) 
                      else ID) %>% 
  dplyr::ungroup()

# A tibble: 4 x 3
#  Group ID    tag    
#  <chr> <chr> <chr>  
#1 A     ID_1  ID_1 #1
#2 A     ID_2  ID_2 #2
#3 B     ID_10 ID_10  
#4 C     ID_20 ID_20  

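In the first attempt, the condition (n > 1) has length 2 for the first group (Group == A), because n is a column with one value per row. In the second case the condition is n() > 1, which has length 1, so only one value (ID_1 #1) is generated and then recycled to the other rows of the group.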
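The same behaviour can be reproduced in base R without dplyr. This is only a minimal sketch: `ids` and `n_rows` are illustrative names that stand in for the ID column and the value of n() within group A, and are not part of the question's code.

```r
# Stand-ins for one group of two rows (hypothetical names, not from the question)
ids <- c("ID_1", "ID_2")
n_rows <- length(ids)  # plays the role of dplyr::n() for this group

# Length-1 condition: ifelse() returns a result of length 1
short <- ifelse(n_rows > 1, paste0(ids, " #", seq_along(ids)), ids)
short
# [1] "ID_1 #1"

# Condition with one element per row: ifelse() returns one result per row
full <- ifelse(rep(n_rows, length(ids)) > 1,
               paste0(ids, " #", seq_along(ids)),
               ids)
full
# [1] "ID_1 #1" "ID_2 #2"

# Plain if / else tests a single value and returns the whole chosen branch
whole <- if (n_rows > 1) paste0(ids, " #", seq_along(ids)) else ids
whole
# [1] "ID_1 #1" "ID_2 #2"
```

Inside mutate(), the length-1 result from the first call is what gets recycled across the rows of the group, which is why both rows of group A end up tagged ID_1 #1.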