Create an index column by date and ID group

Time: 2020-04-07 04:49:10

Tags: r indexing group-by dplyr

R / Stack Overflow newbie here.

I have a dataset that looks like this:

    year mother_id   nest_seq incubation_date
   <int> <chr>          <int> <date>         
 1  1994 543762-MMM         1 1994-10-16     
 2  1994 543762-MMM         3 1994-11-06     
 3  1994 543762-MMM         4 1994-11-24     
 4  1994 543762-MMM         6 1994-12-05     
 5  1995 583809-mGMW        4 1994-10-24     
 6  1995 583809-mGMW        7 1994-11-21     
 7  1995 583809-mGMW        8 1994-12-22     
 8  1996 596751-BWM         1 1994-11-20     
 9  1996 596751-BWM         2 1994-12-23     
10  1996 626691-GBW         2 1994-11-08

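I simply want to generate a new nest sequence (new_nest_seq) within each year and mother_id, numbered in order of incubation_date, for example: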

    year mother_id   nest_seq incubation_date new_nest_seq
   <int> <chr>          <int> <date>                 <int>
 1  1994 543762-MMM         1 1994-10-16                 1
 2  1994 543762-MMM         3 1994-11-06                 2
 3  1994 543762-MMM         4 1994-11-24                 3
 4  1994 543762-MMM         6 1994-12-05                 4

I have been trying to do this with if_else(), but I am stuck:

group_by(year, mother_id) %>%
mutate(new_nest_seq = if_else(min(incubation_date), 1, ?)))

I would really appreciate any suggestions.

1 answer:

Answer 0 (score: 0)

Try this:

library(dplyr)

df %>%
  # Just in case: order 
  arrange(year, mother_id, incubation_date) %>% 
  group_by(year, mother_id) %>%
  # Create new index
  mutate(
    new_nest_seq = 1, 
    new_nest_seq = cumsum(new_nest_seq)) %>% 
  ungroup()
#> # A tibble: 10 x 5
#>     year mother_id   nest_seq incubation_date new_nest_seq
#>    <int> <fct>          <int> <fct>                  <dbl>
#>  1  1994 543762-MMM         1 1994-10-16                 1
#>  2  1994 543762-MMM         3 1994-11-06                 2
#>  3  1994 543762-MMM         4 1994-11-24                 3
#>  4  1994 543762-MMM         6 1994-12-05                 4
#>  5  1995 583809-mGMW        4 1994-10-24                 1
#>  6  1995 583809-mGMW        7 1994-11-21                 2
#>  7  1995 583809-mGMW        8 1994-12-22                 3
#>  8  1996 596751-BWM         1 1994-11-20                 1
#>  9  1996 596751-BWM         2 1994-12-23                 2
#> 10  1996 626691-GBW         2 1994-11-08                 1
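
Note that cumsum() over a column of 1 yields a double (the <dbl> in the output above), whereas the desired output in the question shows <int>. If an integer index matters, one option is to accumulate integer ones instead. This is only a minimal sketch: `sample_df` and `int_indexed` are hypothetical names, and the three rows are a subset of the question's data chosen for illustration.

```r
library(dplyr)

# Hypothetical three-row subset of the question's data (illustration only)
sample_df <- tibble(
  year = c(1994L, 1994L, 1996L),
  mother_id = c("543762-MMM", "543762-MMM", "626691-GBW"),
  incubation_date = as.Date(c("1994-10-16", "1994-11-06", "1994-11-08"))
)

int_indexed <- sample_df %>%
  arrange(year, mother_id, incubation_date) %>%
  group_by(year, mother_id) %>%
  # cumsum() of integer ones stays integer, so the new column is <int>
  mutate(new_nest_seq = cumsum(rep(1L, n()))) %>%
  ungroup()

int_indexed
```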

Or via dplyr::row_number():

df %>%
  # Just in case: order 
  arrange(year, mother_id, incubation_date) %>% 
  group_by(year, mother_id) %>%
  # Create new index
  mutate(new_nest_seq = dplyr::row_number()) %>% 
  ungroup()
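
As a variation, row_number() also accepts a vector to rank by, so the index can be derived from incubation_date directly without relying on the rows having been sorted first. The sketch below assumes incubation_date is a real Date column (as in the question's tibble) and rebuilds the question's ten rows by hand; the names `df` and `indexed` are only illustrative.

```r
library(dplyr)

# The ten rows from the question, typed in by hand for a self-contained example
df <- tibble(
  year = c(rep(1994L, 4), rep(1995L, 3), rep(1996L, 3)),
  mother_id = c(rep("543762-MMM", 4), rep("583809-mGMW", 3),
                rep("596751-BWM", 2), "626691-GBW"),
  nest_seq = c(1L, 3L, 4L, 6L, 4L, 7L, 8L, 1L, 2L, 2L),
  incubation_date = as.Date(c(
    "1994-10-16", "1994-11-06", "1994-11-24", "1994-12-05",
    "1994-10-24", "1994-11-21", "1994-12-22",
    "1994-11-20", "1994-12-23", "1994-11-08"
  ))
)

indexed <- df %>%
  group_by(year, mother_id) %>%
  # row_number(x) ranks by x within each group; ties go to the earlier row
  mutate(new_nest_seq = row_number(incubation_date)) %>%
  ungroup()

indexed
```

Unlike the arrange() version, this keeps the original row order and only adds the column, which may or may not be what you want.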

Created on 2020-04-07 by the reprex package (v0.3.0)