Add specific new rows with dplyr based on some conditions in R

Time: 2018-12-21 23:09:02

Tags: r dplyr

My df is shown below, and I want to add a new row for each combination of ID and semester_num. So far, with dplyr, I have:

df %>%
  group_by(ID, semester_num) %>%
  # add new row here

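I want the new row to hold the same values as the row before it, except that the third column (subject_result2) should take the value of the fourth column (Success) from that previous row.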

df <- tibble::tribble(
      ~ID, ~semester_num,   ~subject_result2,    ~Success,
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  200000L,             1, "OTHERPassedTerm2", "fail",
  200000L,             1, "MATH1PassedTerm2", "fail",
  200000L,             2, "MATH1PassedTerm2", "fail",
  200000L,             2, "OTHERPassedTerm2", "fail"
  )

Expected result (>> marks the newly added rows):

          ~ID, ~semester_num,   ~subject_result2,    ~Success,
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
 >>   100000L,             1, "Grad_ENSC",        "Grad_ENSC",
      100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
      100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
 >>   100000L,             2, "Grad_ENSC",        "Grad_ENSC",
      200000L,             1, "OTHERPassedTerm2", "fail",
      200000L,             1, "MATH1PassedTerm2", "fail",
 >>   200000L,             1, "fail",             "fail",
      200000L,             2, "MATH1PassedTerm2", "fail",
      200000L,             2, "OTHERPassedTerm2", "fail",
 >>   200000L,             2, "fail",             "fail"

Please help me implement this in R. (Solutions using other packages are fine too.)

1 answer:

Answer 0: (score: 2)

You can do this by combining do with tibble::add_row. I based this answer on the answers to the following question: Add row in each group using dplyr and add_row(), in particular @JasonWang's comment.

library(dplyr)  # provides %>% and do()

df %>%
    dplyr::group_by(ID, semester_num) %>%
    do(tibble::add_row(.,
                       ID = .$ID[1],
                       semester_num = .$semester_num[1],
                       subject_result2 = .$Success[nrow(.)], # Success value from the last row of the group
                       Success = .$Success[nrow(.)]))

# A tibble: 14 x 4
# Groups:   ID, semester_num [4]
       ID semester_num subject_result2  Success  
    <int>        <dbl> <chr>            <chr>    
 1 100000            1 OTHERPassedTerm1 Grad_ENSC
 2 100000            1 OTHERPassedTerm1 Grad_ENSC
 3 100000            1 OTHERPassedTerm1 Grad_ENSC
 4 100000            1 Grad_ENSC        Grad_ENSC
 5 100000            2 MATH1PassedTerm1 Grad_ENSC
 6 100000            2 OTHERPassedTerm1 Grad_ENSC
 7 100000            2 OTHERPassedTerm1 Grad_ENSC
 8 100000            2 Grad_ENSC        Grad_ENSC
 9 200000            1 OTHERPassedTerm2 fail     
10 200000            1 MATH1PassedTerm2 fail     
11 200000            1 fail             fail     
12 200000            2 MATH1PassedTerm2 fail     
13 200000            2 OTHERPassedTerm2 fail     
14 200000            2 fail             fail  

Normally tibble::add_row cannot be used with grouped data frames, but by wrapping it in do we can apply it to each group separately without leaving the pipe.
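As a possible alternative, recent dplyr versions mark do() as superseded, and group_modify() can play the same role. The sketch below is not part of the original answer; it assumes a dplyr version that provides group_modify() and uses dplyr::last() to pick the final Success value in each group. Inside group_modify(), .x holds the group's rows without the grouping columns, so ID and semester_num do not need to be set by hand; they are added back automatically.

```r
library(dplyr)
library(tibble)

# Same data as in the question
df <- tribble(
      ~ID, ~semester_num,   ~subject_result2,    ~Success,
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  200000L,             1, "OTHERPassedTerm2", "fail",
  200000L,             1, "MATH1PassedTerm2", "fail",
  200000L,             2, "MATH1PassedTerm2", "fail",
  200000L,             2, "OTHERPassedTerm2", "fail"
)

result <- df %>%
  group_by(ID, semester_num) %>%
  # .x is the current group's rows (grouping columns excluded);
  # append one row whose two remaining columns both take the last Success value
  group_modify(~ add_row(.x,
                         subject_result2 = last(.x$Success),
                         Success = last(.x$Success))) %>%
  ungroup()

print(result)
```

The trailing ungroup() is optional; it only drops the grouping so that later steps operate on the whole table.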