NA

Time: 2017-10-12 05:53:30

Tags: r dplyr lubridate

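The data below contain two general groups (GenIndID), each with multiple observations, some of which have NA in the DLA field. The DLA date is the same for all records within a group. How can I expand the DLA values to "fill in" the NA values with the corresponding date? I am working within dplyr and suspect it has a solution that I have not been able to find. These data are a small subset of a larger dataset with ~5k rows and ~500 individuals. Many thanks.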

dat <- structure(list(GenIndID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("BHS_106", 
"BHS_164"), class = "factor"), IndID = structure(c(1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 3L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 7L, 
8L), .Label = c("BHS_106_A", "BHS_106_B", "BHS_106_C", "BHS_106_D", 
"BHS_164_A", "BHS_164_B", "BHS_164_C", "BHS_164_D"), class = "factor"), 
    DLA = structure(c(1507010400, 1507010400, 1507010400, 1507010400, 
    1507010400, 1507010400, 1507010400, 1507010400, NA, NA, 1499061600, 
    1499061600, 1499061600, 1499061600, 1499061600, 1499061600, 
    1499061600, NA, NA, NA), tzone = "", class = c("POSIXct", 
    "POSIXt"))), .Names = c("GenIndID", "IndID", "DLA"), row.names = c(411L, 
412L, 413L, 414L, 415L, 416L, 417L, 418L, 419L, 420L, 442L, 443L, 
444L, 445L, 446L, 447L, 448L, 449L, 450L, 451L), class = "data.frame")

> dat
    GenIndID     IndID        DLA
411  BHS_106 BHS_106_A 2017-10-03
412  BHS_106 BHS_106_A 2017-10-03
413  BHS_106 BHS_106_A 2017-10-03
414  BHS_106 BHS_106_A 2017-10-03
415  BHS_106 BHS_106_B 2017-10-03
416  BHS_106 BHS_106_B 2017-10-03
417  BHS_106 BHS_106_B 2017-10-03
418  BHS_106 BHS_106_B 2017-10-03
419  BHS_106 BHS_106_C       <NA>
420  BHS_106 BHS_106_D       <NA>
442  BHS_164 BHS_164_A 2017-07-03
443  BHS_164 BHS_164_A 2017-07-03
444  BHS_164 BHS_164_A 2017-07-03
445  BHS_164 BHS_164_A 2017-07-03
446  BHS_164 BHS_164_A 2017-07-03
447  BHS_164 BHS_164_A 2017-07-03
448  BHS_164 BHS_164_A 2017-07-03
449  BHS_164 BHS_164_B       <NA>
450  BHS_164 BHS_164_C       <NA>
451  BHS_164 BHS_164_D       <NA>

1 answer:

Answer 0: (score: 0)

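We need to group by 'GenIndID' and then use fill (from the tidyr package). Because the NA values are at the bottom of each group and the default is .direction = 'down', we do not need to specify the direction.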

library(dplyr)
library(tidyr)  # fill() lives in tidyr, not dplyr

dat %>%
  group_by(GenIndID) %>% 
  fill(DLA)
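
The answer above relies on the NA values sitting at the bottom of each group. If that ordering is not guaranteed in the full dataset, one option is to fill in both directions. The following is a minimal sketch, assuming tidyr 1.0.0 or later (where `.direction = "downup"` is available); the `toy` data frame and the name `filled` are hypothetical and only illustrate a group whose first row is NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: in group "a" the first row is NA,
# so a plain downward fill would leave it empty.
toy <- data.frame(
  GenIndID = c("a", "a", "a", "b", "b"),
  DLA = as.POSIXct(c(NA, "2017-10-03", NA, "2017-07-03", NA), tz = "UTC")
)

filled <- toy %>%
  group_by(GenIndID) %>%
  fill(DLA, .direction = "downup") %>%  # fill down first, then up, within each group
  ungroup()                             # drop the grouping for later steps

filled
```

Since the question states that DLA is identical for all records in a group, filling in either direction yields the same date; a group in which every DLA is NA would stay NA.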