Find the earliest interval start date and extract the interval (R, lubridate)

Time: 2019-05-22 13:29:48

Tags: r intervals lubridate

I have a data frame with up to four columns of S4 class Interval, like this:

 id  int_a           int_b             int_c              int_d
 1   2013--2015      2011--2012        NA--NA             2014--2014

What I need is to order the intervals by their start date, which can be extracted with int_start(), store the interval with the earliest start (or its length) in a new variable in the dataset, e.g. first_int, and then repeat this for the second, third, and fourth.

The expected output is:

 id  .. first_int      sec_int          third_int          fourth_int
 1   .. 2011--2012     2013--2015       2014--2014         NA--NA

I have added part of my dataset below.

Created on 2019-05-22 by the reprex package (v0.3.0)

Thank you very much!

1 answer:

Answer 0: (score: 0)

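I'm not sure exactly what you are looking for here, but I think this addresses your problem. I'm a little confused by the part of your post that says:

"I can sort by start date, but I can't sort on one feature of the interval and then extract it."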

Using purrr:: and lubridate::, you can do the following:

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
library(purrr)

so <- structure(
  list(
    int_a = new("Interval",
                .Data = c(24192000, 52704000, 0, 64022400, NA, NA, NA, 0, NA, NA),
                start = structure(c(1286841600, 1327276800, 1157068800, 1370995200, NA, NA, NA, 1296172800, NA, NA), 
                                  class = c("POSIXct", "POSIXt"), 
                                  tzone = "UTC"), tzone = "UTC"),

    int_b = new("Interval", 
                .Data = c(NA, 2505600, NA, NA, 53222400, 7862400, NA, NA, 0, 116812800), 
                start = structure(c(NA, 1402531200, NA, NA, 1397433600, 1307577600, NA, NA, 1366329600, 1320278400),
                                  class = c("POSIXct", "POSIXt"),
                                  tzone = "UTC"), tzone = "UTC"),

    int_c = new("Interval",
                .Data = c(NA, NA, 19353600, NA, NA, 41472000, 0, NA, NA, NA),
                start = structure(c(NA, NA, 1287446400, NA, NA, 1238025600, 1433203200, NA, NA, NA), 
                                  class = c("POSIXct", "POSIXt"), 
                                  tzone = "UTC"),tzone = "UTC"),

    int_d = new("Interval",
                .Data = c(3024000, 9331200, NA, 0, 8899200, 36374400, 0, 3196800, 18748800, 28771200), 
                start = structure(c(1316044800, 1396828800, NA, 1466640000, 1457568000, 1290038400, 1444694400, 1321315200, 1381968000, 1438300800), 
                                  class = c("POSIXct", "POSIXt"), 
                                  tzone = "UTC"), tzone = "UTC")),

  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L))

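# For each interval column, take the earliest start date, ignoring NAs.
# map() already visits every column, so map_at(.at = 1:ncol(so)) is not needed.
earliest_start_interval_list <- map(so, ~ min(int_start(.x), na.rm = TRUE))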

Created on 2019-05-22 by the reprex package (v0.3.0)
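
Note that the code above returns the earliest start per column, whereas the expected output in the question orders the intervals within each row. A minimal sketch of a row-wise ordering might look like the following. The helper name sort_intervals_by_start, the output names sorted_int_1, sorted_int_2, ..., and the toy data are all assumptions made for illustration and are not part of lubridate or of the asker's dataset. It assumes all intervals share one time zone (UTC here).

```r
library(lubridate)

# Hypothetical helper (not a lubridate function): reorders the intervals
# within each row by start time, with missing intervals placed last.
# `ints` is a named list (or data frame) of Interval vectors of equal length.
sort_intervals_by_start <- function(ints, tz = "UTC") {
  # One column per interval variable, one row per observation,
  # stored as seconds since the epoch.
  starts <- unname(do.call(cbind, lapply(ints, function(x) as.numeric(int_start(x)))))
  ends   <- unname(do.call(cbind, lapply(ints, function(x) as.numeric(int_end(x)))))

  # Reorder each row by its start times; NA starts go to the end.
  for (i in seq_len(nrow(starts))) {
    o <- order(starts[i, ], na.last = TRUE)
    starts[i, ] <- starts[i, o]
    ends[i, ]   <- ends[i, o]
  }

  # Rebuild one Interval vector per rank (earliest, second, ...).
  to_time <- function(x) as.POSIXct(x, origin = "1970-01-01", tz = tz)
  k <- ncol(starts)
  out <- lapply(seq_len(k), function(j) {
    interval(to_time(starts[, j]), to_time(ends[, j]))
  })
  names(out) <- paste0("sorted_int_", seq_len(k))  # assumed naming scheme
  out
}

# Toy data mirroring the single example row in the question (made-up dates).
utc <- function(x) as.POSIXct(x, tz = "UTC")
ints <- list(
  int_a = interval(utc("2013-01-01"), utc("2015-01-01")),
  int_b = interval(utc("2011-01-01"), utc("2012-01-01")),
  int_c = interval(utc(NA_character_), utc(NA_character_)),
  int_d = interval(utc("2014-01-01"), utc("2014-06-01"))
)

sorted <- sort_intervals_by_start(ints)
```

Because a data frame is also a list of columns, passing the interval columns of a data frame such as `so` should work the same way. If the length is wanted rather than the interval itself, `int_length(sorted$sorted_int_1)` gives the duration of the earliest interval in seconds.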