I have a data frame with four possible columns of the S4 class Interval.
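What I need is to order the intervals by their start date, which can be extracted with int_start(), then store the interval with the earliest start (or its length) as a new variable in the dataset, for example first_int, and repeat this for the second, third and fourth.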
For example, one row of the data looks like this:
id int_a int_b int_c int_d
1 2013--2015 2011--2012 NA--NA 2014--2014
The expected output is:
id .. first_int sec_int third_int fourth_int
1 .. 2011--2012 2013--2015 2014--2014 NA--NA
I have added part of my dataset below.
Created on 2019-05-22 by the reprex package (v0.3.0)
Thank you very much!
Answer 0 (score: 0)
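I am not sure what you are really looking for here, but I think this solves your challenge. I am a bit confused by the part of your post that says "I can sort by the start date, but I cannot sort on one feature of the interval and then extract it."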
Using purrr and lubridate, you can do the following:
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
library(purrr)
so <- structure(
list(
int_a = new("Interval",
.Data = c(24192000, 52704000, 0, 64022400, NA, NA, NA, 0, NA, NA),
start = structure(c(1286841600, 1327276800, 1157068800, 1370995200, NA, NA, NA, 1296172800, NA, NA),
class = c("POSIXct", "POSIXt"),
tzone = "UTC"), tzone = "UTC"),
int_b = new("Interval",
.Data = c(NA, 2505600, NA, NA, 53222400, 7862400, NA, NA, 0, 116812800),
start = structure(c(NA, 1402531200, NA, NA, 1397433600, 1307577600, NA, NA, 1366329600, 1320278400),
class = c("POSIXct", "POSIXt"),
tzone = "UTC"), tzone = "UTC"),
int_c = new("Interval",
.Data = c(NA, NA, 19353600, NA, NA, 41472000, 0, NA, NA, NA),
start = structure(c(NA, NA, 1287446400, NA, NA, 1238025600, 1433203200, NA, NA, NA),
class = c("POSIXct", "POSIXt"),
tzone = "UTC"),tzone = "UTC"),
int_d = new("Interval",
.Data = c(3024000, 9331200, NA, 0, 8899200, 36374400, 0, 3196800, 18748800, 28771200),
start = structure(c(1316044800, 1396828800, NA, 1466640000, 1457568000, 1290038400, 1444694400, 1321315200, 1381968000, 1438300800),
class = c("POSIXct", "POSIXt"),
tzone = "UTC"), tzone = "UTC")),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L))
# For each interval column, take the earliest (minimum) start date, ignoring NAs.
earliest_start_interval_list <- map_at(.x = so, .at = 1:ncol(so), ~ min(lubridate::int_start(.x), na.rm = TRUE))
Created on 2019-05-22 by the reprex package (v0.3.0)
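Note that the call above gives the earliest start per column, not an ordering of the intervals within each row. If the row-wise ordering from the question is what you are after, a rough sketch could look like the one below. It is only an illustration under some assumptions: all times are treated as UTC, and the names utc(), sort_intervals_by_start() and the two-row ints list are made up for this example (row 1 mirrors the example row in the question, row 2 is invented).

```r
library(lubridate)

# Hypothetical helper: parse dates as UTC date-times, keeping NA as NA.
utc <- function(x) as.POSIXct(x, tz = "UTC")

# Made-up two-row example; row 1 mirrors the example row in the question.
ints <- list(
  int_a = interval(utc(c("2013-01-01", NA)),           utc(c("2015-01-01", NA))),
  int_b = interval(utc(c("2011-01-01", "2014-06-01")), utc(c("2012-01-01", "2014-07-01"))),
  int_c = interval(utc(c(NA, NA)),                     utc(c(NA, NA))),
  int_d = interval(utc(c("2014-01-01", "2014-04-01")), utc(c("2014-06-01", "2014-08-01")))
)

# Hypothetical function: reorder the intervals within each row by start date.
# `ints` is a list of Interval vectors of equal length (assumed to be in UTC).
sort_intervals_by_start <- function(ints, new_names) {
  # Seconds-since-epoch matrices: one row per record, one column per interval.
  starts <- do.call(cbind, lapply(ints, function(x) as.numeric(int_start(x))))
  ends   <- do.call(cbind, lapply(ints, function(x) as.numeric(int_end(x))))

  rows <- seq_len(nrow(starts))
  # For each row, the column positions ordered by start date; NA starts go last.
  ord <- t(apply(starts, 1, order, na.last = TRUE))

  out <- lapply(seq_len(ncol(ord)), function(j) {
    # (row, column) index pairs that pick the j-th earliest interval in each row.
    idx <- cbind(rows, ord[, j])
    interval(
      as.POSIXct(starts[idx], origin = "1970-01-01", tz = "UTC"),
      as.POSIXct(ends[idx],   origin = "1970-01-01", tz = "UTC")
    )
  })
  names(out) <- new_names
  out
}

sorted <- sort_intervals_by_start(
  ints,
  c("first_int", "sec_int", "third_int", "fourth_int")
)
sorted$first_int
```

The function returns a named list of Interval vectors, one per rank. With the so data above, passing as.list(so) instead of ints should work the same way, since each column is an Interval vector.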