Subsetting diel periods over multiple days (subset by multiple columns from a different df)

Time: 2017-08-03 02:29:27

Tags: r subset

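I am trying to subset telemetry data into diel periods. The telemetry data span 7 months, and I would like the subsets to be defined by daily sunrise data.

Say df1 is my telemetry data and df2 is my sunrise data: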

df1 <- data.frame(
  datetime = as.POSIXct(c("2016-05-01 04:30", "2016-05-01 07:00", "2016-05-01 13:50", 
                   "2016-05-03 03:50", "2016-05-04 18:20", "2016-05-06 04:20")),
  ID = c("A1", "B3", "A2", "A2", "B1", "B2")
)


df2 <- data.frame(
  date = as.POSIXct(c("2016-05-01", "2016-05-02", "2016-05-03", "2016-05-04", "2016-05-05", "2016-05-06")),
  ntwilight.start = c("03:25:00", "03:23:00", "03:21:00", "03:19:00", "03:17:00", "03:15:00"),
  sunrise = c("04:45:00", "04:44:00", "04:42:00", "04:40:00", "04:39:00", "04:37:00")
)
df2$ntwilight.start <- as.POSIXct(paste(df2$date, df2$ntwilight.start, sep = " "), format = "%Y-%m-%d %H:%M")
df2$sunrise <- as.POSIXct(paste(df2$date, df2$sunrise, sep = " "), format = "%Y-%m-%d %H:%M")

To create the dawn subset, I need to select all rows of df1 where datetime falls between ntwilight.start and sunrise in df2. The subset should look like this:

             datetime ID
1 2016-05-01 04:30:00 A1
2 2016-05-03 03:50:00 A2
3 2016-05-06 04:20:00 B2

I can subset df1 using a single pair of time values:

dawn <- df1[df1$datetime >= as.POSIXct("2016-05-01 03:25", format = "%Y-%m-%d %H:%M") & df1$datetime < as.POSIXct("2016-05-01 04:45", format = "%Y-%m-%d %H:%M"), ]

However, the following code does not give the correct matches:

dawn2 <- df1[df1$datetime >= df2$ntwilight.start & df1$datetime < df2$sunrise,]
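
A small sketch of why a comparison like this misses rows, assuming the intended upper bound is the sunrise column: R compares vectors element by element, so row i of df1 is tested against row i of df2, whatever dates those rows carry. The vectors below are illustrative stand-ins for the two data frames, and tz = "UTC" is an assumption.

```r
tz <- "UTC"  # assumed time zone, only so the example is reproducible
fmt <- "%Y-%m-%d %H:%M"

# Two telemetry fixes (1 May and 3 May) and the first two dawn windows (1 May and 2 May)
fixes        <- as.POSIXct(c("2016-05-01 04:30", "2016-05-03 03:50"), format = fmt, tz = tz)
window_start <- as.POSIXct(c("2016-05-01 03:25", "2016-05-02 03:23"), format = fmt, tz = tz)
window_end   <- as.POSIXct(c("2016-05-01 04:45", "2016-05-02 04:44"), format = fmt, tz = tz)

# Position-wise comparison: the 3 May fix is tested against the 2 May window,
# so it is rejected even though it lies inside the 3 May dawn window.
rowwise <- fixes >= window_start & fixes < window_end
print(rowwise)
```

Each fix has to be paired with the window for its own date before the comparison is made.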

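How can I get R to search df2 for the row with the matching date, and use that row of df2 to determine the subset?

I feel I may need to split date and time into separate columns (for both data frames), and possibly group df1 by date and subset each group separately.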

1 Answer:

Answer 0: (score: 0)

I used some dplyr magic to join your data frames by date, and then filtered the resulting data frame:

df1 <- data.frame(
datetime = as.POSIXct(c("2016-05-01 04:30", "2016-05-01 07:00", "2016-05-01 13:50", 
                        "2016-05-03 03:50", "2016-05-04 18:20", "2016-05-06 04:20"),
                      format = "%Y-%m-%d %H:%M"),
ID = c("A1", "B3", "A2", "A2", "B1", "B2")
)

df1$date <- as.POSIXct(as.character(df1$datetime), format = "%Y-%m-%d") #there is probably a better way to isolate date, but this works...


df2 <- data.frame(
date = as.POSIXct(c("2016-05-01", "2016-05-02", "2016-05-03", "2016-05-04", "2016-05-05", "2016-05-06")),
ntwilight.start = c("03:25:00", "03:23:00", "03:21:00", "03:19:00", "03:17:00", "03:15:00"),
sunrise = c("04:45:00", "04:44:00", "04:42:00", "04:40:00", "04:39:00", "04:37:00")
)

df2$ntwilight.start <- as.POSIXct(paste(df2$date, df2$ntwilight.start, sep = " "), format = "%Y-%m-%d %H:%M")
df2$sunrise <- as.POSIXct(paste(df2$date, df2$sunrise, sep = " "), format = "%Y-%m-%d %H:%M")



library(dplyr)
dawn2 <- df1 %>% 
    left_join(df2,by = "date") %>%                            # join the 2 data frames by date
    filter(datetime >= ntwilight.start, datetime < sunrise)   # keep rows at or after twilight start and before sunrise
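
For comparison, the same date lookup can be sketched in base R with match(), without a join. This is only an illustration under stated assumptions: the names telemetry, sun, idx and dawn_base are hypothetical, the date key is kept as a character string, and tz = "UTC" is assumed so the result does not depend on the local time zone.

```r
tz <- "UTC"  # assumption: replace with the time zone the data were recorded in

telemetry <- data.frame(
  datetime = as.POSIXct(c("2016-05-01 04:30", "2016-05-01 07:00", "2016-05-01 13:50",
                          "2016-05-03 03:50", "2016-05-04 18:20", "2016-05-06 04:20"),
                        format = "%Y-%m-%d %H:%M", tz = tz),
  ID = c("A1", "B3", "A2", "A2", "B1", "B2")
)

sun <- data.frame(
  date = c("2016-05-01", "2016-05-02", "2016-05-03", "2016-05-04", "2016-05-05", "2016-05-06"),
  ntwilight.start = c("03:25:00", "03:23:00", "03:21:00", "03:19:00", "03:17:00", "03:15:00"),
  sunrise = c("04:45:00", "04:44:00", "04:42:00", "04:40:00", "04:39:00", "04:37:00")
)
sun$ntwilight.start <- as.POSIXct(paste(sun$date, sun$ntwilight.start),
                                  format = "%Y-%m-%d %H:%M:%S", tz = tz)
sun$sunrise <- as.POSIXct(paste(sun$date, sun$sunrise),
                          format = "%Y-%m-%d %H:%M:%S", tz = tz)

# For each telemetry row, find the row of `sun` with the same calendar date
# (NA when that date has no sunrise record).
idx <- match(format(telemetry$datetime, "%Y-%m-%d", tz = tz), sun$date)

# Keep rows that fall inside the dawn window of their own date.
in_dawn <- !is.na(idx) &
  telemetry$datetime >= sun$ntwilight.start[idx] &
  telemetry$datetime < sun$sunrise[idx]
dawn_base <- telemetry[in_dawn, ]
print(dawn_base)
```

On the sample data this keeps the A1, A2 and B2 rows listed in the question. Telemetry rows whose date is missing from the sunrise table are dropped rather than raising an error.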