Subsetting a dataset in R based on a date comparison

Date: 2015-07-27 23:40:24

Tags: r date subset

I have a dataset that looks like this:

    Col1      Col2       Col3        CutoffDate
    12001     Yes        2008-08-15  2008-08-10
    12001     Yes        2008-08-22  2008-08-10
    12001     Yes        2008-08-10  2008-08-10
    12001     Yes        2008-08-04  2008-08-10

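I only want to keep the last two rows, because their Col3 dates are on or before the cutoff date, 2008-08-10.

The final dataset should look like this: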

    Col1      Col2       Col3        CutoffDate
    12001     Yes        2008-08-10  2008-08-10
    12001     Yes        2008-08-04  2008-08-10

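I know about the subset function in R, but I don't know how to apply it here. Any help is much appreciated.

3 answers:

Answer 0 (score: 7)

You can use an ordinary comparison: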

dat[dat$Col3 <= dat$CutoffDate, ]
#    Col1 Col2       Col3 CutoffDate
# 3 12001  Yes 2008-08-10 2008-08-10
# 4 12001  Yes 2008-08-04 2008-08-10

This assumes Col3 and CutoffDate are both of class "Date".

Or, preferably:

with(dat, dat[Col3 <= CutoffDate, ])
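
The answer above relies on both columns being of class "Date". As a minimal sketch of what to do when they are not, the example below assumes the columns were read in as character strings in `%Y-%m-%d` format (the data frame contents and the format string are assumptions for illustration) and converts them with `as.Date()` before comparing.

```r
# Hypothetical data where the date columns arrived as character strings
dat <- data.frame(
  Col1 = rep(12001, 4),
  Col2 = rep("Yes", 4),
  Col3 = c("2008-08-15", "2008-08-22", "2008-08-10", "2008-08-04"),
  CutoffDate = rep("2008-08-10", 4),
  stringsAsFactors = FALSE
)

# Convert both columns to class "Date"; the format is an assumption about the input
dat$Col3 <- as.Date(dat$Col3, format = "%Y-%m-%d")
dat$CutoffDate <- as.Date(dat$CutoffDate, format = "%Y-%m-%d")

# Confirm the classes before comparing
stopifnot(inherits(dat$Col3, "Date"), inherits(dat$CutoffDate, "Date"))

# Keep rows whose Col3 is on or before the cutoff
result <- dat[dat$Col3 <= dat$CutoffDate, ]
```

Checking the class with `inherits()` first is a cheap way to catch a column that was silently read as text or as a factor.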

Answer 1 (score: 3)

You can use subset():

df <- data.frame(Col1 = c(12001, 12001, 12001, 12001),
                 Col2 = c('Yes', 'Yes', 'Yes', 'Yes'),
                 Col3 = as.Date(c('2008-08-15', '2008-08-22', '2008-08-10', '2008-08-04')),
                 CutoffDate = as.Date(c('2008-08-10', '2008-08-10', '2008-08-10', '2008-08-10')))
subset(df, Col3 <= CutoffDate)
##    Col1 Col2       Col3 CutoffDate
## 3 12001  Yes 2008-08-10 2008-08-10
## 4 12001  Yes 2008-08-04 2008-08-10
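
One practical difference between subset() and bracket indexing shows up when a date is missing. The sketch below uses a hypothetical variant of the data with one NA in Col3 (the `df_na` data frame is made up for illustration): bracket indexing returns a row of NAs where the condition is NA, while subset() treats NA as FALSE and drops the row.

```r
# Hypothetical variant of the data with one missing Col3 value
df_na <- data.frame(
  Col1 = rep(12001, 3),
  Col3 = as.Date(c("2008-08-15", NA, "2008-08-04")),
  CutoffDate = as.Date(rep("2008-08-10", 3))
)

# Bracket indexing returns an all-NA row where the condition is NA
by_bracket <- df_na[df_na$Col3 <= df_na$CutoffDate, ]

# subset() treats an NA condition as FALSE and drops that row
by_subset <- subset(df_na, Col3 <= CutoffDate)

# Wrapping the condition in which() makes bracket indexing drop it too
by_which <- df_na[which(df_na$Col3 <= df_na$CutoffDate), ]
```

If the real data may contain missing dates, it is worth deciding up front which of these behaviors is wanted.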

Answer 2 (score: 1)

If you are using dplyr:

library(dplyr)
df <- data.frame(Col1 = c(12001, 12001, 12001, 12001),
                 Col2 = c("Yes", "Yes", "Yes", "Yes"),
                 Col3 = as.Date(c("2008-08-15", "2008-08-22", "2008-08-10", "2008-08-04")),
                 CutoffDate = as.Date(c("2008-08-10", "2008-08-10", "2008-08-10", "2008-08-10")))

df %>% filter(Col3 <= CutoffDate)