How to subset data on a date range

Time: 2014-04-02 12:09:51

Tags: r date range subset

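I have two date columns in a data.frame called faults. It has many other columns as well. The idea is to extract the records where the date in the second column lies within 10 days of the date in the first column, with the window starting at day 3. In other words, I want the rows where date 2 is between 3 and 10 days away from date 1.

This is what I did...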

for (i in 1:length(faults$PERIOD_START)){

  if (faults$FAULT_RECEIVED_DATE_FIRST[i] > faults$PERIOD_START[i])
  {
    if(faults$FAULT_RECEIVED_DATE_FIRST[i] == faults$PERIOD_START[i]+i){
      brat_none_set_b4_7d_view_flt_rec[i] = faults[i]
    }
  }
}

Obviously, this does not extract the data between 3 and 10 days...

The sample dates are:

faults$FAULT_RECEIVED_DATE_FIRST = 
   "2013-12-01", "2013-12-01", "2013-12-01", "2013-12-02", "2013-12-03"

faults$PERIOD_START = 
   "2013-11-01", "2013-11-25", "2013-11-24", "2013-11-23", "2013-11-20"

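The records I expect to be extracted are:

Those at the indices of 2013-11-25, 2013-11-24 and 2013-11-23 (because each lies within the 10 days before the fault was received, up to the 3rd day, which is 2013-11-27)

Any ideas on how to achieve this?

Regards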

2 Answers:

Answer 0: (score: 0)

You can try:

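# Both columns must be of class Date (see as.Date) for the subtraction to work
x <- which((faults$FAULT_RECEIVED_DATE_FIRST - faults$PERIOD_START) >= 3 &
           (faults$FAULT_RECEIVED_DATE_FIRST - faults$PERIOD_START) <= 10)
# Note the comma: faults[x, ] selects rows, faults[x] would select columns
faults[x, ]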
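To check this approach against the sample dates from the question, here is a minimal self-contained sketch. It assumes the fourth PERIOD_START is 2013-11-23, as the expected output implies, and it introduces a helper variable `gap` that is not in the original answer.

```r
# Sample data from the question (fourth PERIOD_START assumed to be 2013-11-23)
faults <- data.frame(
  FAULT_RECEIVED_DATE_FIRST = as.Date(c("2013-12-01", "2013-12-01", "2013-12-01",
                                        "2013-12-02", "2013-12-03")),
  PERIOD_START = as.Date(c("2013-11-01", "2013-11-25", "2013-11-24",
                           "2013-11-23", "2013-11-20"))
)

# Gap in days between the two dates (Date - Date gives a difftime in days)
gap <- as.numeric(faults$FAULT_RECEIVED_DATE_FIRST - faults$PERIOD_START)

# Row positions whose gap lies in the 3 to 10 day window
x <- which(gap >= 3 & gap <= 10)

# Select those rows (the trailing comma selects rows, not columns)
hits <- faults[x, ]
print(hits)
```

The gaps are 30, 6, 7, 9 and 13 days, so rows 2, 3 and 4 are kept, which matches the PERIOD_START values listed in the question.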

Answer 1: (score: 0)

faults <- as.data.frame(matrix (nrow = 5,ncol = 2))
colnames (faults) <- c ("PERIOD_START", "FAULT_RECEIVED_DATE_FIRST")

faults$FAULT_RECEIVED_DATE_FIRST  <-  c("2013-12-01" , "2013-12-01", 
                                        "2013-12-01", "2013-12-02", "2013-12-03")
faults$PERIOD_START  <-  c ("2013-11-01", "2013-11-25", "2013-11-24", 
                            "2013-11-23", "2013-11-20")

Convert the character vectors to dates:

faults$FAULT_RECEIVED_DATE_FIRST <- as.Date (faults$FAULT_RECEIVED_DATE_FIRST, 
                                             format = "%Y-%m-%d")
faults$PERIOD_START <- as.Date (faults$PERIOD_START, format = "%Y-%m-%d")

Then simply subtract the dates to get the time difference:

faults ["diff"] <- faults ["FAULT_RECEIVED_DATE_FIRST"] - faults ["PERIOD_START"]

And convert it to numeric:

faults ["diff_days"] <- as.numeric(faults [["diff"]])
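For Date columns the subtraction is always expressed in days, so the numeric conversion above gives day counts. If the columns held date-times (POSIXct) instead, which is an assumption and not the case in the question, a plain subtraction picks its unit automatically, so asking for days explicitly with `difftime()` may be safer. A small sketch with made-up values:

```r
# Hypothetical date-time values, not taken from the question
received <- as.POSIXct("2013-12-01 12:00:00", tz = "UTC")
start    <- as.POSIXct("2013-11-25 00:00:00", tz = "UTC")

# Request days explicitly instead of relying on the automatically chosen unit
gap <- difftime(received, start, units = "days")

# Plain number of days, including the fractional part
gap_days <- as.numeric(gap)
print(gap_days)
```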

So you can subset the data to the entries you want:

faults [faults$diff_days >= 3 & faults$diff_days <= 10,]
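One caveat with a logical row index: if either date is missing, the condition is NA for that row, and the result contains a row filled with NA. The sketch below uses hypothetical data with one missing date (not from the question) to show the difference from wrapping the condition in `which()`.

```r
# Hypothetical data with one missing received date, not from the question
faults <- data.frame(
  FAULT_RECEIVED_DATE_FIRST = as.Date(c("2013-12-01", "2013-12-01", NA)),
  PERIOD_START = as.Date(c("2013-11-25", "2013-11-01", "2013-11-24"))
)
faults$diff_days <- as.numeric(faults$FAULT_RECEIVED_DATE_FIRST - faults$PERIOD_START)

# TRUE for a 6 day gap, FALSE for a 30 day gap, NA for the missing date
in_window <- faults$diff_days >= 3 & faults$diff_days <= 10

# Logical indexing keeps an all-NA row for the NA condition
with_na <- faults[in_window, ]

# which() drops NA conditions, so only real matches remain
without_na <- faults[which(in_window), ]
```

Whether the NA row should be dropped depends on the data, so treat `which()` here as one option rather than the required form.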