R data frame: date range operations - subsetting rows that fall within a specific date range

Time: 2014-12-16 06:00:01

Tags: r datetime dataframe subset date-range

If you look at my profile, all of my questions are about data frames, and here is another one!

I have a data frame that is the result of merging debit and credit transactions:

>head(allTxns)
   Cust_no    CreditDate   Credit     DebitDate   Debit
 1  12345     2014-10-01    200      2014-10-03    400
 2  12345     2014-10-01    200      2014-10-04    150
 3  12345     2014-10-01    200      2014-10-15    800     
 4  33344     2014-10-03    500      2014-10-04    50
 5  33344     2014-10-03    500      2014-10-05    504
 6  33344     2014-10-03    500      2014-10-06    332
 7  33344     2014-10-03    500      2014-10-08    56
 8  66554     2014-10-10    660      2014-10-04    150     
 9  66554     2014-10-10    660      2014-10-05    800
10  66554     2014-10-10    660      2014-10-11    400
11  66554     2014-10-10    660      2014-10-12    150
12  66554     2014-10-10    660      2014-10-13    800

My goal is to get the rows where DebitDate falls within 5 days of CreditDate, so I tried to subset the data using the : operator to set up the date range:

FiveDays <- allTxns$CreditDate+5 #Results in a vector which has date + 5 days

allTxns <- cbind(allTxns[1:2],FiveDays,allTxns[4:6]) #Adding the vector as a column of dataframe

newDf <- allTxns[allTxns$DebitDate %in% allTxns$CreditDate:allTxns$FiveDays]

With the code above I run into a logic error: I get the following warnings, which say that only the first element is used:

Warning messages:
  1: In mer32$DepositDate:mer32$FiveDays2 :
      numerical expression has 3994 elements: only the first used
  2: In mer32$DepositDate:mer32$FiveDays2 :
      numerical expression has 3994 elements: only the first used

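As a result, the output I want is produced only for the first Cust_no (12345), and the condition is not applied to the other rows. How can I make sure the range condition is applied to every row?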

Incorrect output

 >head(newDf)
 row.names  Cust_no    CreditDate   Credit     DebitDate   Debit
     1      12345     2014-10-01    200      2014-10-03    400
     2      12345     2014-10-01    200      2014-10-04    150
     4      33344     2014-10-03    500      2014-10-04    50
     5      33344     2014-10-03    500      2014-10-05    504
     6      33344     2014-10-03    500      2014-10-06    332
     7      33344     2014-10-03    500      2014-10-08    56
     8      66554     2014-10-10    660      2014-10-04    150     
     9      66554     2014-10-10    660      2014-10-05    800
    10      66554     2014-10-10    660      2014-10-11    400
    11      66554     2014-10-10    660      2014-10-12    150
    12      66554     2014-10-10    660      2014-10-13    800

Correct output

 >head(newDf)
 row.names  Cust_no    CreditDate   Credit     DebitDate   Debit
     1      12345     2014-10-01    200      2014-10-03    400
     2      12345     2014-10-01    200      2014-10-04    150
     4      33344     2014-10-03    500      2014-10-04    50
     5      33344     2014-10-03    500      2014-10-05    504
     6      33344     2014-10-03    500      2014-10-06    332
     7      33344     2014-10-03    500      2014-10-08    56         
    10      66554     2014-10-10    660      2014-10-11    400
    11      66554     2014-10-10    660      2014-10-12    150
    12      66554     2014-10-10    660      2014-10-13    800

2 Answers:

Answer 0: (score: 1)

Try

 allTxns[with(allTxns , CreditDate < DebitDate & DebitDate <=FiveDays),]
 #    Cust_no CreditDate   FiveDays Credit  DebitDate Debit
 #1    12345 2014-10-01 2014-10-06    200 2014-10-03   400
 #2    12345 2014-10-01 2014-10-06    200 2014-10-04   150
 #4    33344 2014-10-03 2014-10-08    500 2014-10-04    50
 #5    33344 2014-10-03 2014-10-08    500 2014-10-05   504
 #6    33344 2014-10-03 2014-10-08    500 2014-10-06   332
 #7    33344 2014-10-03 2014-10-08    500 2014-10-08    56
 #10   66554 2014-10-10 2014-10-15    660 2014-10-11   400
 #11   66554 2014-10-10 2014-10-15    660 2014-10-12   150
 #12   66554 2014-10-10 2014-10-15    660 2014-10-13   800
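
For what it's worth, here is a small sketch of why this works where the : version did not. The : operator builds a single sequence from one start value and one end value, which is why R warns that only the first element is used when it is handed whole columns. Comparison operators such as < and <= work element by element, so every row is checked against its own dates. The vectors credit and debit below are made-up illustration data, not the asker's columns.

```r
# Two hypothetical rows: credit dates and the matching debit dates.
credit <- as.Date(c("2014-10-01", "2014-10-10"))
debit  <- as.Date(c("2014-10-03", "2014-10-04"))

# `:` only makes sense for a single start and a single end value,
# so this builds the 6-day window for the first row alone.
one_range <- as.numeric(credit[1]):as.numeric(credit[1] + 5)

# The comparisons are vectorised: each debit date is tested against
# the window of its own row, giving one TRUE/FALSE per row.
in_window <- credit < debit & debit <= credit + 5
```

The resulting logical vector has one entry per row, which is exactly what the row index in allTxns[ , ] expects.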

Answer 1: (score: 1)

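This old question already has an accepted answer. However, I noticed that both the question and the answer can be simplified, because creating the additional FiveDays column is not necessary: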

allTxns[with(allTxns, CreditDate <= DebitDate & DebitDate <= CreditDate + 5L), ]
   Cust_no CreditDate Credit  DebitDate Debit
1    12345 2014-10-01    200 2014-10-03   400
2    12345 2014-10-01    200 2014-10-04   150
4    33344 2014-10-03    500 2014-10-04    50
5    33344 2014-10-03    500 2014-10-05   504
6    33344 2014-10-03    500 2014-10-06   332
7    33344 2014-10-03    500 2014-10-08    56
10   66554 2014-10-10    660 2014-10-11   400
11   66554 2014-10-10    660 2014-10-12   150
12   66554 2014-10-10    660 2014-10-13   800
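
In case anyone wants to run this end to end, below is a self-contained sketch. It rebuilds the sample data from the question by hand, assuming both date columns are stored as Date (if they are character in your data, convert them with as.Date first). Instead of comparing against CreditDate + 5, it computes the gap in days once and filters on that, which is just another way to write the same condition. The names gap and newDf are only for illustration.

```r
# Rebuild the sample data from the question; dates are assumed to be Date objects.
allTxns <- data.frame(
  Cust_no    = rep(c(12345, 33344, 66554), times = c(3, 4, 5)),
  CreditDate = as.Date(rep(c("2014-10-01", "2014-10-03", "2014-10-10"),
                           times = c(3, 4, 5))),
  Credit     = rep(c(200, 500, 660), times = c(3, 4, 5)),
  DebitDate  = as.Date(c("2014-10-03", "2014-10-04", "2014-10-15",
                         "2014-10-04", "2014-10-05", "2014-10-06", "2014-10-08",
                         "2014-10-04", "2014-10-05", "2014-10-11", "2014-10-12",
                         "2014-10-13")),
  Debit      = c(400, 150, 800, 50, 504, 332, 56, 150, 800, 400, 150, 800)
)

# Days from credit to debit, computed for every row at once.
# Subtracting two Date vectors gives a difftime in days.
gap <- as.integer(allTxns$DebitDate - allTxns$CreditDate)

# Keep the rows whose debit falls 0 to 5 days after the credit.
newDf <- allTxns[gap >= 0 & gap <= 5, ]
```

On the sample data this keeps rows 1, 2, 4, 5, 6, 7, 10, 11 and 12, the same rows as in the expected output from the question.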