Subsetting a data set based on a date calculation

Time: 2012-10-15 15:10:05

Tags: r

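I have this data frame. I want to create another data frame from it that contains Trade_Date, Trades and Rejected. The Rejected field needs to hold the value for the Closing_Date that meets this condition: Trade_Date + 1 day.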

X

Trade_Date   Trades   Closing_Date   Rejected
9/16/2011   4126528     9/16/2011   15
9/19/2011   2565282     9/17/2011   33
9/20/2011   2953963     9/20/2011   30
9/21/2011   3255800     9/21/2011   6
9/22/2011   2862185     9/22/2011   21
9/23/2011   2405590     9/23/2011   30
9/26/2011   3196284     9/24/2011   30
9/27/2011   3761367     9/27/2011   15
9/28/2011   3198177     9/28/2011   9
9/29/2011   3255345     9/29/2011   6
9/30/2011   3810356     9/30/2011   12
10/3/2011   3817093     10/1/2011   21

For example, my next df needs to look like this:

Trade_Date      Trades    Rejected
9/16/2011       4126528   33
9/19/2011       2565282   30
9/20/2011       2953963   6

Because there are many rows, I need to do this programmatically. Can anyone help?

1 answer:

Answer 0 (score: 0)

None of the rows in your example data frame meet the condition you specified, so I will alter it slightly to demonstrate a simple approach:

#Read the data in
dat <- read.table(text = "Trade_Date   Trades   Closing_Date   Rejected
 9/16/2011   4126528     9/16/2011   15
 9/19/2011   2565282     9/17/2011   33
 9/20/2011   2953963     9/20/2011   30
 9/21/2011   3255800     9/21/2011   6
 9/22/2011   2862185     9/22/2011   21
 9/23/2011   2405590     9/23/2011   30
 9/26/2011   3196284     9/24/2011   30
 9/27/2011   3761367     9/27/2011   15
 9/28/2011   3198177     9/28/2011   9
 9/29/2011   3255345     9/29/2011   6
 9/30/2011   3810356     9/30/2011   12
 10/3/2011   3817093     10/1/2011   21",sep = "",header = TRUE)

#Convert to date
dat$Trade_Date <- as.Date(dat$Trade_Date,"%m/%d/%Y")
dat$Closing_Date <- as.Date(dat$Closing_Date,"%m/%d/%Y")

#Alter some rows to meet your criteria
dat$Closing_Date[1:3] <- dat$Trade_Date[1:3] + 1

#Then we can simply subset the data
> dat[dat$Closing_Date == dat$Trade_Date+1,]
  Trade_Date  Trades Closing_Date Rejected
1 2011-09-16 4126528   2011-09-17       15
2 2011-09-19 2565282   2011-09-20       33
3 2011-09-20 2953963   2011-09-21       30
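
The expected output in the question can also be read differently: each Trade_Date is paired with the Rejected value from the row whose Closing_Date falls one day later (9/16/2011 gets 33, the Rejected value on the 9/17/2011 closing row). If that is what is wanted, a lookup with match() is one possible sketch. It works on the unmodified data; the names idx and res are my own, and it assumes each Closing_Date appears only once.

```r
# Read the original, unaltered data from the question
dat <- read.table(text = "Trade_Date   Trades   Closing_Date   Rejected
 9/16/2011   4126528     9/16/2011   15
 9/19/2011   2565282     9/17/2011   33
 9/20/2011   2953963     9/20/2011   30
 9/21/2011   3255800     9/21/2011   6
 9/22/2011   2862185     9/22/2011   21
 9/23/2011   2405590     9/23/2011   30
 9/26/2011   3196284     9/24/2011   30
 9/27/2011   3761367     9/27/2011   15
 9/28/2011   3198177     9/28/2011   9
 9/29/2011   3255345     9/29/2011   6
 9/30/2011   3810356     9/30/2011   12
 10/3/2011   3817093     10/1/2011   21", header = TRUE)

# Convert both columns to Date
dat$Trade_Date <- as.Date(dat$Trade_Date, format = "%m/%d/%Y")
dat$Closing_Date <- as.Date(dat$Closing_Date, format = "%m/%d/%Y")

# For each row, locate the row whose Closing_Date equals Trade_Date + 1 day
# (NA where no such row exists; assumes Closing_Date values are unique)
idx <- match(dat$Trade_Date + 1, dat$Closing_Date)

# Pull Rejected from the matched rows
res <- data.frame(Trade_Date = dat$Trade_Date,
                  Trades     = dat$Trades,
                  Rejected   = dat$Rejected[idx])

# Drop trade dates that have no closing row one day later
res <- res[!is.na(res$Rejected), ]
```

With the sample data, the first three rows of res carry Rejected values 33, 30 and 6, which agrees with the example output in the question. The last trade date (10/3/2011) is dropped because no row closes on 10/4/2011.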