Create a rank based on dates (quarters) in R

Time: 2016-06-28 15:07:30

Tags: r dataframe data.table

We start with the following data.table:

    id       date
 1:  1 2016-03-31
 2:  1 2015-12-31
 3:  1 2015-09-30
 4:  1 2015-06-30
 5:  1 2015-03-31
 6:  2 2016-03-31
 7:  2 2015-09-30
 8:  2 2015-06-30
 9:  2 2015-03-31
10:  2 2014-12-31

library(data.table)
DT <- data.table(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                 as.IDate(c("2016-03-31", "2015-12-31", "2015-09-30", "2015-06-30", 
                   "2015-03-31", "2016-03-31", "2015-09-30", "2015-06-30",
                   "2015-03-31", "2014-12-31")))
setnames(DT, c("id", "date"))

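For each unique id, I want to create a rank. The most recent date for a given id should get rank 0. Then I subtract 3 months from that date (ignoring the day of the month) to get the date with rank -1, and I repeat this down to rank -19. Afterwards, a new column containing the rank should be added.

The final output should look like this (note the ranks for id = 2, where 2015-12-31 is missing and so rank -1 is skipped):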

    id       date rank_year
 1:  1 2016-03-31         0
 2:  1 2015-12-31        -1
 3:  1 2015-09-30        -2
 4:  1 2015-06-30        -3
 5:  1 2015-03-31        -4
 6:  2 2016-03-31         0
 7:  2 2015-09-30        -2
 8:  2 2015-06-30        -3
 9:  2 2015-03-31        -4
10:  2 2014-12-31        -5

2 answers:

Answer 0: (score: 5)

I would do (borrowing order from @akrun's answer):

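DT[order(-date), rank_year := {
    # month index: months counted from year 0, so differences are in months
    z = month(date) + year(date)*12
    # z[1L] is the latest date within each id (rows are ordered by -date)
    as.integer( (z - z[1L])/3 )
}, by=id]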

    id       date rank_year
 1:  1 2016-03-31         0
 2:  1 2015-12-31        -1
 3:  1 2015-09-30        -2
 4:  1 2015-06-30        -3
 5:  1 2015-03-31        -4
 6:  2 2016-03-31         0
 7:  2 2015-09-30        -2
 8:  2 2015-06-30        -3
 9:  2 2015-03-31        -4
10:  2 2014-12-31        -5
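
The month arithmetic in this answer can be checked outside data.table. The following is a minimal sketch in base R only; the helper name quarter_rank is hypothetical (it is not part of data.table or of the answer), and it uses max(z) in place of z[1L] so that the input does not need to be sorted.

```r
# Hypothetical helper: rank dates by whole quarters behind the latest date.
quarter_rank <- function(d) {
  lt <- as.POSIXlt(d)
  # Month index: lt$year counts from 1900 and lt$mon from 0, so shift both.
  z <- (lt$year + 1900L) * 12L + lt$mon + 1L
  # Distance to the latest month in units of 3 months; 0 for the latest date.
  # as.integer() truncates toward zero, so partial quarters are dropped.
  as.integer((z - max(z)) / 3)
}

# The dates of id = 2 from the question (2015-12-31 is missing).
d <- as.Date(c("2016-03-31", "2015-09-30", "2015-06-30",
               "2015-03-31", "2014-12-31"))
quarter_rank(d)
# [1]  0 -2 -3 -4 -5
```

Because only the year and month enter the calculation, the day of the month is ignored, which matches the requirement stated in the question.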

Answer 1: (score: 1)

We can also do

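# DT[order(id, -date)] returns a sorted copy; rank_year is added to that copy, not to DT.
# A quarter is approximated as 90 days here.
DT[order(id, -date)][, rank_year := 
          -1*c(0,cumsum(as.numeric(abs(diff(date)))))%/%90 , by = id][]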
#     id       date rank_year
#  1:  1 2016-03-31         0
#  2:  1 2015-12-31        -1
#  3:  1 2015-09-30        -2
#  4:  1 2015-06-30        -3
#  5:  1 2015-03-31        -4
#  6:  2 2016-03-31         0
#  7:  2 2015-09-30        -2
#  8:  2 2015-06-30        -3
#  9:  2 2015-03-31        -4
# 10:  2 2014-12-31        -5
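
This answer counts days and divides by 90 instead of counting calendar months, so it is worth checking where that approximation holds. The sketch below uses base R only and is a rough check under stated assumptions, not a general proof: it assumes quarter-end dates going back 20 quarters from 2016-03-31 (the range asked for in the question), and the variable names qe and day_rank are made up for illustration.

```r
# Twenty quarter-end dates counting back from 2016-03-31.
# Built from first-of-month dates minus one day to avoid end-of-month rollover.
qe <- seq(as.Date("2016-04-01"), by = "-3 months", length.out = 20) - 1

# Day-based rank (90 days per quarter) versus the intended ranks 0, -1, ..., -19.
day_rank <- -(as.numeric(qe[1] - qe) %/% 90)
all(day_rank == -(0:19))
# [1] TRUE

# With a date that is not a quarter end, the day count can disagree with
# calendar months: 2016-03-01 is 3 calendar months after 2015-12-31,
# but only 61 days, so the day-based rank stays at 0.
as.numeric(as.Date("2016-03-01") - as.Date("2015-12-31")) %/% 90
# [1] 0
```

For quarter-end data over this range the two approaches agree; if the dates can fall anywhere within a month, the month-index calculation from the other answer is the safer choice.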