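I have a data table with an ID and an origination date, one row per unique ID. I need to use the variable 'COUNT' (which is really the gap between orig_date and close_date in months) and copy ORIG_DATE sequentially into a DATE field, as shown below. The code I tried takes only the first value of 'COUNT' (3 in this case) and copies ORIG_DATE sequentially. I have a different COUNT for each ID. How can I use the corresponding COUNT for each unique ID and copy ORIG_DATE into another column named DATE?
test.data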
ID COUNT SCORE VALUE ORIG_DATE CLOSE_DATE
10748 3 750 450231 2015-03-01 2015-06-01
10845 4 680 590231 2015-01-01 2015-05-01
21758 7 760 650839 2014-11-01 2015-06-01
test.panel <- test.data[rep(sequence(nrow(test.data)),COUNT)]
test.panel$DATE <- ymd(test.panel$ORIG_DATE)+ months(1:test.panel$COUNT)
Below is the structure of the data table I am trying to create:
ID COUNT SCORE VALUE ORIG_DATE DATE
10748 3 750 450231 2015-03-01 2015-03-01
10748 3 750 450231 2015-03-01 2015-04-01
10748 3 750 450231 2015-03-01 2015-05-01
10748 3 750 450231 2015-03-01 2015-06-01
10845 4 680 590231 2015-01-01 2015-01-01
10845 4 680 590231 2015-01-01 2015-02-01
10845 4 680 590231 2015-01-01 2015-03-01
10845 4 680 590231 2015-01-01 2015-04-01
10845 4 680 590231 2015-01-01 2015-05-01
21758 7 760 650839 2014-11-01 2014-11-01
21758 7 760 650839 2014-11-01 2014-12-01
21758 7 760 650839 2014-11-01 2015-01-01
21758 7 760 650839 2014-11-01 2015-02-01
..........................................................
..........................................................
Answer 0 (score: 2)
This is actually simple to do with data.table. Recreating your sample data:
test.data <- read.table( text = "
ID COUNT SCORE VALUE ORIG_DATE CLOSE_DATE
10748 3 750 450231 2015-03-01 2015-06-01
10845 4 680 590231 2015-01-01 2015-05-01
21758 7 760 650839 2014-11-01 2015-06-01",
header = TRUE,
stringsAsFactors = FALSE,
colClasses = c("integer", "integer", "integer","integer", "Date", "Date") )
str(test.data)
Now doing what you want in data.table:
library(data.table)
test.data <- data.table(test.data)
# One row per month from ORIG_DATE to CLOSE_DATE, stored in a new DATE column
test.data[ , list(DATE = seq(ORIG_DATE, CLOSE_DATE, by = "month")),
by = c("ID", "COUNT", "SCORE", "VALUE", "ORIG_DATE")]
ID COUNT SCORE VALUE ORIG_DATE DATE
1: 10748 3 750 450231 2015-03-01 2015-03-01
2: 10748 3 750 450231 2015-03-01 2015-04-01
3: 10748 3 750 450231 2015-03-01 2015-05-01
4: 10748 3 750 450231 2015-03-01 2015-06-01
5: 10845 4 680 590231 2015-01-01 2015-01-01
6: 10845 4 680 590231 2015-01-01 2015-02-01
7: 10845 4 680 590231 2015-01-01 2015-03-01
8: 10845 4 680 590231 2015-01-01 2015-04-01
9: 10845 4 680 590231 2015-01-01 2015-05-01
10: 21758 7 760 650839 2014-11-01 2014-11-01
11: 21758 7 760 650839 2014-11-01 2014-12-01
12: 21758 7 760 650839 2014-11-01 2015-01-01
13: 21758 7 760 650839 2014-11-01 2015-02-01
14: 21758 7 760 650839 2014-11-01 2015-03-01
15: 21758 7 760 650839 2014-11-01 2015-04-01
16: 21758 7 760 650839 2014-11-01 2015-05-01
17: 21758 7 760 650839 2014-11-01 2015-06-01
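As for why the original attempt only used the first COUNT: the `:` operator reads only the first element of a vector operand, so `1:test.panel$COUNT` becomes `1:3` for every row. If you would rather drive the expansion from COUNT than from CLOSE_DATE, something along these lines should work. It is only a sketch, and it assumes COUNT + 1 rows per ID (your desired output includes both the origination and the closing month) and that ID is unique in the input.

```r
library(data.table)

# Sample data copied from the question.
test.data <- data.table(
  ID         = c(10748L, 10845L, 21758L),
  COUNT      = c(3L, 4L, 7L),
  SCORE      = c(750L, 680L, 760L),
  VALUE      = c(450231L, 590231L, 650839L),
  ORIG_DATE  = as.Date(c("2015-03-01", "2015-01-01", "2014-11-01")),
  CLOSE_DATE = as.Date(c("2015-06-01", "2015-05-01", "2015-06-01"))
)

# Repeat each row COUNT + 1 times (assumption: both end months are wanted).
test.panel <- test.data[rep(seq_len(nrow(test.data)), COUNT + 1L)]

# Within each ID, step forward one month at a time starting at ORIG_DATE.
test.panel[, DATE := seq(ORIG_DATE[1], by = "month", length.out = .N), by = ID]

print(test.panel)
```

Because the dates all fall on the first of the month, stepping by "month" with `seq()` is safe here; for end-of-month dates you would want to check how the month arithmetic rolls over.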