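Within each ID group, I only want to flag the years that have n years of data (counting past years) AND also have a following year in the data. So 2020, for example, will always get 0, because there is no 2021 in the data.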
library(data.table)
ID <- c(rep("A5", 15), rep("B2", 15))
product <- rep(rep(c("prod1","prod2","prod3", "prod55", "prod4", "prod9", "prod83"),3),2)
# start <- c(rep("01.01.2016", 3), rep("01.01.2015", 3), rep("01.01.2014",3),
# rep("01.01.2013",3), rep("01.01.2012",3))
start <- rep(c(rep(2016, 3), rep(2017, 3), rep(2018 ,3),
rep(2019,3), rep(2020,3)),2)
prodID <- rep(c(3,1,2,3,1,2,3,1,2,3,2,1,3,1,2),2)
mydata <- cbind(ID, product[1:15], start, prodID)
mydata <- as.data.table(mydata)
So for n = 3 the result would look like this:
ID V2 start result
1: A5 prod1 2016 0
2: A5 prod2 2016 0
3: A5 prod3 2016 0
4: A5 prod55 2017 0
5: A5 prod4 2017 0
6: A5 prod9 2017 0
7: A5 prod83 2018 1
8: A5 prod1 2018 1
9: A5 prod2 2018 1
10: A5 prod3 2019 1
11: A5 prod55 2019 1
12: A5 prod4 2019 1
13: A5 prod9 2020 0
14: A5 prod83 2020 0
15: A5 prod1 2020 0
16: B2 prod1 2016 0
17: B2 prod2 2016 0
18: B2 prod3 2016 0
19: B2 prod55 2017 0
20: B2 prod4 2017 0
21: B2 prod9 2017 0
22: B2 prod83 2018 1
23: B2 prod1 2018 1
24: B2 prod2 2018 1
25: B2 prod3 2019 1
26: B2 prod55 2019 1
27: B2 prod4 2019 1
28: B2 prod9 2020 0
29: B2 prod83 2020 0
30: B2 prod1 2020 0
Answer 0 (score: 1)
We can use between:
library(data.table)
n = 3
mydata[, result := +(between(start, min(start) + n - 1, max(start) - 1)), ID]
which returns
mydata
# ID V2 start result
# 1: A5 prod1 2016 0
# 2: A5 prod2 2016 0
# 3: A5 prod3 2016 0
# 4: A5 prod55 2017 0
# 5: A5 prod4 2017 0
# 6: A5 prod9 2017 0
# 7: A5 prod83 2018 1
# 8: A5 prod1 2018 1
# 9: A5 prod2 2018 1
#10: A5 prod3 2019 1
#11: A5 prod55 2019 1
#12: A5 prod4 2019 1
#13: A5 prod9 2020 0
#14: A5 prod83 2020 0
#15: A5 prod1 2020 0
#16: B2 prod1 2016 0
#17: B2 prod2 2016 0
#18: B2 prod3 2016 0
#19: B2 prod55 2017 0
#20: B2 prod4 2017 0
#21: B2 prod9 2017 0
#22: B2 prod83 2018 1
#23: B2 prod1 2018 1
#24: B2 prod2 2018 1
#25: B2 prod3 2019 1
#26: B2 prod55 2019 1
#27: B2 prod4 2019 1
#28: B2 prod9 2020 0
#29: B2 prod83 2020 0
#30: B2 prod1 2020 0
# ID V2 start result
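between returns a logical value (TRUE / FALSE) indicating whether a value lies between two bounds, both inclusive by default. An equivalent way is: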
mydata[, result := +(start >= min(start) + n - 1 & start <= max(start) - 1), ID]
The unary + converts the logical values (TRUE / FALSE) into integer values (1 / 0).
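As a minimal sketch of the two points above, the following uses a made-up vector of years (the names years, n_years, lower and upper are illustrative, not from the question) to show that between with its default inclusive bounds matches the manual comparison, and that unary + turns the logical result into 0/1 integers.

```r
library(data.table)

# Toy input: one group's distinct years (illustrative values only)
years   <- c(2016, 2017, 2018, 2019, 2020)
n_years <- 3

# First year with n_years of data behind it, and last year with a successor
lower <- min(years) + n_years - 1   # 2018
upper <- max(years) - 1             # 2019

# between() includes both bounds by default
flag_between <- between(years, lower, upper)

# The same test written with explicit comparisons
flag_manual <- years >= lower & years <= upper

identical(flag_between, flag_manual)   # TRUE

# Unary plus coerces logical to integer
flags <- +flag_between
flags   # 0 0 1 1 0
```

Note that both forms only look at the minimum and maximum year of each group, so they assume the years within an ID have no gaps.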
Data
Don't use cbind when creating the data; build it directly with data.frame or data.table so that start stays numeric.
mydata <- data.table(ID, product[1:15], start)
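To illustrate why this matters, here is a small sketch with made-up two-row vectors (id_toy and start_toy are hypothetical names, not the question's data). cbind on a character vector and a numeric vector yields a character matrix, so the year column arrives in the data.table as character and arithmetic such as min(start) + n - 1 would fail.

```r
library(data.table)

# Toy vectors of mixed type (illustrative only)
id_toy    <- c("A5", "B2")
start_toy <- c(2016, 2017)

# cbind() coerces everything to a common type: a character matrix
m <- cbind(id_toy, start_toy)
dt_cbind <- as.data.table(m)
class(dt_cbind$start_toy)    # "character"

# Building the table directly keeps each column's own type
dt_direct <- data.table(id_toy, start_toy)
class(dt_direct$start_toy)   # "numeric"
```

If the data already came through cbind, converting the column afterwards, for example with as.numeric, is one way to recover.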