I want every row for each unique ID (name) to get the same first.date value, taken from the first.date of that ID's rows where fruit == 'apple'.
This is what I have:
names dates fruit first.date
1 john 2010-07-01 kiwi <NA>
2 john 2010-09-01 apple 2010-09-01
3 john 2010-11-01 banana <NA>
4 john 2010-12-01 orange <NA>
5 john 2011-01-01 apple 2010-09-01
6 mary 2010-05-01 orange <NA>
7 mary 2010-07-01 apple 2010-07-01
8 mary 2010-07-01 orange <NA>
9 mary 2010-09-01 apple 2010-07-01
10 mary 2010-11-01 apple 2010-07-01
This is what I want:
names dates fruit first.date
1 john 2010-07-01 kiwi 2010-09-01
2 john 2010-09-01 apple 2010-09-01
3 john 2010-11-01 banana 2010-09-01
4 john 2010-12-01 orange 2010-09-01
5 john 2011-01-01 apple 2010-09-01
6 mary 2010-05-01 orange 2010-07-01
7 mary 2010-07-01 apple 2010-07-01
8 mary 2010-07-01 orange 2010-07-01
9 mary 2010-09-01 apple 2010-07-01
10 mary 2010-11-01 apple 2010-07-01
Here is my disastrous attempt:
getdates$first.date[is.na]<-getdates[getdates$first.date & getdates$fruit=='apple',]
Thanks in advance.
Reproducible data frame:
names<-as.character(c("john", "john", "john", "john", "john", "mary", "mary","mary","mary","mary"))
dates<-as.Date(c("2010-07-01", "2010-09-01", "2010-11-01", "2010-12-01", "2011-01-01", "2010-05-01", "2010-07-01", "2010-07-01", "2010-09-01", "2010-11-01"))
fruit<-as.character(c("kiwi","apple","banana","orange","apple","orange","apple","orange", "apple", "apple"))
first.date<-as.Date(c(NA, "2010-09-01",NA,NA, "2010-09-01", NA, "2010-07-01", NA, "2010-07-01","2010-07-01"))
getdates<-data.frame(names,dates,fruit, first.date)
Answer 0 (score: 3)
It's not clear what you want to do when there are multiple first.date entries for apple (for a given name); this just takes the first one:
library(data.table)
dt = data.table(getdates)
dt[, first.date := first.date[fruit == 'apple'][1], by = names]
dt
# names dates fruit first.date
# 1: john 2010-07-01 kiwi 2010-09-01
# 2: john 2010-09-01 apple 2010-09-01
# 3: john 2010-11-01 banana 2010-09-01
# 4: john 2010-12-01 orange 2010-09-01
# 5: john 2011-01-01 apple 2010-09-01
# 6: mary 2010-05-01 orange 2010-07-01
# 7: mary 2010-07-01 apple 2010-07-01
# 8: mary 2010-07-01 orange 2010-07-01
# 9: mary 2010-09-01 apple 2010-07-01
#10: mary 2010-11-01 apple 2010-07-01
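If you would rather not depend on data.table, the same idea can be sketched in base R with ave(). This is only a rough sketch under one assumption: that first.date should be the earliest apple date per name, computed from the dates column, which settles the duplicate-entry question by taking the minimum. The helper names apple_dates and first_num are made up for illustration.

```r
# Rebuild the example data from the question (without first.date)
getdates <- data.frame(
  names = c(rep("john", 5), rep("mary", 5)),
  dates = as.Date(c("2010-07-01", "2010-09-01", "2010-11-01", "2010-12-01",
                    "2011-01-01", "2010-05-01", "2010-07-01", "2010-07-01",
                    "2010-09-01", "2010-11-01")),
  fruit = c("kiwi", "apple", "banana", "orange", "apple",
            "orange", "apple", "orange", "apple", "apple"),
  stringsAsFactors = FALSE
)

# Keep dates only on apple rows; every other row becomes NA
apple_dates <- getdates$dates
apple_dates[getdates$fruit != "apple"] <- NA

# Earliest apple date within each name, spread over all of that name's rows.
# ave() is run on plain numbers (days since 1970-01-01) and converted back.
first_num <- ave(
  as.numeric(apple_dates), getdates$names,
  FUN = function(x) if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
)
getdates$first.date <- as.Date(first_num, origin = "1970-01-01")

getdates
```

A name with no apple rows at all ends up with NA in first.date. On the example data the minimum and the first apple entry coincide, so the result should match the data.table output above.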