Say I have a data frame of people's procedure start and end dates, in person × procedure "long" format:
df <- data.frame(person.id = c(1,1,2,2,3,3),
start.date = c("2015-01-01", "2015-01-05", "2016-05-06", "2015-04-01", "2015-07-01", "2015-01-06"),
end.date = c("2015-01-30", "2015-02-05", "2016-06-23", "2015-05-30", "2015-08-10", "2015-02-05"),
procedure = c("alpha", "beta", "alpha", "beta", "alpha", "beta"))
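How can I create a variable at the person level (i.e., under group_by(person.id)) that holds the start date of that person's "alpha" procedure? I can think of some longer workarounds, but I'm wondering whether there is an elegant way to do it within group_by and mutate, something like: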
df %<>%
group_by(person.id) %>%
mutate(alpha.start.date = #??)
Thanks!
Answer 0 (score: 2)
We can use mutate to create the variable by taking, within each person, the start.date that corresponds to the "alpha" procedure:
library(dplyr)
df %>%
  group_by(person.id) %>%
  # within each person, pick the start.date of the row whose procedure is "alpha"
  mutate(alpha.start.date = start.date[procedure == "alpha"])
# A tibble: 6 x 5
# Groups: person.id [3]
#   person.id start.date end.date   procedure alpha.start.date
#       <dbl> <fct>      <fct>      <fct>     <fct>
#1          1 2015-01-01 2015-01-30 alpha     2015-01-01
#2          1 2015-01-05 2015-02-05 beta      2015-01-01
#3          2 2016-05-06 2016-06-23 alpha     2016-05-06
#4          2 2015-04-01 2015-05-30 beta      2016-05-06
#5          3 2015-07-01 2015-08-10 alpha     2015-07-01
#6          3 2015-01-06 2015-02-05 beta      2015-07-01
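The subsetting approach relies on every person having exactly one "alpha" row. If a person could have none, or more than one, a slightly more defensive variant may help. The following is only a sketch under assumptions not stated in the question: the dates are converted to Date with as.Date, the first "alpha" row in row order is used when there are several, and NA is acceptable when there is none. The name result is just illustrative.

```r
library(dplyr)

# Same example data as in the question, with dates kept as strings
df <- data.frame(
  person.id  = c(1, 1, 2, 2, 3, 3),
  start.date = c("2015-01-01", "2015-01-05", "2016-05-06",
                 "2015-04-01", "2015-07-01", "2015-01-06"),
  end.date   = c("2015-01-30", "2015-02-05", "2016-06-23",
                 "2015-05-30", "2015-08-10", "2015-02-05"),
  procedure  = c("alpha", "beta", "alpha", "beta", "alpha", "beta"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # assumption: real Date columns are wanted rather than strings/factors
  mutate(start.date = as.Date(start.date),
         end.date   = as.Date(end.date)) %>%
  group_by(person.id) %>%
  # first() returns NA for a person with no "alpha" row and takes the
  # first matching row (in row order) when there are several
  mutate(alpha.start.date = first(start.date[procedure == "alpha"])) %>%
  ungroup()
```

If the earliest "alpha" start is wanted rather than the first in row order, min(start.date[procedure == "alpha"]) could be used instead, though it warns and returns Inf for a group with no "alpha" row.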