I have data that looks like this:
df <- data.frame(
ID = c(rep("A12345",5), rep("A23456",10), rep("A34567",5), "A45678", "A67891", rep("A78910",8), "A91011",
rep("A10111",4), rep("A11121",3), "A12131", "A16731"),
indicator = c(rep("colchicine",5), rep("febuxosat",9), "hosps", rep("colchicine",5), "hosps", "colchicine",
rep("allopurinol",8), "allopurinol",
rep("colchicine",3), "hosps", rep("colchicine",3), "colchicine", "allopurinol"),
Date = c("2004-12-08", "2005-01-28", "2005-07-15", "2005-08-23", "2005-11-30", "2007-02-01", "2007-07-20", "2014-06-03",
"2008-04-17",
"2008-12-19", "2009-09-09", "2010-02-24", "2010-11-01", "2010-12-03", "2011-08-10", "2012-11-05", "2012-12-17",
"2012-12-19", "2013-10-03", "2013-12-11", "2014-03-26", "2015-11-12", "2014-08-07", "2008-01-31", "2008-02-21",
"2008-09-19", "2008-11-06", "2009-01-06", "2009-01-14", "2009-03-25", "2009-03-27", "2009-06-18", "2009-08-18",
"2009-09-08", "2009-11-13", "2010-01-21", "2010-04-19", "2010-07-07", "2010-08-06", "2010-08-19")
)
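What I want to do is this: if, for a given ID, there is any row where indicator == "hosps", create a new indicator called "hosp_ever" that equals 1 for that ID. If there is never an instance of "hosps" in the indicator variable for that ID, the new "hosp_ever" variable should equal 0.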
Here is what I tried:
library(dplyr)
df_group <- df %>%
group_by(ID) %>%
mutate(hosp_ever = ifelse(indicator == "hosps", "Y", "N"))
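This creates the new hosp_ever variable, but it assigns "Y" only to the rows where indicator == "hosps". Rows where indicator != "hosps" are not flagged correctly, even when their ID had a hosps event on some other date.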
This is what I would like the output to look like:
df_group <- df %>%
mutate(hosp_ever = c(rep("N",5), rep("Y",9), "Y", rep("N",5), "Y", "N",
rep("N",8), "N",
rep("Y",3), "Y", rep("N",3), "N", "N"))
Does anyone know where I am going wrong?
Thanks
Answer 0 (score: 1)
You can use any to check whether at least one indicator value within an ID group is equal to hosps:
library(dplyr)
df_group <- df %>%
group_by(ID) %>%
mutate(hosp_ever = ifelse(any(indicator == "hosps"), "Y", "N"))
df_group
#> # A tibble: 40 x 4
#> # Groups:   ID [11]
#>    ID     indicator  Date       hosp_ever
#>    <fct>  <fct>      <fct>      <chr>
#>  1 A12345 colchicine 2004-12-08 N
#>  2 A12345 colchicine 2005-01-28 N
#>  3 A12345 colchicine 2005-07-15 N
#>  4 A12345 colchicine 2005-08-23 N
#>  5 A12345 colchicine 2005-11-30 N
#>  6 A23456 febuxosat  2007-02-01 Y
#>  7 A23456 febuxosat  2007-07-20 Y
#>  8 A23456 febuxosat  2014-06-03 Y
#>  9 A23456 febuxosat  2008-04-17 Y
#> 10 A23456 febuxosat  2008-12-19 Y
#> # ... with 30 more rows
Created on 2018-06-26 by the reprex package (v0.2.0.9000).
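As a rough illustration of why the attempt in the question flagged only the hosps rows: indicator == "hosps" is evaluated element by element, so ifelse() on its own returns one value per row, whereas any() reduces the comparison to a single TRUE/FALSE per group, which mutate() then recycles to every row of that ID. The sketch below uses a small made-up data frame (toy, its IDs, and the column names hosp_row and hosp_ever_int are placeholders, not part of the original data). It also shows one possible way to get the 1/0 coding the question describes, assuming that converting the logical result with as.integer() is acceptable.

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
toy <- data.frame(
  ID = c("X1", "X1", "X1", "X2", "X2"),
  indicator = c("colchicine", "hosps", "colchicine",
                "allopurinol", "allopurinol"),
  stringsAsFactors = FALSE
)

result <- toy %>%
  group_by(ID) %>%
  mutate(
    # Element-wise test: one value per row, so only the hosps row is "Y"
    hosp_row = ifelse(indicator == "hosps", "Y", "N"),
    # any() yields one value per ID, recycled to all rows of that ID
    hosp_ever = ifelse(any(indicator == "hosps"), "Y", "N"),
    # Same idea, coded as 1/0 instead of "Y"/"N"
    hosp_ever_int = as.integer(any(indicator == "hosps"))
  ) %>%
  ungroup()

result
```

On this toy input, hosp_row is "Y" only on the second row, while hosp_ever is "Y" (and hosp_ever_int is 1) on all three X1 rows and "N" (0) on both X2 rows.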