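I'm trying to use two ifelse statements to create a new date variable that fills the gaps in an existing date variable according to a series of assumptions. Here's an example of what I mean: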
id EffectiveDate EffectiveYear ED_NA EY_NA NewEffectiveDate
1 a 1972-10-05 1972 FALSE FALSE 1972-10-05
2 a <NA> 1985 TRUE FALSE 1985-01-01
3 a 1988-11-12 1988 FALSE FALSE 1988-11-12
4 b 2011-09-05 2011 FALSE FALSE 2011-09-05
5 b <NA> NA TRUE TRUE 2011-09-05
6 b <NA> 2012 TRUE FALSE 2012-01-01
7 c 2012-11-11 2012 FALSE FALSE 2012-11-11
8 c 2013-05-15 2013 FALSE FALSE 2013-05-15
Quick code to build the columns id through EY_NA:
id <- c("a","a","a","b","b","b","c","c")
EffectiveDate <- c("1972-10-05",NA,"1988-11-12","2011-09-05",NA,NA,"2012-11-11","2013-05-15")
EffectiveYear <- c(1972,1985,1988,2011,NA,2012,2012,2013)
tdat <- data.frame(id, EffectiveDate, EffectiveYear)
tdat$ED_NA <- is.na(tdat$EffectiveDate)
tdat$EY_NA <- is.na(tdat$EffectiveYear)
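What I'm trying to create in this example is the "NewEffectiveDate" variable. In plain English: where EffectiveDate is missing but EffectiveYear is not, assume NewEffectiveDate is January 1 of EffectiveYear. Where both EffectiveDate and EffectiveYear are missing, assume the EffectiveDate of the previous observation. And of course, where EffectiveDate is not missing, use EffectiveDate.
Here is the latest code I used to try to solve the problem: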
tdat %>% mutate(NewEffectiveDate = ifelse(ED_NA == 1 & EY_NA == 0,
as.Date(paste(EffectiveYear, 1, 1, sep="-")),
ifelse(ED_NA == 1 & EY_NA == 1),
as.Date(lag(EffectiveDate)),
EffectiveDate
))
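When I try this particular code I get an error message: Error: unused arguments (as.Date(c(NA, 1, NA, 2, 3, NA, NA, 4)), c(1, NA, 2, 3, NA, NA, 4, 5))
I've searched for similar questions, such as "ifelse concatenate date" and a few variations on it, but haven't found anything that seems to apply to this particular problem.
I'm new to R (and to the CLI), so I apologize in advance if I've overlooked a very obvious solution. The transition from Excel to R has been interesting, but often painful when doing tasks that seem relatively simple (although the dplyr package has been very helpful).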
Answer 0: (score: 1)
id <- c("a","a","a","b","b","b","c","c")
EffectiveDate <- c("1972-10-05",NA,"1988-11-12","2011-09-05",NA,NA,"2012-11-11","2013-05-15")
EffectiveYear <- c(1972,1985,1988,2011,NA,2012,2012,2013)
tdat <- data.frame(id, EffectiveDate, EffectiveYear,
stringsAsFactors=FALSE)
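library(dplyr)  # provides %>% and mutate
library(zoo)    # provides na.locf
tdat %>%
  mutate(NewEffectiveDate = ifelse(!is.na(EffectiveDate),
                                   EffectiveDate,
                                   ifelse(is.na(EffectiveDate) & !is.na(EffectiveYear),
                                          paste0(EffectiveYear, "-01-01"),
                                          NA)),
         # carry the last filled value forward; na.rm = FALSE keeps a leading NA instead of dropping it
         NewEffectiveDate = na.locf(NewEffectiveDate, na.rm = FALSE))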
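This should do what you need. I'd suggest using na.locf (last observation carried forward) from the zoo package rather than trying to handle the previous date yourself.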
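To illustrate what na.locf does, here is a minimal sketch on a made-up vector (x is hypothetical, not part of the question's data). It assumes the zoo package is installed.

```r
library(zoo)

# Toy vector with a leading NA and a run of NAs in the middle
x <- c(NA, "2011-09-05", NA, NA, "2012-01-01")

# Default (na.rm = TRUE): the leading NA is dropped, so the result is shorter than x
dropped <- na.locf(x)

# na.rm = FALSE: the leading NA is kept and the length is preserved
kept <- na.locf(x, na.rm = FALSE)

length(dropped)
length(kept)
kept
```

Note that the value carried forward is whatever sits in the previous filled position. In the pipeline above that is the previous NewEffectiveDate, which may itself be a January 1 placeholder rather than an observed EffectiveDate.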
Answer 1: (score: 1)
You could do:
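library(dplyr)
tdat$EffectiveDate <- as.Date(tdat$EffectiveDate)
# ifelse() drops the Date class and returns day counts, so convert back at the end;
# origin is given explicitly because older R versions require it for numeric input
tdat %>% mutate(NewEffectiveDate = as.Date(
  ifelse(!is.na(EffectiveDate), EffectiveDate,
    ifelse(!is.na(EffectiveYear), as.Date(paste(EffectiveYear, 1, 1, sep="-")),
      lag(EffectiveDate))),
  origin = "1970-01-01"
)) -> res
res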
# id EffectiveDate EffectiveYear NewEffectiveDate
# 1 a 1972-10-05 1972 1972-10-05
# 2 a <NA> 1985 1985-01-01
# 3 a 1988-11-12 1988 1988-11-12
# 4 b 2011-09-05 2011 2011-09-05
# 5 b <NA> NA 2011-09-05
# 6 b <NA> 2012 2012-01-01
# 7 c 2012-11-11 2012 2012-11-11
# 8 c 2013-05-15 2013 2013-05-15
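The outer as.Date() call matters because base ifelse() does not keep the Date class. A minimal sketch on a made-up two-element vector (d is hypothetical) shows the effect:

```r
# Toy Date vector with one missing value
d <- as.Date(c("2011-09-05", NA))

# ifelse() returns plain numbers (days since 1970-01-01), not Dates
raw <- ifelse(is.na(d), as.Date("2012-01-01"), d)
class(raw)

# Converting back restores the Date class
fixed <- as.Date(raw, origin = "1970-01-01")
class(fixed)
fixed
```

This is also why mixing a Date branch with a character branch inside ifelse() tends to produce a confusing column of numbers stored as text.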
Answer 2: (score: 0)
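The problem with your ifelse block seems to be that you closed the parenthesis of the second ifelse too early, before giving it its yes and no arguments, so those values were passed as extra arguments to the first ifelse instead.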
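This should work. The inner ifelse now closes after its no argument; EffectiveDate is also converted to Date first and the result wrapped in as.Date, because ifelse returns plain day counts rather than Dates:
library(dplyr)
tdat$EffectiveDate <- as.Date(tdat$EffectiveDate)
tdat %>% mutate(NewEffectiveDate = as.Date(
  ifelse(ED_NA == 1 & EY_NA == 0,
         as.Date(paste(EffectiveYear, 1, 1, sep="-")),
         ifelse(ED_NA == 1 & EY_NA == 1,
                lag(EffectiveDate),
                EffectiveDate)),
  origin = "1970-01-01"))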
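As an alternative to nested ifelse calls (a different technique from the one above, not a correction of it), the three rules can be expressed with dplyr's coalesce(), which takes the first non-missing value position by position and keeps the Date class. This is a sketch under assumptions: YearStart is a hypothetical helper column introduced only for illustration, and the data frame is rebuilt so the block runs on its own.

```r
library(dplyr)

tdat <- data.frame(
  id = c("a", "a", "a", "b", "b", "b", "c", "c"),
  EffectiveDate = as.Date(c("1972-10-05", NA, "1988-11-12", "2011-09-05",
                            NA, NA, "2012-11-11", "2013-05-15")),
  EffectiveYear = c(1972, 1985, 1988, 2011, NA, 2012, 2012, 2013)
)

res <- tdat %>%
  mutate(
    # YearStart (hypothetical helper): January 1 of EffectiveYear, NA where the year is missing
    YearStart = as.Date(if_else(is.na(EffectiveYear), NA_character_,
                                paste0(EffectiveYear, "-01-01"))),
    # Priority order: observed date, then January 1 of the year, then the previous row's date
    NewEffectiveDate = coalesce(EffectiveDate, YearStart, lag(EffectiveDate))
  )

res
```

lag() here looks at the previous row regardless of id. If the previous observation should come from the same id only, adding group_by(id) before mutate would restrict it accordingly.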