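I have a data frame containing three date measurements (date1, date2, date3) along with other measures labelled "s1" and "s2". I am trying to create new columns labelled "x1" and "x2" based on these dates and on "s1" and "s2". For example, if date1 is less than or equal to date2, column "x1" should take the value 3; otherwise it should keep the value of s1. Similarly, if date1 is less than or equal to date3, column "x2" should take the value 3; otherwise it should keep the value of s2. Here is part of the data: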
df <-
  structure(
    list(
      id = c(1L, 2L, 3L, 4L, 5L),
      date1 = c("1/4/2004", "3/8/2004", "NA", "13/10/2004", "11/3/2003"),
      date2 = c("8/6/2002", "11/5/2004", "3/5/2004",
                "25/11/2004", "21/1/2004"),
      s1 = c(1, 2, 1, "NA", "NA"),
      date3 = c("23/6/2006", "24/12/2006", "18/2/2006", "NA", "NA"),
      s2 = c("NA", "NA", 2, "NA", "NA")
    ),
    .Names = c("id", "date1", "date2", "s1", "date3", "s2"),
    class = "data.frame",
    row.names = c(NA, -5L),
    col_types = c("numeric", "date", "date", "numeric", "date", "numeric")
  )
I tried the following code:
df$x1 <- ifelse(df$date1 <= df$date2, 3, df$s1)
df$x2 <- ifelse(df$date1 <= df$date3, 3, df$s2)
which gives:
id date1 date2 s1 date3 s2 x1 x2
1 1 1/4/2004 8/6/2002 1 23/6/2006 NA 3 3
2 2 3/8/2004 11/5/2004 2 24/12/2006 NA 2 NA
3 3 NA 3/5/2004 1 18/2/2006 2 1 2
4 4 13/10/2004 25/11/2004 NA NA NA 3 3
5 5 11/3/2003 21/1/2004 NA NA NA 3 3
In row 2, "3/8/2004" is earlier than "24/12/2006", so I expected 3 in column "x2", but the code left "NA" there instead. Can anyone explain why this happens and how to fix it? Any help is much appreciated.
Answer 0 (score: 1)
The date columns in the data are of character type, so they are compared as strings rather than as dates.
class(df$date1)
#[1] "character"
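To see why this matters, here is a minimal sketch using the two values from row 2 of the question. The variable names (as_text, d1, d3, as_date) are introduced only for illustration. Strings are compared character by character, so "3/8/2004" sorts after "24/12/2006" simply because "3" comes after "2".

```r
# Comparing the raw strings: lexicographic, not chronological
as_text <- "3/8/2004" <= "24/12/2006"
as_text

# Comparing after conversion to Date (day/month/year format assumed,
# as in the question's data)
d1 <- as.Date("3/8/2004", format = "%d/%m/%Y")
d3 <- as.Date("24/12/2006", format = "%d/%m/%Y")
as_date <- d1 <= d3
as_date
```

The first comparison is FALSE and the second is TRUE, which is why the original code fell through to s2 (the string "NA") for that row.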
We first need to convert them to Date objects and then do the comparison:
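# Names of the three date columns: "date1", "date2", "date3"
cols <- paste0("date", 1:3)
# Parse day/month/year strings; the string "NA" becomes a missing Date
df[cols] <- lapply(df[cols], as.Date, format = "%d/%m/%Y")
# 3 where date1 is on or before the other date, otherwise keep s1 / s2
df$x1 <- ifelse(df$date1 <= df$date2, 3, df$s1)
df$x2 <- ifelse(df$date1 <= df$date3, 3, df$s2)
df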
# id date1 date2 s1 date3 s2 x1 x2
#1 1 2004-04-01 2002-06-08 1 2006-06-23 NA 1 3
#2 2 2004-08-03 2004-05-11 2 2006-12-24 NA 2 3
#3 3 <NA> 2004-05-03 1 2006-02-18 2 <NA> NA
#4 4 2004-10-13 2004-11-25 NA <NA> NA 3 NA
#5 5 2003-03-11 2004-01-21 NA <NA> NA 3 NA
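In the output above, any row with a missing date ends up as NA, because ifelse() returns NA wherever its test is NA. If a missing date should instead count as "condition not met" so that the s value is kept, one possible sketch is shown below. The helper flag_or_keep is hypothetical (it is not part of base R or of the question), and the vectors simply repeat the sample data with s2 written as a numeric vector.

```r
# Hypothetical helper: returns `flag` where `earlier <= later` is known to be
# TRUE; otherwise keeps `keep`. A missing date is treated as "not met".
flag_or_keep <- function(earlier, later, keep, flag = 3) {
  cond <- earlier <= later
  cond[is.na(cond)] <- FALSE   # unknown comparison falls back to `keep`
  ifelse(cond, flag, keep)
}

# Sample values from the question (day/month/year)
date1 <- as.Date(c("1/4/2004", "3/8/2004", NA, "13/10/2004", "11/3/2003"),
                 format = "%d/%m/%Y")
date3 <- as.Date(c("23/6/2006", "24/12/2006", "18/2/2006", NA, NA),
                 format = "%d/%m/%Y")
s2 <- c(NA, NA, 2, NA, NA)

x2 <- flag_or_keep(date1, date3, s2)
x2
```

Whether a missing date should keep the s value or yield NA depends on what the column is meant to represent, so treat this as one option rather than the required behaviour.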
Alternatively, depending on the output you need, you can also use dplyr together with replace:
library(dplyr)
df %>%
mutate_at(vars(starts_with("date")), as.Date, "%d/%m/%Y") %>%
mutate(x1 = replace(s1, date1 <= date2, 3),
x2 = replace(s2, date1 <= date3, 3))
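A related detail in the sample data: s1 and s2 mix numbers with the string "NA", so R stores them as character vectors, and any column derived from them can end up as character too. A minimal sketch, assuming numeric results are wanted, is to convert them before building x1 and x2 (s1_num is a name introduced here for illustration):

```r
# s1 as defined in the question: the quoted "NA" forces a character vector
s1 <- c(1, 2, 1, "NA", "NA")
class(s1)

# Convert to numeric; the string "NA" becomes a real missing value
s1_num <- as.numeric(s1)
class(s1_num)
is.na(s1_num)
```

The same conversion would apply to s2, and the converted columns can then be used in either approach shown in this answer.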