I recently ran into a problem: my date variable has not been recorded in a consistent format.
I have a data frame similar to the one shown below.
myFile
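I only need to extract the date, so I would like to create a new variable based on "Date".
The Date variable should contain only the date. The time of day is irrelevant for my purposes and can be ignored.
My goal is to obtain the following data frame: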
Variable1 <- c(10,20,30,40,50)
Variable2 <- c("a", "b", "c", "d", "d")
Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32", "28 october 2018 8:32", "27 october 2018 5:32")
df <- data.frame(Variable1, Variable2, Date)
df
Ideally, the values of Date should also be stored in the proper class (Date).
Thanks in advance.
Answer 0 (score: 1)
tolower( # not strictly necessary, but for consistency
gsub("yesterday", format(Sys.Date()-1, "%d %B %Y"), # convert *day to dates
gsub("today", format(Sys.Date(), "%d %B %Y"),
gsub("\\s*[0-9:]*$", "", # remove the times
c("today 10:45", "yesterday 3:10", "28 october 2018 5:32", "28 october 2018 8:32", "27 october 2018 5:32")))))
# [1] "31 october 2018" "30 october 2018" "28 october 2018" "28 october 2018" "27 october 2018"
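The result above is still a character vector, while the question asks for the Date class. A minimal sketch of one way to finish the job, assuming an English locale for the month names: the helper name `to_date` and its `today` argument are hypothetical (not part of the answer) and exist only so the result is reproducible.

```r
# Hypothetical helper: same gsub() steps as above, followed by as.Date().
# `today` defaults to Sys.Date(); pass a fixed date for reproducible results.
to_date <- function(x, today = Sys.Date()) {
  x <- gsub("\\s*[0-9:]*$", "", x)                          # remove the times
  x <- gsub("yesterday", format(today - 1, "%d %B %Y"), x)  # relative day -> explicit date
  x <- gsub("today", format(today, "%d %B %Y"), x)
  as.Date(x, format = "%d %B %Y")                           # month names are locale-dependent
}

Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32")
NewDate <- to_date(Date, today = as.Date("2018-10-31"))
class(NewDate)
```

With a data frame, the same call can be assigned to a new column, for example `df$NewDate <- to_date(df$Date)`.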
Answer 1 (score: 1)
df$NewDate[grepl("today", df$Date)] <- Sys.Date()          # convert "today" to a date
df$NewDate[grepl("yesterday", df$Date)] <- Sys.Date() - 1  # convert "yesterday" to a date
# Convert explicit dates to Date format (plain call, so no pipe package needs to be loaded)
df$NewDate[is.na(df$NewDate)] <- as.Date(df$Date[is.na(df$NewDate)], format = "%d %b %Y")
class(df$NewDate) <- "Date"                                 # convert column to Date class
df
  Variable1 Variable2                 Date    NewDate
1        10         a          today 10:45 2018-10-31
2        20         b       yesterday 3:10 2018-10-30
3        30         c 28 october 2018 5:32 2018-10-28
4        40         d 28 october 2018 8:32 2018-10-28
5        50         d 27 october 2018 5:32 2018-10-27
Answer 2 (score: 0)
Another solution, using indexing.
Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32", "28 october 2018 8:32", "27 october 2018 5:32")
Date <- sub("today", Sys.Date(), Date)
Date <- sub("yesterday", Sys.Date() - 1, Date)
i <- grep("[[:alpha:]]", Date)
Date[i] <- format(as.POSIXct(Date[i], format = "%d %B %Y %H:%M"), format = "%d %B %Y")
Date[-i] <- format(as.POSIXct(Date[-i]), format = "%d %B %Y")
Date
#[1] "31 October 2018" "30 October 2018" "28 October 2018"
#[4] "28 October 2018" "27 October 2018"
I then noticed the solution by user r2evans, which converts everything to lowercase. If that is needed, finish with
Date <- tolower(Date)
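One caveat with negative indexing: if no element contains letters, `i` is `integer(0)` and `Date[-i]` selects nothing instead of everything. The following is a sketch of a variant that uses a logical mask and returns the Date class directly. It is not the answer's code: the names `today`, `has_month` and `out` are introduced here for illustration, `today` is fixed so the result is reproducible, and an English locale is assumed for the month names.

```r
# Hypothetical variant using a logical mask instead of integer positions.
today <- as.Date("2018-10-31")             # use Sys.Date() in practice
Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32")

Date <- sub("today", format(today), Date)          # e.g. "2018-10-31 10:45"
Date <- sub("yesterday", format(today - 1), Date)

has_month <- grepl("[[:alpha:]]", Date)    # TRUE where a month name remains
out <- rep(as.Date(NA), length(Date))      # Date vector to fill in
out[has_month]  <- as.Date(Date[has_month],  format = "%d %B %Y")
out[!has_month] <- as.Date(Date[!has_month], format = "%Y-%m-%d")
out
```

Because `has_month` and `!has_month` always have the same length as `Date`, both assignments stay well defined even when one of the two groups is empty.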