我有一个df,日期格式如下。
Date Year
<chr> <dbl>
Sunday, Jul 27 2008
Tuesday, Jul 29 2008
Wednesday, July 31 (1) 2008
Wednesday, July 31 (2) 2008
是否有一种简单的方法可以实现以下格式的列和值?我还想在7月31日的两个日期删除(1)和(2)符号。
Date Year Month Day Day_of_Week
2008-07-27 2008 07 27 Sunday
答案 0 :(得分:4)
使用基数R,您可以:
dat <- data.frame(
Date = c("Sunday, Jul 27" ,"Tuesday, Jul 29", "Wednesday, July 31", "Wednesday, July 31"),
Year = rep(2008, 4),
stringsAsFactors = FALSE
)
dts <- as.POSIXlt(paste(dat$Year, dat$Date), format = "%Y %A, %B %d")
POSIXlt
提供基于列表的日期/时间参考。要查看它们,请尝试unclass(dts[1])
。
从这里可以说是学术性的:
dat$Month = 1 + dts$mon # months are 0-based in POSIXlt
dat$Day = dts$mday
dat$Day_of_Week = weekdays(dts)
dat
# Date Year Month Day Day_of_Week
# 1 Sunday, Jul 27 2008 7 27 Sunday
# 2 Tuesday, Jul 29 2008 7 29 Tuesday
# 3 Wednesday, July 31 2008 7 31 Thursday
# 4 Wednesday, July 31 2008 7 31 Thursday
答案 1 :(得分:2)
library(dplyr)
library(lubridate)
dat = data_frame(date = c('Sunday, Jul 27','Tuesday, Jul 29', 'Wednesday, July
31 (1)','Wednesday, July 31 (2)'), year=rep(2008,4))
dat %>%
mutate(date = gsub("\\s*\\([^\\)]+\\)","",as.character(date)),
date = parse_date_time(date,'A, b! d ')) -> dat1
year(dat1$date) <- dat1$year
# A tibble: 4 × 2
date year
<dttm> <dbl>
1 2008-07-27 2008
2 2008-07-29 2008
3 2008-07-31 2008
4 2008-07-31 2008