I need to extract the start year and end year from a vector with values like these.
yr <- c("June 2013 – Present (2 years 9 months)", "January 2012 – June 2013 (1 year 6 months)", "2006 – Present (10 years)", "2002 – 2006 (4 years)")
yr
June 2013 – Present (2 years 9 months)
January 2012 – June 2013 (1 year 6 months)
2006 – Present (10 years)
2002 – 2006 (4 years)
I expect output like this. Does anyone have a suggestion?
start_yr end_yr
2013 2016
2012 2013
2006 2016
2002 2006
Answer 0 (score: 4)
x <- gsub("present", "2016", yr, ignore.case = TRUE)
x <- regmatches(x, gregexpr("\\d{4}", x))
start_yr <- sapply(x, "[[", 1)
end_yr <- sapply(x, "[[", 2)
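This stores the start and end years in two separate variables. If you want them together in one object such as a data frame, edit the code to assign them to y$start_yr and y$end_yr instead.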
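A minimal sketch of that data-frame variant, assuming the yr vector from the question and the same hard-coded 2016 for "Present". The object name y and the conversion to integer are illustrative choices, not part of the answer above.

```r
yr <- c("June 2013 – Present (2 years 9 months)",
        "January 2012 – June 2013 (1 year 6 months)",
        "2006 – Present (10 years)",
        "2002 – 2006 (4 years)")

# Treat "Present" as 2016, as in the answer (an assumption; adjust as needed)
x <- gsub("present", "2016", yr, ignore.case = TRUE)

# Pull out every four-digit run; each list element holds the years of one entry
m <- regmatches(x, gregexpr("[0-9]{4}", x))

# Collect the first and second match of each entry into integer columns
y <- data.frame(
  start_yr = as.integer(sapply(m, "[[", 1)),
  end_yr   = as.integer(sapply(m, "[[", 2))
)
y
```

Note that sapply(m, "[[", 2) assumes every entry contains at least two four-digit numbers after the replacement, which holds for the sample data.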
Answer 1 (score: 0)
Another solution is to use the stringr package:
library(stringr)
x <- str_replace(yr, "Present", "2016")  # replacement given as a string
DF <- as.data.frame(str_extract_all(x, "\\d{4}", simplify = TRUE))
names(DF) <- c("start_yr", "end_yr")
DF
This gives:
start_yr end_yr
1 2013 2016
2 2012 2013
3 2006 2016
4 2002 2006
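Both answers hard-code 2016 as the value of "Present". One possible way to avoid that, sketched here in base R, is to default to the current year from Sys.Date(). The helper name extract_years and its present_year argument are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: extract start/end years, mapping "Present" to a chosen year
extract_years <- function(x, present_year = format(Sys.Date(), "%Y")) {
  # Replace "Present" (any case) with the chosen year
  x <- gsub("present", present_year, x, ignore.case = TRUE)
  # Find all four-digit runs in each entry
  m <- regmatches(x, gregexpr("[0-9]{4}", x))
  # v[1] / v[2] yield NA when an entry has fewer matches, instead of an error
  data.frame(
    start_yr = as.integer(vapply(m, function(v) v[1], character(1))),
    end_yr   = as.integer(vapply(m, function(v) v[2], character(1)))
  )
}

yr <- c("June 2013 – Present (2 years 9 months)",
        "January 2012 – June 2013 (1 year 6 months)",
        "2006 – Present (10 years)",
        "2002 – 2006 (4 years)")

extract_years(yr)                         # "Present" becomes the current year
extract_years(yr, present_year = "2016")  # reproduces the output shown above
```

Indexing with v[2] rather than "[[" means an entry with only one year produces NA in end_yr rather than stopping with a subscript error.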