Say I have a character vector test:
test = c("2014-03-02","2012-09-08","2010-12-11")
I want the result to be the years as numbers, like this:
c(2014,2012,2010)
How can I do this in a simple way? The following currently works, but it is not very pretty:
test = c("2014-03-02","2012-09-08","2010-12-11")
tmp = strsplit(test,split="-")
myYears = as.numeric(unlist(lapply(tmp, function(x) x[[1]])))
I am sure this could be done differently with a regular expression such as "\d{4}", somehow?
Answer 0 (score: 3)
You could try:
as.numeric(substr(test,1,4))
Or:
as.numeric(gsub("^([0-9]{4}).+$","\\1",test))
Another option:
as.numeric(strftime(test,format="%Y"))
Or:
as.POSIXlt(test)$year+1900
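As a quick sanity check, the four variants above can be run side by side. This is only a sketch: the names by_substr, by_regex, by_strftime and by_posixlt are made up for illustration, and it assumes every input is a well-formed "YYYY-MM-DD" string.

```r
test <- c("2014-03-02", "2012-09-08", "2010-12-11")

# String-based: take the first four characters, or capture four leading digits
by_substr <- as.numeric(substr(test, 1, 4))
by_regex  <- as.numeric(gsub("^([0-9]{4}).+$", "\\1", test))

# Date-based: parse the strings as date-times, then pull out the year
by_strftime <- as.numeric(strftime(test, format = "%Y"))
by_posixlt  <- as.POSIXlt(test)$year + 1900  # $year counts years since 1900

# All four should agree on well-formed input
stopifnot(
  identical(by_substr, by_regex),
  identical(by_substr, by_strftime),
  isTRUE(all.equal(by_substr, by_posixlt))
)
```

The string-based versions never check that the input is a real date, whereas the date-based versions have to parse it first, so the two groups may behave differently on malformed input.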
Answer 1 (score: 2)
You can use the sub function. Just replace everything from the first - through the last character with an empty string.
> test = c("2014-03-02","2012-09-08","2010-12-11")
> sub("-.*", "", test)
[1] "2014" "2012" "2010"
> as.numeric(sub("-.*", "", test))
[1] 2014 2012 2010
Answer 2 (score: 0)
Why not do the following, which does everything you did in a single line:
as.numeric(sapply(strsplit(test, "-"), '[', 1))
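This (1) splits each element of the vector on "-", (2) selects the first piece of each element and simplifies the result to a vector, and (3) converts it to numeric.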
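The same three steps can be written out one at a time to show what each intermediate value looks like. This is a sketch only; the names parts, first and years are made up for illustration.

```r
test <- c("2014-03-02", "2012-09-08", "2010-12-11")

# Step 1: split each string on "-"; the result is a list of character vectors
parts <- strsplit(test, "-")

# Step 2: apply the subsetting function `[` with index 1 to each list element;
# sapply() simplifies the result to a plain character vector
first <- sapply(parts, `[`, 1)

# Step 3: convert the character vector to numeric
years <- as.numeric(first)

# A stricter variant of step 2: vapply() fails if any result
# is not a single character string
first_strict <- vapply(parts, `[`, character(1), 1)
```

Here `[` is quoted with backticks, which refers to the same function as the '[' string in the one-liner.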
Answer 3 (score: 0)
test = c("2014-03-02","2012-09-08","2010-12-11")
years <- as.numeric(regmatches(test , regexpr("^\\d+" , test)))
test = c("2014-03-02","2012-09-08","2010-12-11")
# regexpr() returns two things:
# 1. the index of the first match of the regular expression in each string
# 2. the length of that match in characters (the "match.length" attribute)
indices <- regexpr("^\\d+" , test)
# [1] 1 1 1
#attr(,"match.length")
#[1] 4 4 4
# 1. Here the first match starts at the first character of each date,
#    so the result is 1 for every element.
# 2. The regular expression matches 4 digits at the beginning of the string,
#    so the match length is 4.
# The indices variable therefore holds, for each element, the position of the
# first matched character and the number of characters matched from there.
# Passing it to regmatches() extracts only the matched part of the input.
matches <- regmatches(test , indices)
# [1] "2014" "2012" "2010"
# Looking at the indices variable again: regmatches() takes a substring of
# each input starting at the match index and spanning match.length characters,
# which is 4 characters here.
Hope this clarifies the code.
Answer 4 (score: 0)
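Even though you asked for a base R solution, I just wanted to let you know that there is a very simple way to extract the year directly as numeric, using the function year from the package lubridate: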
test = c("2014-03-02","2012-09-08","2010-12-11")
library(lubridate)
year_test <- year(test)
year_test
#[1] 2014 2012 2010
is.numeric(year_test)
# [1] TRUE