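I have financial-year information embedded in the variable table_name as a string of the form 2019-20 (see the example below). I need to drop the two-digit start year (the 19 in 2019) and join the century (20) to the year part after the hyphen (20). For this example, success looks like 2020.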
df <- structure(list(table_name = c("Resident tax rates for 2016-17",
"Resident tax rates for 2016-17", "Resident tax rates for 2016-17",
"Resident tax rates for 2016-17", "Resident tax rates for 2015-16",
"Resident tax rates for 2015-16"), taxable_income = c("$18,201 – $37,000",
"$37,001 – $87,000", "$87,001 – $180,000", "$180,001 and over",
"$18,201 – $37,000", "$37,001 – $80,000"), tax_on_this_income = c("19c for each $1 over $18200",
"$3572 plus 32.5c for each $1 over $37000", "$19822 plus 37c for each $1 over $87000",
"$54232 plus 45c for each $1 over $180000", "19c for each $1 over $18200",
"$3572 plus 32.5c for each $1 over $37000"), cumm_tax_amt = c(0,
3572, 19822, 54232, 0, 3572), tax_rate = c(19, 32.5, 37, 45,
19, 32.5), threshold = c(18200, 37000, 87000, 180000, 18200,
37000)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-6L))
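# Attempt so far (needs the stringr package); it returns only the first year plus the hyphen, e.g. "2016-"
library(stringr)
str_extract(df$table_name, pattern = "\\b\\d+\\b\\-(?=\\d+\\b)")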
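Answer 0 (score: 2)
You can use two capture groups to extract the year from table_name: the first captures the century digits, the second captures the two digits after the hyphen.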
sub(".*(\\d{2})\\d{2}-(\\d{2})", "\\1\\2", df$table_name)
#[1] "2017" "2017" "2017" "2017" "2016" "2016"
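To make the pattern easier to follow, here is a minimal sketch that annotates each part of the regular expression and converts the result to an integer. The vector `table_name` is a shortened stand-in for `df$table_name`, and the names `fin_year` and `fin_year_int` are hypothetical, chosen only for illustration.

```r
# Shortened stand-in for df$table_name from the question
table_name <- c("Resident tax rates for 2016-17",
                "Resident tax rates for 2015-16")

# Pattern breakdown:
#   .*         consumes the leading text
#   (\\d{2})   group 1: the century digits, e.g. "20"
#   \\d{2}-    the two-digit start year and the hyphen (discarded)
#   (\\d{2})   group 2: the two-digit end year, e.g. "17"
fin_year <- sub(".*(\\d{2})\\d{2}-(\\d{2})", "\\1\\2", table_name)

# Convert to integer if a numeric year is needed downstream
fin_year_int <- as.integer(fin_year)
print(fin_year_int)
```

Note that this simply pastes the century of the start year onto the end year, so it assumes both years fall in the same century.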
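Answer 1 (score: 1)
We can use substring, which should be faster. This assumes every string ends with the two-digit year and that the century is always 20.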
paste0("20", substring(df$table_name, nchar(df$table_name)-1))
#[1] "2017" "2017" "2017" "2017" "2016" "2016"
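As a rough check of the two answers against each other, the sketch below runs both on the same input and times them with `system.time`. The vector `x` is a hypothetical stand-in for `df$table_name`, repeated so that any timing difference becomes visible; actual timings depend on the machine and R version, so none are shown here.

```r
# Hypothetical stand-in for df$table_name, repeated to give the timers some work
x <- rep(c("Resident tax rates for 2016-17",
           "Resident tax rates for 2015-16"), 50000)

# Approach from the first answer: regex with two capture groups
by_regex <- sub(".*(\\d{2})\\d{2}-(\\d{2})", "\\1\\2", x)

# Approach from this answer: take the last two characters and prefix "20"
by_substring <- paste0("20", substring(x, nchar(x) - 1))

# Both approaches should agree on input of this shape
same <- identical(by_regex, by_substring)
print(same)

# Rough timings only; results vary between runs and machines
print(system.time(sub(".*(\\d{2})\\d{2}-(\\d{2})", "\\1\\2", x)))
print(system.time(paste0("20", substring(x, nchar(x) - 1))))
```

The two results match only because every string here ends in the two-digit year and every year is in the 2000s; the regex version reads the century from the string, while the substring version hard-codes it.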