I have:
c("Enrolment in secondary school, private school")
I want:
c("secondary school")
"Enrolment in " and the first "," are fixed patterns.
I am not familiar with regular expressions at all. Can anyone help?
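Answer 0 (score: 3)
Here are some alternatives. None of them uses any packages, and all of them work whether x is a single string or a character vector, except (3), which only handles a single string; (3a) is the vectorized version of (3).
They use this input: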
x <- "Enrolment in secondary school, private school"
1) gsub This replaces the prefix and the suffix with the empty string in a single call:
gsub("Enrolment in |,.*", "", x)
## [1] "secondary school"
2) sub This is the same, but done in two separate sub calls:
sub(",.*", "", sub("Enrolment in ", "", x))
## [1] "secondary school"
2a) sub/substring Since we know the length of the prefix, we can replace one of the sub calls with a substring call:
sub(",.*", "", substring(x, 14))
## [1] "secondary school"
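The offset 14 in (2a) is one more than the number of characters in the prefix "Enrolment in " (13, counting the trailing space). A minimal sketch, assuming one would rather derive that offset than hard-code it; the variable names prefix and start are introduced here for illustration only:

```r
x <- "Enrolment in secondary school, private school"

# Fixed prefix taken from the question (note the trailing space).
prefix <- "Enrolment in "

# First character after the prefix: 13 + 1 = 14.
start <- nchar(prefix) + 1

# Drop the prefix by position, then drop everything from the first comma on.
result <- sub(",.*", "", substring(x, start))
result
## [1] "secondary school"
```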
3) strsplit Although one would not normally use this solution, it is possible to use strsplit like this:
strsplit(x, "Enrolment in |,.*")[[1]][2]
## [1] "secondary school"
3a) This generalizes (3) to a vector of strings:
sapply(strsplit(x, "Enrolment in |,.*"), "[", 2)
## [1] "secondary school"
4) read.table This replaces the prefix with a comma, then uses read.table to read the result as comma-separated fields and picks out the second column:
read.table(text = sub("Enrolment in ", ",", x), sep = ",", as.is = TRUE)[[2]]
## [1] "secondary school"
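As a quick check of the claim that these alternatives also work on a vector of strings, here is a small sketch applying (1) and (3a) to a two-element vector. The second element is made up for illustration and is not part of the question:

```r
# Hypothetical vector input; only the first element comes from the question.
xs <- c("Enrolment in secondary school, private school",
        "Enrolment in primary school, public school")

# (1) gsub is vectorized over its input.
res1 <- gsub("Enrolment in |,.*", "", xs)

# (3a) strsplit returns one character vector per element;
# sapply picks the second piece of each.
res3a <- sapply(strsplit(xs, "Enrolment in |,.*"), "[", 2)

res1
## [1] "secondary school" "primary school"
identical(res1, res3a)
## [1] TRUE
```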
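Answer 1 (score: 2)
For example, with the stringr package and a lookbehind that matches everything after "Enrolment in " up to the first comma: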
library(stringr)
str = c("Enrolment in secondary school, private school")
str_extract(str, "(?<=Enrolment in )([^,]+)")
#> [1] "secondary school"
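If installing stringr is not an option, the same lookbehind pattern can probably be used from base R. This is a sketch rather than part of the original answer; it relies on regexpr with perl = TRUE and regmatches, and the variable names s, m and extracted are chosen here for illustration:

```r
s <- "Enrolment in secondary school, private school"

# perl = TRUE switches to PCRE, which supports the (?<=...) lookbehind.
m <- regexpr("(?<=Enrolment in )[^,]+", s, perl = TRUE)

# regmatches returns the matched substrings.
extracted <- regmatches(s, m)
extracted
## [1] "secondary school"
```

One difference to keep in mind: regmatches drops elements that have no match, whereas str_extract returns NA for them, so the two can return vectors of different lengths on messy input.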
You can also do it in two steps, first removing the prefix and then splitting on the comma:
(remove_enrol <- gsub("Enrolment in ", "", str))
#> [1] "secondary school, private school"
(result = strsplit(remove_enrol, ",")[[1]][[1]])
#> [1] "secondary school"
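The [[1]][[1]] indexing above only looks at the first string. A possible vectorized variant of the same two-step idea is sketched below; the second input element is hypothetical, and fixed = TRUE is an added assumption that both patterns should be treated as literal text rather than regular expressions:

```r
# Hypothetical vector input; only the first element comes from the question.
s <- c("Enrolment in secondary school, private school",
       "Enrolment in primary school, public school")

# Step 1: remove the literal prefix from every element.
no_prefix <- gsub("Enrolment in ", "", s, fixed = TRUE)

# Step 2: split each element on the literal comma.
parts <- strsplit(no_prefix, ",", fixed = TRUE)

# Step 3: keep the first field of each element; vapply checks that
# every result is a single string.
first_field <- vapply(parts, `[`, character(1), 1)
first_field
## [1] "secondary school" "primary school"
```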