Remove the beginning and end of a string

Date: 2016-10-10 16:50:06

Tags: r

I have:

c("Enrolment in secondary school, private school")

I want:

c("secondary school")

"Enrolment in" and the first "," are fixed patterns.

I am not at all familiar with regular expressions. Can anyone help?

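2 answers:

Answer 0 (score: 3)

Here are some alternatives. None of them uses any packages, and they all work whether x is a single string or a character vector, except (3), which handles only a single string; (3a) is the vectorized version of (3).

They use this input: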

x <- "Enrolment in secondary school, private school"

1) gsub: This replaces both the prefix and the suffix with the empty string in a single call:

gsub("Enrolment in |,.*", "", x)
## [1] "secondary school"
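
A quick check that this approach is vectorized. The second element of xs below is invented for illustration and does not come from the question:

```r
# Hypothetical input: the second element is made up for illustration
xs <- c("Enrolment in secondary school, private school",
        "Enrolment in primary school, public school")

# gsub processes each element independently, so no loop is needed
res <- gsub("Enrolment in |,.*", "", xs)
print(res)
```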

2) sub: This is the same idea, but with a separate sub call for each part:

sub(",.*", "", sub("Enrolment in ", "", x))
## [1] "secondary school"

2a) sub / substring: Since we know the length of the prefix (13 characters), we can replace one of the sub calls with substring:

sub(",.*", "", substring(x, 14))
## [1] "secondary school"

3) strsplit: Although one would not normally reach for this solution, strsplit can be used as follows:

strsplit(x, "Enrolment in |,.*")[[1]][2]
## [1] "secondary school"

3a) This generalizes (3) to a character vector:

sapply(strsplit(x, "Enrolment in |,.*"), "[", 2)
## [1] "secondary school"
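
Both (3) and (3a) pick element 2 rather than element 1. A small sketch of why: the string starts with a match of the split pattern, so strsplit puts an empty string in front of the piece we want.

```r
x <- "Enrolment in secondary school, private school"

# The prefix matches at position 1, so the first piece is the empty string;
# the ",.*" alternative then consumes the rest of the string
parts <- strsplit(x, "Enrolment in |,.*")[[1]]
print(parts)

# The wanted text is therefore the second piece
res <- parts[2]
```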

4) read.table: This replaces the prefix with a comma, then uses read.table to read the result as comma-separated fields and picks out the second column:

read.table(text = sub("Enrolment in ", ",", x), sep = ",", as.is = TRUE)[[2]]
## [1] "secondary school"
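
To see why the second column is the right one, the intermediate steps can be inspected separately. This is only a sketch; the names line and df are introduced here for illustration:

```r
x <- "Enrolment in secondary school, private school"

# After the substitution the line starts with a comma, so field 1 is empty
line <- sub("Enrolment in ", ",", x)

# Parse the single line as comma-separated fields
df <- read.table(text = line, sep = ",", as.is = TRUE)

print(ncol(df))   # number of fields found on the line
res <- df[[2]]    # the second field holds the text we want
print(res)
```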

Answer 1 (score: 2)

For example:

library(stringr)

str = c("Enrolment in secondary school, private school")

str_extract(str, "(?<=Enrolment in )([^,]+)")
#> [1] "secondary school"
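
Here (?<=Enrolment in ) is a lookbehind: the prefix must be present but is not part of the match, and [^,]+ then takes everything up to the first comma. For comparison, a base R sketch of the same pattern, assuming perl = TRUE so that lookbehind is supported:

```r
x <- "Enrolment in secondary school, private school"

# regexpr finds the first match; perl = TRUE enables the lookbehind syntax
m <- regexpr("(?<=Enrolment in )[^,]+", x, perl = TRUE)

# regmatches returns the matched text (elements without a match are dropped)
res <- regmatches(x, m)
print(res)
```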

You can also do it step by step:

  1. Remove "Enrolment in "
  2. Split on the comma
  3. Take the first part

For example:

    (remove_enrol <- gsub("Enrolment in ", "", str))
    #> [1] "secondary school, private school"
    
    (result = strsplit(remove_enrol, ",")[[1]][[1]])
    #> [1] "secondary school"
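
The [[1]][[1]] indexing above handles a single string only. A possible variant for a character vector, assuming a hypothetical input strs whose second element is invented for illustration:

```r
# Hypothetical input: the second element is made up for illustration
strs <- c("Enrolment in secondary school, private school",
          "Enrolment in primary school, public school")

# Step 1: remove the fixed prefix from every element
remove_enrol <- sub("Enrolment in ", "", strs)

# Steps 2 and 3: split each element on the comma and keep the first piece;
# vapply checks that every result is a single string
res <- vapply(strsplit(remove_enrol, ",", fixed = TRUE),
              function(p) p[1], character(1))
print(res)
```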