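Match everything after the second occurrence of a word

Time: 2020-02-25 17:01:24

Tags: r regex

I have a string in R, and I want to use a regular expression to match everything after the second occurrence of a word.

Ex: return everything after the second occurrence of "is" in: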

"This is a string of example. this is what i should get in return".

Expected output

what i should get in return

I tried something like ([^is]+)(?:is[^is]+){2}$, but it does not work.

Thanks.

3 answers:

Answer 0: (score: 1)

You can use a PCRE pattern like

^(?>.*?\sis\s+){2}\K.*

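See the regex demo

Details

  • ^ - start of the string
  • (?>.*?\sis\s+){2} - an atomic group that matches two consecutive occurrences of:
    • .*? - any 0+ characters other than line break characters, as few as possible
    • \s - a whitespace character
    • is - the word is
    • \s+ - 1 or more whitespace characters
  • \K - the match reset operator, which discards the text matched so far from the returned match
  • .* - the rest of the line.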

R demo

x <- "This is a string of example. this is what i should get in return"
regmatches(x, regexpr("^(?>.*?\\sis\\s+){2}\\K.*", x, perl=TRUE))
## => [1] "what i should get in return"

Using stringr

stringr::str_match(x, "^(?>.*?\\sis\\s+){2}(.*)")[,2]
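
If the word or the occurrence count varies, the same PCRE pattern can be built from parameters. The following is a minimal sketch, not part of the original answer: after_nth() is a hypothetical helper name, and it assumes that word contains no regex metacharacters and is surrounded by whitespace, as in the pattern above.

```r
# Hypothetical helper (an illustration, not from the original answer):
# return everything after the n-th whitespace-delimited occurrence of `word`.
# Assumption: `word` contains no regex metacharacters.
after_nth <- function(x, word, n = 2) {
  # Build "^(?>.*?\s<word>\s+){n}\K.*" from the arguments
  pattern <- sprintf("^(?>.*?\\s%s\\s+){%d}\\K.*", word, n)
  m <- regexpr(pattern, x, perl = TRUE)
  # regmatches() drops elements that did not match
  regmatches(x, m)
}

x <- "This is a string of example. this is what i should get in return"
after_nth(x, "is", 2)
```

When the string has fewer than n occurrences, the pattern does not match and regmatches() returns character(0).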

Answer 1: (score: 1)

Using the stringr package, you can combine str_locate_all() with str_sub(). str_locate_all() finds every match of "is" as a whole word; [2, 2] picks the end position ([, 2]) of the second match ([2, ]). Adding + 1 makes the substring start one character to the right of where that "is" ends, so the result keeps the leading space.

library(stringr)
str_sub(text, str_locate_all(text, "\\bis\\b")[[1]][2, 2] + 1)
[1] " what i should get in return"

Data

text <- "This is a string of example. this is what i should get in return"
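
The same locate-then-slice idea can also be sketched in base R, in case stringr is not available. This is an illustration rather than the answerer's code: it swaps in gregexpr() and substring() for the stringr calls, and trimws() is added here only to drop the leading space.

```r
text <- "This is a string of example. this is what i should get in return"

# Start positions of every whole-word "is"; match lengths are stored
# in the "match.length" attribute of the result
m <- gregexpr("\\bis\\b", text, perl = TRUE)[[1]]

# End position of the second match
end_second <- m[2] + attr(m, "match.length")[2] - 1

# Everything after that position, with surrounding whitespace removed
rest <- trimws(substring(text, end_second + 1))
rest
```

This sketch assumes the string contains at least two matches; otherwise m[2] is NA and the result is not meaningful.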

Answer 2: (score: 0)

You can use unglue

txt <- "This is a string of example. this is what i should get in return"

library(unglue)
unglue_vec(txt, "{=.*?} is {=.*?} is {x}")
#> [1] "what i should get in return"

Created on 2020-02-26 by the reprex package (v0.3.0)