I need help with a regular expression in R.
library(stringr)
text <- "Detailed Description, {type:status-update,activityText:Closed,date:2018-06-01T12:00:15+0200,status:Closed}, {type:status-update,activityText:Inprogress,date:2018-06-01T12:00:15+0200,status:Inprogress}, Responsible:ABC"
str_extract_all(text, "status-update.a")
The result is:
[[1]]
[1] "status-update,a" "status-update,a"
Along the same lines, I tried the following code:
str_extract_all(text, "status-update[[:print:]]+}")
I expected it to return the following:
[[1]]
[1] "type:status-update,activityText:Closed,date:2018-06-01T12:00:15+0200,status:Closed"
[2] "type:status-update,activityText:Inprogress,date:2018-06-01T12:00:15+0200,status:Inprogress"
I only want to extract the parts inside the curly braces, but I get the following error:
Error in stri_extract_all_regex(string, pattern, simplify = simplify, :
Syntax error in regexp pattern. (U_REGEX_RULE_SYNTAX)
Answer 0 (score: 5)
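Curly braces are part of regular expression syntax (they delimit quantifiers such as {2,3}), so to match them literally you have to put an escape character in front of them. Inside an R string the backslash itself must be doubled, hence \\{ and \\}.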
str_extract_all(text, "\\{.+?\\}")
#[[1]]
#[1] "{type:status-update,activityText:Closed,date:2018-06-01T12:00:15+0200,status:Closed}"
#[2] "{type:status-update,activityText:Inprogress,date:2018-06-01T12:00:15+0200,status:Inprogress}"
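As a small sketch of why the `?` after `.+` matters once the braces are escaped, the two quantifiers can be compared on the sample string. The variable names `greedy` and `lazy` are only illustrative and are not part of the question or the answer.

```r
library(stringr)

text <- "Detailed Description, {type:status-update,activityText:Closed,date:2018-06-01T12:00:15+0200,status:Closed}, {type:status-update,activityText:Inprogress,date:2018-06-01T12:00:15+0200,status:Inprogress}, Responsible:ABC"

# Escaped braces, greedy quantifier: `.+` runs on to the LAST "}" in the
# string, so both records come back fused into a single match.
greedy <- str_extract_all(text, "\\{.+\\}")[[1]]
length(greedy)
#> [1] 1

# Escaped braces, lazy quantifier: `.+?` stops at the FIRST "}" that
# follows each "{", so every record is its own match.
lazy <- str_extract_all(text, "\\{.+?\\}")[[1]]
length(lazy)
#> [1] 2
```

The escaping removes the syntax error, and the lazy quantifier is what keeps the two records separate.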
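To capture only the text inside the {}, use the regex lookbehind and lookahead constructs. They check the surrounding text without including it in the match.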
str_extract_all(text, "(?<=(\\{)).+?(?=\\})")
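What the pattern means:
(?<= )  lookbehind: the match must be preceded by the enclosed pattern
\\{     a literal left curly brace (what the lookbehind checks for)
.+      at least 1 character (any character)
?       make the preceding + lazy (without it the match grabs everything up to the last })
(?= )   lookahead: the match must be followed by the enclosed pattern
\\}     a literal right curly brace (what the lookahead checks for)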
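For reference, here is a sketch of what the lookaround pattern returns on the sample string, together with two alternatives that are not part of the answer above: a negated character class in place of the lazy quantifier, and a capture group read back with `str_match_all()`. The names `inner`, `inner_neg` and `inner_grp` are made up for illustration.

```r
library(stringr)

text <- "Detailed Description, {type:status-update,activityText:Closed,date:2018-06-01T12:00:15+0200,status:Closed}, {type:status-update,activityText:Inprogress,date:2018-06-01T12:00:15+0200,status:Inprogress}, Responsible:ABC"

# Lookbehind / lookahead: the braces are checked but not returned.
inner <- str_extract_all(text, "(?<=\\{).+?(?=\\})")[[1]]
inner
#> [1] "type:status-update,activityText:Closed,date:2018-06-01T12:00:15+0200,status:Closed"
#> [2] "type:status-update,activityText:Inprogress,date:2018-06-01T12:00:15+0200,status:Inprogress"

# Alternative 1: "anything except a closing brace" instead of a lazy `.+?`.
inner_neg <- str_extract_all(text, "(?<=\\{)[^\\}]+(?=\\})")[[1]]

# Alternative 2: match the braces, but keep only the capture group.
# str_match_all() returns one matrix per input string; column 1 holds the
# full match and column 2 holds the first group.
inner_grp <- str_match_all(text, "\\{(.+?)\\}")[[1]][, 2]
```

All three approaches should give the same two strings here. The capture-group form can be easier to read when the surrounding delimiters get more complicated than a single character.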