I'd like to replace the word "fan" with "fanatic" when a pronoun-verb combination comes before it, as shown below.
gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)",
'\\1\\2atic\\3',
'He\'s the bigest fan I know.',
perl = TRUE, ignore.case = TRUE
)
## [1] "He's the bigest He'saticHe's I know."
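I know the numbered backreferences are referring to the inner parentheses of the first group. Is there a way to make them refer only to the three outer sets of parentheses, which in pseudocode are: (stuff before fan)(fan)(s\\b)?
I know my regex matches the whole group, because replacing the match with an empty string works. It is only the backreference part that goes wrong.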
gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)",
'',
'He\'s the bigest fan I know.',
perl = TRUE, ignore.case = TRUE
)
## [1] " I know."
Desired output:
## [1] "He's the bigest fanatic I know."
inputs <- c(
"He's the bigest fan I know.",
"I am a huge fan of his.",
"I know she has lots of fans in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)
outputs <- c(
"He's the bigest fanatic I know.",
"I am a huge fanatic of his.",
"I know she has lots of fanatics in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)
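Answer 0 (score: 4)
I understand you are having trouble with too many capturing groups. Turn the ones you are not interested in into non-capturing groups, or remove the ones that are entirely redundant: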
((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\b(Fan)(s?)\b
See the regex demo.
Note that [Ff] can be reduced to F or f, since you are using the ignore.case=TRUE argument.
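As a quick check of the note about [Ff], here is a minimal sketch comparing an explicit character class against a plain literal combined with ignore.case = TRUE. The test strings are hypothetical and not from the original post, and the pronoun prefix is left out to keep the comparison small.

```r
# Hypothetical test strings, not taken from the question.
x <- c("a Fan", "two fans", "FAN")

# Explicit character class with case-sensitive matching.
with_class <- gsub("\\b([Ff]an)(s?)\\b", "\\1atic\\2", x, perl = TRUE)

# Plain literal with case-insensitive matching.
with_flag <- gsub("\\b(fan)(s?)\\b", "\\1atic\\2", x,
                  perl = TRUE, ignore.case = TRUE)

print(with_class)
print(with_flag)
```

Because the replacement reuses the captured text through \\1, the original capitalization of the match is kept. The flag also matches spellings such as "FAN" that the character class does not, so the two forms are only equivalent for "fan" and "Fan".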
gsub(
"((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b",
'\\1\\2atic\\3',
inputs,
perl = TRUE, ignore.case = TRUE
)
Output:
[1] "He's the bigest fanatic I know."
[2] "I am a huge fanatic of his."
[3] "I know she has lots of fans in his club"
[4] "I was cold and turned on the fan"
[5] "An air conditioner is better than 2 fans at cooling."
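To see which text each numbered group holds, one option is to inspect the match with regexec() and regmatches(). The sketch below assumes R 3.3.0 or later, where regexec() accepts perl = TRUE; the variable names are illustrative only. The first element of each result is the whole match, and the following elements are groups 1, 2, 3, and so on, numbered by the position of their opening parenthesis.

```r
x <- "He's the bigest fan I know."

# Pattern from the question: every pair of parentheses captures.
original <- "(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)"

# Pattern from the answer: only three capturing groups remain.
revised <- "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b"

groups_original <- regmatches(
  x, regexec(original, x, perl = TRUE, ignore.case = TRUE)
)[[1]]
groups_revised <- regmatches(
  x, regexec(revised, x, perl = TRUE, ignore.case = TRUE)
)[[1]]

# Elements 2 to 4 are what \\1, \\2 and \\3 refer to in the original pattern.
print(groups_original[2:4])

# Whole match followed by groups 1 to 3 of the revised pattern.
print(groups_revised)
```

In the original pattern, \\2 and \\3 both capture "He's", which is why the first attempt produced "He'saticHe's". In the revised pattern, groups 1 to 3 line up with the text before "fan", the word "fan" itself, and the optional "s".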