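I have a problem that should be easy to solve: I want to replace every whole word in a string that starts with a given pattern.
> test <- "i really wasn aware and i wasnt aware at all. but i wasn't aware. just wasn't."
## this is what i want
> output
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't."
The best I have come up with so far is
# this is what i get, but it's not correct
> gsub("\\<wasn*.\\>", "wasn't", test)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't't aware. just wasn't't."
I am really out of ideas.
I would also be happy with a second variant of the output.
Edit: It seems my question was a bit too specific, so I am adding other test cases. Basically, I do not know which characters will follow "wasn", and I want to convert all of them to "wasn't".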
Answer 0: (score: 2)
You can use a negative lookahead, which is available with perl=TRUE. The pattern is wasn(?!')t*
gsub("wasn(?!')t*","wasn't",test,perl=TRUE)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't."
Or you can do it this way:
gsub("wasn'*t*","wasn't",test)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't."
For the second desired output:
gsub("wasn'*t*[.]?","wasn't",test)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't"
After the edit:
gsub("wasn[^. ]*","wasn't",test)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't. this wasn't meant to be. it wasn't simple"
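To compare the first two patterns side by side, here is a minimal sketch. The variable names (`expected`, `lookahead`, `optional`, `edge`) are my own, and the string `edge` is a hypothetical input that is not part of the question.

```r
test <- "i really wasn aware and i wasnt aware at all. but i wasn't aware. just wasn't."
expected <- "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't."

# Negative lookahead: requires the PCRE engine, hence perl = TRUE
lookahead <- gsub("wasn(?!')t*", "wasn't", test, perl = TRUE)
# Optional apostrophe followed by optional t: works with the default engine
optional <- gsub("wasn'*t*", "wasn't", test)

print(identical(lookahead, expected))  # TRUE
print(identical(optional, expected))   # TRUE

# Hypothetical edge case: apostrophe present, but the t is missing
edge <- "it wasn' me"
edge_lookahead <- gsub("wasn(?!')t*", "wasn't", edge, perl = TRUE)
edge_optional  <- gsub("wasn'*t*", "wasn't", edge)
print(edge_lookahead)  # "it wasn' me"   (lookahead rejects the match)
print(edge_optional)   # "it wasn't me"
```

Both patterns agree on the question's input. They differ only when an apostrophe follows wasn without a t: the lookahead version skips such a word entirely, while the second version repairs it.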
Answer 1: (score: 1)
I suggest a solution like this:
test <- c("i really wasn aware and i wasnt aware at all. but i wasn't aware. just wasn't. this wasn45'e meant to be. it wasn@'re simple", "Wasn&^$tt that nice?", "You say wasnmmmt?", "No, he wasn&#t#@$.", "She wasn%#@t##, I know.")
gsub("\\b(wasn)\\S*\\b(?:\\S*(\\p{P})\\B)?", "\\1't\\2", test, ignore.case=TRUE, perl=TRUE)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't. this wasn't meant to be. it wasn't simple"
[2] "Wasn't that nice?"
[3] "You say wasn't?"
[4] "No, he wasn't."
[5] "She wasn't, I know."
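This solution also handles cases where wasn* appears at the start of the string or is capitalized, and it does not replace trailing punctuation.
Pattern details
\\b - a word boundary
(wasn) - capturing group 1 (referenced later as \\1 in the replacement pattern): the substring wasn (case-insensitive because of ignore.case=TRUE)
\\S*\\b - any 0+ characters other than whitespace, followed by a word boundary
(?:\\S*(\\p{P})\\B)? - an optional non-capturing group, matching 1 or 0 occurrences of:
  \\S* - 0+ non-whitespace characters
  (\\p{P}) - capturing group 2 (referenced later as \\2 in the replacement pattern): any 1 punctuation character (not a symbol! \p{P} is not equal to [:punct:]!)
  \\B - a non-word boundary: the punctuation must not be followed by a letter, digit, or _
For messier strings (such as She wasn%#@t##,$#^ I know.), where the punctuation can sit among other punctuation, you can use a custom bracket expression to restrict which punctuation to stop at, and add \S* at the end:
gsub("\\b(wasn)\\S*\\b(?:\\S*([?!.,:;])\\S*)?", "\\1't\\2", test, ignore.case=TRUE, perl=TRUE)
See the regex demo.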
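The remark that \p{P} is not the same as [:punct:] can be checked directly. The sketch below is only an illustration: the sample characters and the names `chars`, `is_p`, and `is_punct` are my own, and it assumes the PCRE library behind perl = TRUE was built with Unicode property support, which the answer's pattern also relies on.

```r
# A few ASCII characters: the first three are Unicode punctuation,
# the last three are Unicode symbols (currency, modifier, math)
chars <- c(",", ".", "?", "$", "^", "+")

# \p{P} matches only the Unicode "punctuation" categories
is_p <- grepl("\\p{P}", chars, perl = TRUE)
# [[:punct:]] also matches ASCII symbols such as $ ^ +
is_punct <- grepl("[[:punct:]]", chars, perl = TRUE)

print(data.frame(char = chars, p_P = is_p, punct = is_punct))
```

Because $ and ^ are symbols rather than punctuation, capturing group 2 in the pattern above never captures them, which is why a string like wasn&^$tt is collapsed to wasn't without keeping any of those characters.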
Answer 2: (score: 0)
Why not keep it simple and replace any word that starts with wasn with wasn't?
test2 <- paste0(
  # paste0() adds no separator, so the first piece must end with a space
  "i really wasn aware and i wasnt aware at all. but i wasn't aware. just ",
  "wasn't. this wasn45'e meant to be. it wasn@'re simple"
)
gsub("wasn[^ ]*", "wasn't", test2)
[1] "i really wasn't aware and i wasn't aware at all. but i wasn't aware. just wasn't this wasn't meant to be. it wasn't simple"
If you also need to handle uppercase, you can add ignore.case = TRUE to gsub().
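One thing to keep in mind with ignore.case = TRUE is that the replacement text is literal, so a capitalized Wasn comes back in lowercase. A possible workaround, sketched below, is to capture the matched prefix and reuse it with \\1. The input `x` and the names `lowered` and `preserved` are hypothetical and only serve the illustration.

```r
x <- "Wasnt that nice? it wasn simple"

# Literal replacement: the capital W is lost
lowered <- gsub("wasn[^ ]*", "wasn't", x, ignore.case = TRUE)
print(lowered)    # "wasn't that nice? it wasn't simple"

# Capture the prefix as typed and put it back with \\1
preserved <- gsub("(wasn)[^ ]*", "\\1't", x, ignore.case = TRUE)
print(preserved)  # "Wasn't that nice? it wasn't simple"
```

As in the answer above, [^ ]* consumes everything up to the next space, so punctuation attached directly to the word is replaced along with it.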