Question

I have a character vector.

str = c(".wow", "if.", "not.confident", "wonder", "have.difficulty", "shower")

I am trying to replace each "." that sits between two words with a space, so that the result looks like this:

".wow", "if.", "not confident", "wonder", "have difficulty", "shower"

First, I tried

gsub("[\\w.\\w]", " ", str)
[1] "  o "            "if "             "not confident"   " onder"
[5] "have difficulty" "sho er"

It gives me the spaces I want, but it also strips out every w. Then I tried

gsub("\\w\\.\\w", " ", str)
[1] ".wow"          "if."           "no onfident"   "wonder"
[5] "hav ifficulty" "shower"

It keeps the w's, but it also removes the characters immediately before and after the ".".

I can't use this either:

gsub("\\.", " ", str)
[1] " wow"            "if "             "not confident"   "wonder"
[5] "have difficulty" "shower"

because it also removes every "." that is not between words.

Answer 1

sub('(\\w)\\.(\\w)', '\\1 \\2', str)
# [1] ".wow"            "if."             "not confident"   "wonder"         
# [5] "have difficulty" "shower"

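You create a capture group by placing the part of the pattern you want to group inside a pair of parentheses ( ... ). A backreference recalls whatever that capture group matched.

A backreference is written as a backslash (\) followed by a digit giving the group's number. Inside an R string literal the backslash must itself be escaped, hence '\\1' and '\\2'.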
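To make the capture-group idea concrete, here is a minimal sketch using the vector from the question. The variable name `x` and the extra test string "one.two three.four" are my own additions for illustration, not part of the original post. It also shows that sub() only replaces the first match in each string, whereas gsub() replaces every non-overlapping match.

```r
# Example input from the question (named `x` here, an illustrative choice,
# to avoid masking the base function utils::str)
x <- c(".wow", "if.", "not.confident", "wonder", "have.difficulty", "shower")

# (\\w) captures the word character on each side of the dot;
# \\1 and \\2 write those captured characters back around the space.
res <- gsub("(\\w)\\.(\\w)", "\\1 \\2", x)
print(res)

# Hypothetical string with two dots between words:
# sub() stops after the first match, gsub() keeps going.
two_dots <- "one.two three.four"
sub_res  <- sub("(\\w)\\.(\\w)", "\\1 \\2", two_dots)   # "one two three.four"
gsub_res <- gsub("(\\w)\\.(\\w)", "\\1 \\2", two_dots)  # "one two three four"
print(sub_res)
print(gsub_res)
```

For the vector in the question each element has at most one dot between words, so sub() and gsub() give the same result there.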

Using lookaround assertions:

Lookarounds are zero-width assertions. They do not "consume" any characters of the string.

sub('(?<=\\w)\\.(?=\\w)', ' ', str, perl = TRUE)
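The "does not consume" point matters once a string has consecutive single-character words. The sketch below uses a made-up input, "a.b.c" (my assumption, not from the question), to compare the capture-group pattern with the lookaround pattern under gsub().

```r
# Hypothetical input where the same character ("b") sits between two dots
y <- "a.b.c"

# Capture groups consume the neighbouring characters: after "a.b" is matched,
# "b" is no longer available to anchor the second dot, so it is left alone.
consuming <- gsub("(\\w)\\.(\\w)", "\\1 \\2", y)

# Lookarounds only inspect the neighbours, so "b" can serve both dots.
lookaround <- gsub("(?<=\\w)\\.(?=\\w)", " ", y, perl = TRUE)

print(consuming)   # "a b.c"
print(lookaround)  # "a b c"
```

Note that perl = TRUE is required for the lookaround version, because R's default regex engine does not support lookbehind or lookahead.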

Answer 2

Try

gsub('(\\w)\\.(\\w)', '\\1 \\2', str)
#[1] ".wow"            "if."             "not confident"   "wonder"         
#[5] "have difficulty" "shower"

Or

gsub('(?<=[^.])[.](?=[^.])', ' ', str, perl=TRUE)

Or, as @rawr suggested,

gsub('\\b\\.\\b', ' ', str, perl = TRUE)
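The three patterns agree on the vector in the question, but they are not equivalent in general. As a rough sketch, the comparison below adds one hypothetical element, "a.-b" (my own test case, not from the original post), where the dot is followed by a non-word character that is not itself a dot.

```r
# First three elements come from the question; "a.-b" is an added edge case
x <- c(".wow", "if.", "not.confident", "a.-b")

# Requires a word character on both sides of the dot
word_grp <- gsub("(\\w)\\.(\\w)", "\\1 \\2", x)

# Only requires that the neighbours exist and are not dots
not_dot <- gsub("(?<=[^.])[.](?=[^.])", " ", x, perl = TRUE)

# Requires a word boundary on both sides, i.e. word characters as neighbours
boundary <- gsub("\\b\\.\\b", " ", x, perl = TRUE)

print(word_grp)  # ".wow" "if." "not confident" "a.-b"
print(not_dot)   # ".wow" "if." "not confident" "a -b"
print(boundary)  # ".wow" "if." "not confident" "a.-b"
```

So the [^.] version is the most permissive of the three: it treats any non-dot neighbour, including punctuation or spaces, as good enough, while the \\w and \\b versions insist on word characters.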