Splitting a sentence that contains 2 similar words

Time: 2018-09-16 06:34:03

Tags: r

I need to split a sentence into its words, but I have run into a problem.

word.list <- c("rose","location","criminal","lotus","check","sing","single")

As you can see in the code above, sing and single are both words in my list.

Now I have a sentence:

a <- "rosesinglelocationcriminalcheck"

The following code splits the words:

a1 <- a
for (word in word.list) {
  a1 <- gsub(word, paste0(word, " "), a1)
}

> a1

[1] "rose sing lelocation criminal check "

What I actually need is the following output:

> a1

[1] "rose single location criminal check "

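Because my list contains both sing and single, the code splits on sing and breaks single apart. Is there a way to split the words correctly?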

1 answer:

Answer 0: (score: 1)

For this particular case, just move the space to the other side of the word in gsub:

word.list <- c("rose","location","criminal","lotus","check","sing","single")

a <- "rosesinglelocationcriminalcheck"
for (word in word.list) {
  a <- gsub(word, paste0(" ", word), a)
}
a
#> [1] " rose  single location criminal check"
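
The loop's result depends on the order in which the words are tried, and it leaves stray spaces. As a hedged alternative that is not part of the original answer, one could sort the words longest first and match them all in a single pass with one alternation, so that single is tried before sing. This is only a minimal sketch: it assumes the words contain no regex metacharacters, and the names by_length and pattern are illustrative.

```r
word.list <- c("rose","location","criminal","lotus","check","sing","single")
a <- "rosesinglelocationcriminalcheck"

# Sort longest first so "single" comes before "sing" in the alternation.
by_length <- word.list[order(nchar(word.list), decreasing = TRUE)]

# Build one pattern such as "(criminal|location|single|...)".
# Assumption: the words contain no regex metacharacters.
pattern <- paste0("(", paste(by_length, collapse = "|"), ")")

# With perl = TRUE, alternatives are tried left to right at each position,
# so the longer word wins wherever both could match.
a1 <- gsub(pattern, "\\1 ", a, perl = TRUE)
a1
#> [1] "rose single location criminal check "
```

Wrapping the result in trimws() removes the trailing space if it is not wanted.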

However, I think this approach is very limited. What about singlet? sing, let, and singlet are all meaningful words.
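
To make that ambiguity concrete, here is a small sketch that lists every way a string can be split into words from a list. segment_all is a hypothetical helper written for illustration, not a base R function, and its plain recursion may be slow on long inputs with large word lists.

```r
# Hypothetical helper: return every way to split `s` into words from `dict`,
# each split given as one space-separated string.
segment_all <- function(s, dict) {
  out <- character(0)
  for (w in dict) {
    if (startsWith(s, w)) {
      rest <- substring(s, nchar(w) + 1)
      if (nchar(rest) == 0) {
        # `w` consumes the whole remaining string: one complete split.
        out <- c(out, w)
      } else {
        # Split the remainder recursively and prepend `w` to each result.
        tails <- segment_all(rest, dict)
        if (length(tails) > 0) out <- c(out, paste(w, tails))
      }
    }
  }
  out
}

segment_all("singlet", c("sing", "let", "single", "singlet"))
#> [1] "sing let" "singlet"

word.list <- c("rose","location","criminal","lotus","check","sing","single")
segment_all("rosesinglelocationcriminalcheck", word.list)
#> [1] "rose single location criminal check"
```

When more than one split comes back, as with singlet, the word list alone cannot decide which one was meant, so some extra rule (for example, preferring the fewest words) would have to be chosen.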