Question

I am doing some text mining and want to remove the apostrophe from my text (delete it). I tried using gsub as shown below, but it does not work:

text <- "\"branch"

removeSpecialChars <- function(x){
     result <- gsub('"',x)
     return(result)
}

without <- removeSpecialChars(text)

The desired output is branch rather than "branch. Thanks for your help.

Edit: going one step further (I am trying to clean up text).

The input is a list containing many different strings. For example:

Input <- list(c("e","b", "stackoverflow", "\"branch"))

cleanCorpus <- function(corpus){
  corpus.tmp <- tm_map(corpus, removePunctuation,preserve_intra_word_dashes = TRUE)

  removeSpecialChars <- function(x){
    result <- gsub('"', "",x)
    return(result)
  }
  corpus.tmp <- removeSpecialChars(corpus.tmp)

  corpus.tmp <- tm_map(corpus.tmp, stripWhitespace)
  corpus.tmp <- tm_map(corpus.tmp, content_transformer(tolower))
  corpus.tmp <- tm_map(corpus.tmp, removeWords, stopwords("english"))
  return(corpus.tmp)
}
result <- cleanCorpus(Input)

Answer 1

We need to supply the replacement argument:

gsub('"', "", text)
#[1] "branch"

Data

text <- "\"branch"
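
For the list input in the question's edit, the same call can be applied element by element. This is a minimal sketch in base R only, assuming the input is a plain list of character vectors rather than a tm corpus; strip_quotes is a hypothetical helper name, not something from the question or from any package.

```r
# Hypothetical helper: remove every double-quote character from a character vector.
# fixed = TRUE treats the pattern as a literal string instead of a regular expression.
strip_quotes <- function(x) {
  gsub('"', "", x, fixed = TRUE)
}

# Same input as in the question's edit.
Input <- list(c("e", "b", "stackoverflow", "\"branch"))

# lapply keeps the list structure and cleans each character vector in turn.
cleaned <- lapply(Input, strip_quotes)
print(cleaned[[1]])
```

Passing the pattern, the replacement, and the input in that order is the key point; fixed = TRUE is optional here, since a double quote has no special meaning in a regular expression.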

Answer 2

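result <- gsub("\"", "", text) will do the job for you. The empty string "" is the replacement, and the " inside the double-quoted pattern has to be escaped with a backslash (\").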
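
The two ways of spelling the pattern can be compared side by side. This is a small sketch in base R; the variable names a and b are illustrative only.

```r
text <- "\"branch"

# '"' (single-quoted) and "\"" (double-quoted with an escape) denote the same one-character string.
same_pattern <- identical('"', "\"")

# Either spelling works as the pattern, provided the replacement "" is passed as the second argument.
a <- gsub('"', "", text)
b <- gsub("\"", "", text)
same_result <- identical(a, b)

# The difference in length shows that exactly one character was removed.
removed <- nchar(text) - nchar(a)

print(a)
```

Which spelling to use is a matter of taste: single quotes avoid the backslash, while the escaped form keeps all strings in double quotes.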

Removing a special apostrophe in R
