Question

I want to merge multiple spaces into a single space (a space can also be a tab) and remove trailing/leading spaces.

For example...

string <- "Hi        buddy        what's up    Bro"

to

"Hi buddy what's up bro"

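I checked the solution given at "Regex to replace multiple spaces with a single space". Please note: do not put \t or \n as the exact whitespace inside the toy string and feed that as the pattern to gsub. I want this in R.

Note that I am unable to put multiple spaces in the toy string. Thanks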

Answer 1

This seems to meet your needs.

string <- "  Hi buddy   what's up   Bro "
library(stringr)
str_replace(gsub("\\s+", " ", str_trim(string)), "B", "b")
# [1] "Hi buddy what's up bro"

Answer 2

Another approach using a single regex:

gsub("(?<=[\\s])\\s*|^\\s+|\\s+$", "", string, perl=TRUE)

Explanation:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    [\s]                     any character of: whitespace (\n, \r,
                             \t, \f, and " ")
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
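 |                        OR
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string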
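
To see how the three alternatives behave, here is a small check on made-up inputs (the names `pattern`, `x` and `y` are illustrative, not from the answer). One caveat: the look-behind keeps whichever whitespace character comes first in a run, so a run that starts with a tab is reduced to a tab rather than to a space.

```r
# Pattern from the answer above; perl = TRUE is required for the look-behind.
pattern <- "(?<=[\\s])\\s*|^\\s+|\\s+$"

# Illustrative input: leading/trailing spaces and a mixed space/tab run.
x <- "  Hi \t buddy  "
gsub(pattern, "", x, perl = TRUE)
# [1] "Hi buddy"

# A run that begins with a tab keeps the tab, because only the whitespace
# after the first whitespace character of the run is deleted.
y <- "Hi\t buddy"
gsub(pattern, "", y, perl = TRUE)
# [1] "Hi\tbuddy"
```

If a plain space is always wanted, replacing every run with " " and trimming afterwards avoids this edge case.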

Answer 3

Or simply try the str_squish function from stringr:

> library(stringr)
> string <- "  Hi buddy   what's up   Bro "
> str_squish(string)
[1] "Hi buddy what's up Bro"

Answer 4

qdapRegex has the rm_white function to handle this:

library(qdapRegex)
rm_white(string)

## [1] "Hi buddy what's up Bro"

Answer 5

You do not need to import an external library to perform this task:

string <- " Hi        buddy        what's up    Bro "
string <- gsub("\\s+", " ", string)
string <- trimws(string)
string
[1] "Hi buddy what's up Bro"

Or, in one line:

string <- trimws(gsub("\\s+", " ", string))

Much cleaner.
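
If the same cleanup is needed in several places, the two base R calls can be wrapped in a small helper. This is only a sketch: the name `squish_ws` is hypothetical, and it assumes a character vector as input.

```r
# Hypothetical helper: collapse whitespace runs and trim both ends.
squish_ws <- function(x) {
  # "\\s+" matches runs of spaces, tabs and newlines
  collapsed <- gsub("\\s+", " ", x)
  # after collapsing, at most one space remains at each end
  trimws(collapsed)
}

squish_ws(c(" Hi \t buddy ", "what's   up\nBro"))
# returns c("Hi buddy", "what's up Bro")
```

Because gsub and trimws are both vectorized, the helper works element-wise without an explicit loop.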

Answer 6

You can also try clean from qdap:

library(qdap)
library(stringr)
str_trim(clean(string))
#[1] "Hi buddy what's up Bro"

Or, as @Tyler Rinker suggested (using qdap only):

Trim(clean(string))
#[1] "Hi buddy what's up Bro"

Answer 7

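Since gsub() from base R can do the job, there is no need to load any additional library for this, and no extra libraries to remember. As @Adam Erickson noted, use trimws() to remove leading and trailing whitespace and gsub() to replace the extra whitespace.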

string = " Hi        buddy        what's up    Bro "
trimws(gsub("\\s+", " ", string))

Here \\s+ matches one or more whitespace characters, and gsub replaces each such run with a single space.
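
To make the role of `\\s` concrete, the sketch below compares it with a pattern that matches only literal spaces. The input string is made up for illustration.

```r
x <- "Hi \t\t buddy\n what's  up"

# " +" only collapses literal spaces, so tabs and newlines survive.
gsub(" +", " ", x)
# [1] "Hi \t\t buddy\n what's up"

# "\\s+" treats spaces, tabs and newlines alike.
gsub("\\s+", " ", x)
# [1] "Hi buddy what's up"

# The POSIX class "[[:space:]]+" is an equivalent spelling here.
identical(gsub("[[:space:]]+", " ", x), gsub("\\s+", " ", x))
# [1] TRUE
```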

To find out what any regex is doing, visit the link described by @Tyler Rinker.
Just copy and paste the regex you want to understand, and the site will do the rest.

Answer 8

Another solution using strsplit:

Split the text into words, then join the individual words with the paste function.

string <- "Hi        buddy        what's up    Bro"
# Split on single spaces, then drop the empty strings left by consecutive spaces
stringsplit <- sapply(strsplit(string, " "), function(x) x[x != ""])
paste(stringsplit, collapse = " ")

For multiple documents:

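string <- c("Hi        buddy        what's up    Bro", " an  example using       strsplit ")
# One character vector of non-empty words per document
stringsplit <- lapply(strsplit(string, " "), function(x) x[x != ""])
# Re-join each document's words with single spaces
sapply(stringsplit, function(d) paste(d, collapse = " "))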
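
Splitting on a literal space leaves tabs untouched and produces empty strings that then have to be filtered out. A possible variant, sketched here under the assumption that any whitespace should count as a separator, splits on `\\s+` after trimming; `collapse_ws` is a hypothetical name.

```r
# Hypothetical helper: split each string on whitespace runs, then re-join.
collapse_ws <- function(x) {
  # trimws() first, otherwise a leading run yields an empty first element
  parts <- strsplit(trimws(x), "\\s+")
  # paste each document's words back together with single spaces
  vapply(parts, paste, character(1), collapse = " ")
}

collapse_ws(c("Hi \t buddy   what's up    Bro", " an  example using       strsplit "))
# returns c("Hi buddy what's up Bro", "an example using strsplit")
```

Using vapply with character(1) rather than sapply guarantees a character vector with one element per input document.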

Answer 9

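This seems to work.
Unlike Rich Scriven's answer, it does not remove whitespace at the beginning or end of the sentence, but it does merge multiple whitespace characters.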

library("stringr")
string <- "Hi     buddy     what's      up       Bro"
str_replace_all(string, "\\s+", " ")
# [1] "Hi buddy what's up Bro"
