How do I remove extra spaces from a specific string?

Asked: 2015-11-22 11:28:14

Tags: regex vb.net string replace

I have a string like the one below:

Ireland, UK, United States of America,     Belgium, Germany   , Some     Country, ...

I need help with Regex or the String.Replace function to remove the extra spaces, so that the result looks like this:

Ireland,UK,United States of America,Belgium,Germany,Some Country,

Thanks.

2 Answers:

Answer 0: (score: 4)

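You can do this by splitting the input on commas, trimming each item and collapsing any run of multiple spaces into a single space, and then putting the items back together with String.Join.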

Just to show how it can be done with LINQ:

Console.Write(String.Join(",", _
    "Ireland, UK, United States of America,     Belgium, Germany   , Some     Country," _
     .Split(","c) _
     .Select(Function(m) Regex.Replace(m.Trim(), "\p{Zs}{2,}", " ")) _
     .ToArray()))

The key part is Regex.Replace(m.Trim(), "\p{Zs}{2,}", " "), which collapses two or more consecutive space characters into one.

Result: Ireland,UK,United States of America,Belgium,Germany,Some Country,
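
For anyone who wants to paste this into a project, the split / trim / collapse / join steps can be wrapped up roughly as follows. This is only a minimal sketch: the module name NormalizeDemo and the function name NormalizeList are made up for illustration, and it assumes Option Infer On and a .NET version where String.Join accepts an IEnumerable(Of String).

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module NormalizeDemo

    ' Hypothetical helper name: splits on commas, trims each item,
    ' collapses runs of 2+ space separators into one, then rejoins.
    Function NormalizeList(input As String) As String
        Dim parts = input.Split(","c).
            Select(Function(m) Regex.Replace(m.Trim(), "\p{Zs}{2,}", " "))
        Return String.Join(",", parts)
    End Function

    Sub Main()
        ' The trailing comma survives because the last split item is an empty string.
        Console.WriteLine(NormalizeList("Ireland, UK, United States of America,     Belgium, Germany   , Some     Country,"))
    End Sub

End Module
```

Note that \p{Zs} only covers space separators, so tabs inside an item would be left alone; items are still trimmed of any leading or trailing whitespace by Trim().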

Answer 1: (score: 2)

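Although stribizhev's answer is helpful in this case, I would like to take the opportunity to highlight the (negative) performance impact of using regular expressions for simple tasks.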

A NOTABLY FASTER (x2) ALTERNATIVE TO REGEX (which is always slow when dealing with these situations)

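My approach is based on removing the spaces recursively. I created two versions: the first uses a conventional loop (withoutRegex), and the second relies on LINQ (withoutRegex2; in fact, it is identical to stribizhev's answer except for the Regex part).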

Private Function withoutRegex(input As String) As String

    Dim output As String = ""

    Dim temp() = input.Split(","c)
    For i As Integer = 0 To temp.Length - 1
        output = output & recursiveSpaceRemoval(temp(i).Trim()) & If(i < temp.Length - 1, ",", "")
    Next

    Return output

End Function

Private Function withoutRegex2(input As String) As String

    Return String.Join(",", _
    input _
    .Split(","c) _
    .Select(Function(x) recursiveSpaceRemoval(x.Trim())) _
    .ToArray())

End Function

Private Function recursiveSpaceRemoval(input As String) As String

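    ' Replace each pair of adjacent spaces with a single space.
    Dim output As String = input.Replace("  ", " ")

    ' If nothing changed, no double spaces remain; otherwise run another pass.
    If output = input Then Return output
    Return recursiveSpaceRemoval(output)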

End Function
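
If the recursion looks odd, the same idea can be written as a plain loop. The sketch below is not part of my measurements; CollapseDemo and CollapseSpaces are names I am making up here just to illustrate why repeated replacement terminates: each pass shrinks a run of n spaces to about half (5 becomes 3, then 2, then 1), so only a handful of passes are ever needed.

```vbnet
Imports System

Module CollapseDemo

    ' Hypothetical iterative equivalent of recursiveSpaceRemoval.
    Function CollapseSpaces(input As String) As String
        Dim output As String = input
        ' Keep replacing pairs of spaces until no double space is left.
        While output.Contains("  ")
            output = output.Replace("  ", " ")
        End While
        Return output
    End Function

    Sub Main()
        ' Five spaces between the words: 5 -> 3 -> 2 -> 1.
        Console.WriteLine("[" & CollapseSpaces("Some     Country") & "]") ' prints [Some Country]
    End Sub

End Module
```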

To prove my point, I created the following test harness:

Dim input As String = "Ireland, UK, United States of America,     Belgium, Germany   , Some     Country"
Dim output As String = ""

Dim count As Integer = 0
Dim countMax As Integer = 20
Dim with0 As Long = 0
Dim without As Long = 0
Dim without2 As Long = 0

While count < countMax

    count = count + 1
    Dim sw As Stopwatch = New Stopwatch
    sw.Start()
    output = withRegex(input)
    sw.Stop()
    with0 = with0 + sw.ElapsedTicks

    sw = New Stopwatch
    sw.Start()
    output = withoutRegex(input)
    sw.Stop()
    without = without + sw.ElapsedTicks

    sw = New Stopwatch
    sw.Start()
    output = withoutRegex2(input)
    sw.Stop()
    without2 = without2 + sw.ElapsedTicks

End While

MessageBox.Show("With: " & with0.ToString)
MessageBox.Show("Without: " & without.ToString)
MessageBox.Show("Without 2: " & without2.ToString)

withRegex refers to stribizhev's answer, that is:

Private Function withRegex(input As String) As String

    Return String.Join(",", _
    input _
    .Split(","c) _
    .Select(Function(m) Regex.Replace(m.Trim(), "\p{Zs}{2,}", " ")) _
    .ToArray())

End Function

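This is a simplistic test harness measuring very fast operations, where every bit matters (the point of the 20 loop iterations is precisely to try to make the measurements more reliable). For instance, the results can be affected even by changing the order in which the methods are called.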
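
One way to reduce that sensitivity, sketched here under my own assumptions rather than as the harness actually used for the numbers in this answer, is to make one untimed warm-up call per method and then time many iterations of that method in a single Stopwatch run. TimingSketch and MeasureTicks are hypothetical names, and the iteration count is up to the caller.

```vbnet
Imports System
Imports System.Diagnostics

Module TimingSketch

    ' Hypothetical helper: returns the total Stopwatch ticks spent running
    ' 'work' on 'input' for the given number of iterations.
    Function MeasureTicks(work As Func(Of String, String), input As String, iterations As Integer) As Long
        ' Untimed warm-up call, so one-off costs (such as JIT compilation
        ' or parsing a regex pattern) are not charged to the measurement.
        work.Invoke(input)

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            work.Invoke(input)
        Next
        sw.Stop()

        Return sw.ElapsedTicks
    End Function

End Module
```

From inside the class that holds the three methods, it could be called as MeasureTicks(AddressOf withRegex, input, 10000), and likewise for withoutRegex and withoutRegex2. Whether this changes the ratios reported here is something I have not measured.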

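In any case, the differences among the methods remained more or less consistent across all my tests. The averages I got after a few runs (Stopwatch ticks summed over the 20 iterations) are: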

With: 2500-2700
Without: 1100-1300
Without2: 900-1200

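Note: since this is a general comment on the performance of regular expressions (at least in simple cases, which can easily be replaced by alternatives like the ones I show here), any suggestion on how to improve it (the Regex performance in .NET) is more than welcome. However, please avoid generic, vague statements and be as specific as possible (for example, by suggesting modifications to the test harness).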