Split a string on each number, provided the number does not contain / or %

Time: 2015-04-10 15:56:06

Tags: regex vb.net

I have the following string:

XXX,XXX abc this is a test, X.X% first one, X,XXX,XXX def one X/X more test, XX,XXX,XXX last test

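where X is a digit. I need to split this string into 3 parts:

XXX,XXX abc this is a test, X.X% first one
X,XXX,XXX def one more X/X test
XX,XXX,XXX last test

I never know anything in advance, except that each part starts with a number. I usually use Regex for this kind of thing, and I was wondering whether someone could help me with a regular expression to split this string.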

I tried using Regex.Split(str,"(?=\b\d+\b)")

But it splits inside each number, which is not what I want. It gives me:

XXX,
XXX abc this is a test, 
X.
X% first one
X,
XXX,
XXX def one more 
X/
X test
XX,
XXX,
XXX last test

2 answers:

Answer 0 (score: 0)

You can use: \s+(?=\d+(?:,\d+)+|(?:\d+\s+)|(?:\d+$))

It works for your exact case; see DEMO
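A minimal sketch of how this pattern could be passed to Regex.Split, assuming digits are substituted for the X placeholders. The sample values and the module name SplitDemo are illustrative and not part of the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitDemo
    Sub Main()
        ' Hypothetical sample input: the question's string with digits in place of X
        Dim input As String = "123,456 abc this is a test, 7.8% first one, 9,012,345 def one 6/7 more test, 89,012,345 last test"

        ' Split on whitespace that is followed by a comma-grouped number,
        ' by digits followed by whitespace, or by digits at the end of the string.
        ' All groups are non-capturing, so Split returns only the pieces.
        Const pattern As String = "\s+(?=\d+(?:,\d+)+|(?:\d+\s+)|(?:\d+$))"

        For Each part As String In Regex.Split(input, pattern)
            Console.WriteLine(part)
        Next
    End Sub
End Module
```

With this input the split should yield three pieces, each starting with a number. The whitespace at each split point is consumed, while the trailing comma of the first two pieces is kept, so a Trim or TrimEnd(","c) may still be wanted.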

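Answer 1 (score: 0)

There is undoubtedly a more elegant solution, but here is a working VB console application that does exactly what you ask: it splits the input string into the three strings you describe.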

Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module Module1

Sub Main()
    Dim input As String = "123,456 abc this is a test, 7.8% first one, 9,012,345 def one 6/7 more test, 89,012,345 last test"

    'Split into component strings, with each string starting with a number (no percents or fractions)
    Dim splitStrings As List(Of String) = SplitOnNumbers(input)

    'Write out each; one line per string
    For Each component As String In splitStrings
        Console.WriteLine(component)
    Next
End Sub

Private Function SplitOnNumbers(input As String) As List(Of String)
    Dim splitStrings As New List(Of String)()

    'Pattern to match all digits not preceded by a "/" and not eventually ending in a "/" or "%"
    Const pattern As String = "(?<!\/)(?:\d+[\,,\.]*)+(?!(%|\/|\.|\,|\d)+)"

    'Get the matching numbers...
    Dim matches As MatchCollection = Regex.Matches(input, pattern)
    '...and the text inbetween the matching numbers
    Dim inbetweenText As String() = Regex.Split(input, pattern)

    'Add the first text element that precedes a number, if any.
    If inbetweenText IsNot Nothing AndAlso inbetweenText(0).Length > 0 Then
        splitStrings.Add(inbetweenText(0))
    End If

    'Combine matching numbers with the text after it
    For i As Integer = 0 To matches.Count - 1
        splitStrings.Add(matches(i).Value + inbetweenText(i + 1))
    Next

    Return splitStrings
End Function
End Module

The output is as follows:

123,456 abc this is a test, 7.8% first one,
9,012,345 def one 6/7 more test,
89,012,345 last test

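Regex.Matches() captures the numbers such as "123,456", while Regex.Split() captures everything in between. I manually stitched them back together.

The regex isn't pretty, but it breaks down into three main parts:

  1. (?<!\/) is a negative lookbehind that ensures the number is not preceded by a slash. This excludes the second half of a fraction.
  2. (?:\d+[\,,\.]*)+ matches the digits, along with any commas and periods they contain.
  3. (?!(%|\/|\.|\,|\d)+) ensures the number does not eventually end in "%" or "/". I also had to include the other characters (digits, periods, commas), otherwise you would get a match on "7." in "7.8%".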
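To see why the third part needs the extra characters, here is a small sketch that runs the full pattern and a weakened variant side by side. The weakened pattern, the helper ShowMatches, and the module name LookaheadDemo are made up for illustration and are not part of the answer's code.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookaheadDemo
    ' Hypothetical helper: print each match of the pattern under a label
    Private Sub ShowMatches(label As String, input As String, pattern As String)
        Console.WriteLine(label)
        For Each m As Match In Regex.Matches(input, pattern)
            Console.WriteLine("  [" & m.Value & "]")
        Next
    End Sub

    Sub Main()
        Dim input As String = "123,456 abc this is a test, 7.8% first one, 9,012,345 def one 6/7 more test, 89,012,345 last test"

        ' The pattern from the answer, unchanged
        ShowMatches("Full pattern:", input, "(?<!\/)(?:\d+[\,,\.]*)+(?!(%|\/|\.|\,|\d)+)")

        ' Illustrative variant whose lookahead rejects only "%" and "/"
        ShowMatches("Lookahead limited to % and /:", input, "(?<!\/)(?:\d+[\,,\.]*)+(?!(%|\/)+)")
    End Sub
End Module
```

The full pattern should report only 123,456, 9,012,345 and 89,012,345. The weakened variant should additionally report "7.", because the engine backtracks from "7.8" to "7." and the next character, "8", is neither "%" nor "/".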