VBA regex to extract the text between brackets, without the brackets

Time: 2016-10-28 14:10:45

Tags: regex excel vba excel-vba

In the sentence:

" [x] Alpha

[33] Beta"

I can successfully extract the bracketed data as the array ([x],[33]) using this VBA regex pattern:

"(\[x\])|(\[\d*\])"

But I cannot directly extract the un-bracketed data as the array (x,33), even with the pattern that web resources suggest:
"(?<=\[)(.*?)(?=\])"

Is this a VBA-specific problem (that is, a limitation of its regex implementation), or have I misunderstood the 'lookahead' and 'lookbehind' patterns? My code is:

Public Function Regx( _
ByVal SourceString As String, _
ByVal Pattern As String, _
Optional ByVal IgnoreCase As Boolean = True, _
Optional ByVal MultiLine As Boolean = True, _
Optional ByVal MatchGlobal As Boolean = True) _
As Variant

Dim oMatch As Match
Dim arrMatches
Dim lngCount As Long

' Initialize to an empty array
arrMatches = Array()
With New RegExp
    .MultiLine = MultiLine
    .IgnoreCase = IgnoreCase
    .Global = MatchGlobal
    .Pattern = Pattern
    For Each oMatch In .Execute(SourceString)
        ReDim Preserve arrMatches(lngCount)
        arrMatches(lngCount) = oMatch.Value
        lngCount = lngCount + 1
    Next
End With

' Return the collected matches to the caller
Regx = arrMatches
End Function

Sub testabove()

Call Regx("[x] Alpha" & Chr(13) & _
"[33] Beta", "(\[x\])|(\[\d*\])")
End Sub

4 answers:

Answer 0 (score: 3)

Use capturing groups around the subpatterns to get the values you need.

Use

"\[(x)\]|\[(\d*)\]"

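(or \d+ if you need to match at least one digit: * means zero or more occurrences, + means one or more occurrences).

Then access the right SubMatches index by checking the submatch length (because the pattern is an alternation, one of the two submatches will always be empty). Just change the For Each loop to: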
For Each oMatch In .Execute(SourceString)
    ReDim Preserve arrMatches(lngCount)
    If Len(oMatch.SubMatches(0)) > 0 Then
        arrMatches(lngCount) = oMatch.SubMatches(0)
    Else
        arrMatches(lngCount) = oMatch.SubMatches(1)
    End If
    ' Debug.Print arrMatches(lngCount) ' - This outputs x and 33 with your data
    lngCount = lngCount + 1
Next
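
On the question of whether this is VBA-specific: the VBScript RegExp engine that VBA uses supports lookahead, (?=...), but not lookbehind, (?<=...), so the second pattern in the question cannot work there. Because both alternatives sit inside the same pair of brackets, the alternation can also be folded into one capturing group, so that SubMatches(0) always holds the value. The following is only a minimal sketch under that assumption; it uses late binding through CreateObject, and the names ExtractBracketed and TestExtractBracketed are hypothetical, not part of the original code.

```vba
' Hypothetical helper: returns the text between [ and ] as a zero-based array.
Public Function ExtractBracketed(ByVal SourceString As String) As Variant
    Dim re As Object              ' late-bound VBScript.RegExp, no library reference needed
    Dim oMatch As Object
    Dim arrMatches() As String
    Dim lngCount As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    ' A single capturing group covers both alternatives: a literal x or a run of digits
    re.Pattern = "\[(x|\d+)\]"

    For Each oMatch In re.Execute(SourceString)
        ReDim Preserve arrMatches(lngCount)
        arrMatches(lngCount) = oMatch.SubMatches(0)
        lngCount = lngCount + 1
    Next oMatch

    If lngCount = 0 Then
        ExtractBracketed = Array()    ' no match: return an empty array
    Else
        ExtractBracketed = arrMatches
    End If
End Function

Sub TestExtractBracketed()
    Dim v As Variant
    v = ExtractBracketed("[x] Alpha" & Chr(13) & "[33] Beta")
    Debug.Print Join(v, ",")      ' prints x,33
End Sub
```

With a single group there is no need to test which submatch is empty, at the cost of no longer knowing which alternative produced the match.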

Answer 1 (score: 1)

Try this:

\[(x)\]|\[(\d*)\]

Leave whatever you do not want captured outside the (); the parentheses are what create a capturing group.

Explanation

x is captured in group 1 ($1) and 33 in group 2 ($2).

Dot Net Sample

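OK, I have put this together for you, although I have been away from VB for a long time. Much of it may be unnecessary, but it may help you understand it better.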

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim text As String = "[x] Alpha      [33] Beta]"
      Dim pattern As String = "\[(x)\]|\[(\d*)\]"

      ' Instantiate the regular expression object.
      Dim r As Regex = new Regex(pattern, RegexOptions.IgnoreCase)

      ' Match the regular expression pattern against a text string.
      Dim m As Match = r.Match(text)
      Dim matchcount as Integer = 0
      Do While m.Success
         matchCount += 1
         Console.WriteLine("Match" & (matchCount))
         Dim i As Integer
         For i = 1 to 2
            Dim g as Group = m.Groups(i)
            Console.WriteLine("Group" & i & "='" & g.ToString() & "'")
            Dim cc As CaptureCollection = g.Captures
            Dim j As Integer 
            For j = 0 to cc.Count - 1
              Dim c As Capture = cc(j)
               Console.WriteLine("Capture" & j & "='" & c.ToString() _
                  & "', Position=" & c.Index)
            Next 
         Next 
         m = m.NextMatch()
      Loop
   End Sub
End Module

Answer 2 (score: 1)

With Excel and VBA, you can strip the brackets after the regex extraction:

Sub qwerty()

    Dim inpt As String, outpt As String
    Dim MColl As MatchCollection, temp2 As String
    Dim regex As RegExp, L As Long

    inpt = "38c6v5hrk[x]537fhvvb"

    Set regex = New RegExp
    regex.Pattern = "(\[x\])|(\[\d*\])"
    Set MColl = regex.Execute(inpt)
    temp2 = MColl(0).Value

    L = Len(temp2) - 2
    outpt = Mid(temp2, 2, L)

    MsgBox inpt & vbCrLf & outpt
End Sub
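
The routine above handles only the first match, MColl(0), and raises an error if nothing matches. As a rough sketch of the same strip-afterwards idea applied to every match, one could set Global and loop over the collection. The name StripAllBrackets and the sample input are assumptions made for illustration.

```vba
' Hypothetical variant of the routine above: strip the brackets from every match.
Sub StripAllBrackets()
    Dim re As Object, m As Object
    Dim inpt As String, outpt As String

    inpt = "[x] Alpha" & vbCr & "[33] Beta"

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                  ' without this, Execute returns only the first match
    re.Pattern = "\[x\]|\[\d*\]"

    For Each m In re.Execute(inpt)
        ' Drop the first and last character of the match, i.e. the [ and the ]
        outpt = outpt & Mid$(m.Value, 2, Len(m.Value) - 2) & vbCrLf
    Next m

    Debug.Print outpt
End Sub
```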


Answer 3 (score: 1)

The array without regex:

For Each Value In Split(SourceString, Chr(13))
  ReDim Preserve arrMatches(lngCount)
  arrMatches(lngCount) = Split(Split(Value, "]")(0), "[")(1)
  lngCount = lngCount + 1
Next
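
The loop above is a fragment meant to replace the regex loop inside the function, and its nested Split raises a "Subscript out of range" error for any line that has no "[" (or is empty). Below is a hedged stand-alone sketch of the same idea that skips such lines by locating the brackets with InStr. The name BracketedNoRegex is hypothetical. Like the original, it takes only the first bracketed item on each line and accepts any content between the brackets, not just x or digits.

```vba
' Hypothetical stand-alone version of the Split approach, with a guard for
' lines that contain no bracket pair.
Public Function BracketedNoRegex(ByVal SourceString As String) As Variant
    Dim arrMatches() As String
    Dim lngCount As Long
    Dim vLine As Variant
    Dim lngOpen As Long, lngClose As Long

    For Each vLine In Split(SourceString, Chr(13))
        lngOpen = InStr(1, vLine, "[")
        If lngOpen > 0 Then
            lngClose = InStr(lngOpen + 1, vLine, "]")
            ' Keep the line only when a closing bracket follows the opening one
            If lngClose > 0 Then
                ReDim Preserve arrMatches(lngCount)
                arrMatches(lngCount) = Mid$(vLine, lngOpen + 1, lngClose - lngOpen - 1)
                lngCount = lngCount + 1
            End If
        End If
    Next vLine

    If lngCount = 0 Then
        BracketedNoRegex = Array()    ' nothing found: return an empty array
    Else
        BracketedNoRegex = arrMatches
    End If
End Function
```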