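Regex to capture multiple different string patterns

Time: 2017-05-15 19:29:15

Tags: regex excel-vba vba excel

I am trying to capture all the product codes in a text. Sometimes there are several of them, but I can only capture one. I think this is more of an Excel VBA problem than a regex problem, since I already have the pattern string.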

 Sub regexp()

Dim regEx As New regexp
Dim strPattern As String
Dim Myrange As Range
Dim LastRow As Integer


LastRow = ActiveSheet.Cells(Rows.Count, "W").End(xlUp).Row
Set Myrange = ActiveSheet.Range("W4:W" & LastRow)

For Each c In Myrange


strPattern = "(?:\s[ABCDabcd][0-9][A-Za-z0-9]{3}\s|\s[ABCDabcd][0-9oO][0-9oO)][0-9][0-9A-Za-z]\s|\s[ABCDabcd][0-9oO][0-9A-Za-z)][0-9A-Za-z][0-9]\s|[^A-Za-z0-9][ABCDabcd]\s[0-9][A-Z0-9a-z]{3}\s)"
    If strPattern <> "" Then
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        Set matches = regEx.Execute(c.Value)
        On Error Resume Next
        c.Offset(0, 7).Value = matches.Item(0)
    End If
Next c
End Sub

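How can I change the code to capture multiple different string patterns in the text? The code currently captures only the first match. I want to capture all of them, separated by commas.

For example:
Input: First I had a problem with product b0067. Then the problem was resolved. Later I had a problem with this D0887. Followed by C689W,D 7890
Output: b0067,D0887,C689W,D 7890

I would also like the results trimmed, because the matches include the whitespace at their start or end.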

1 Answer:

Answer 0: (score: 0)

First, I suggest enhancing the regex to

strPattern = "(?:\s[abcd](?:[0-9][a-z0-9]{3}|[0-9o][0-9o)][0-9][0-9a-z]|[0-9o][0-9a-z)][0-9a-z][0-9])|[^a-z0-9][abcd]\s[0-9][0-9a-z]{3})(?!\S)"

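See the regex demo. Make sure you set .IgnoreCase = True. The (?!\S) lookahead at the end of the pattern checks that the match is followed by whitespace or the end of the string. Because a lookahead does not consume that whitespace, it is still available as the leading \s of the next match, so both A0ABC and B0ABC are extracted from "1 A0ABC B0ABC 3". The first \s can be changed to (^|\s) to also match a code at the very start of the string.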
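A minimal sketch of both points (the non-consuming lookahead and the optional start-of-string anchor). The routine name DemoLookahead is hypothetical, and late binding through CreateObject("VBScript.RegExp") is an assumption made only so the snippet runs without a project reference. The first \s is written as (?:^|\s) here, a non-capturing variant of the (^|\s) mentioned above.

```vba
Sub DemoLookahead()
    ' Hypothetical demo routine; late-bound, so no library reference is required.
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' return every match, not just the first
    re.IgnoreCase = True
    ' Same pattern as above, with the first \s replaced by (?:^|\s)
    re.Pattern = "(?:(?:^|\s)[abcd](?:[0-9][a-z0-9]{3}|[0-9o][0-9o)][0-9][0-9a-z]|[0-9o][0-9a-z)][0-9a-z][0-9])|[^a-z0-9][abcd]\s[0-9][0-9a-z]{3})(?!\S)"
    ' The first code sits at the start of the string, the second follows a single space
    For Each m In re.Execute("A0ABC B0ABC 3")
        Debug.Print "[" & Trim(m.Value) & "]"
    Next m
End Sub
```

Run from the VBA editor, this should print [A0ABC] and [B0ABC] in the Immediate window: the space between the two codes is only checked by the lookahead of the first match, so it can still serve as the leading whitespace of the second.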

Then, after running Execute(), iterate over the matches:

 Set matches = regEx.Execute(c.Value)
 For Each m In matches
    Cells(x, y) = m.Value
 Next

Here is a complete fixed sub that writes the matches as comma-separated values into the cell you specified:

Sub ExtractProductCodes()
' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim regEx As New RegExp
Dim strPattern As String
Dim Myrange As Range, c As Range
Dim matches As Object, m As Object
Dim LastRow As Long, cnt As Long

LastRow = ActiveSheet.Cells(Rows.Count, "W").End(xlUp).Row
Set Myrange = ActiveSheet.Range("W4:W" & LastRow)
strPattern = "(?:\s[abcd](?:[0-9][a-z0-9]{3}|[0-9o][0-9o)][0-9][0-9a-z]|[0-9o][0-9a-z)][0-9a-z][0-9])|[^a-z0-9][abcd]\s[0-9][0-9a-z]{3})(?!\S)"

If strPattern <> "" Then
  With regEx
    .Global = True
    .IgnoreCase = True
    .Pattern = strPattern
  End With
  For Each c In Myrange
    cnt = 0                       ' reset the counter for every cell
    c.Offset(0, 7).Value = ""     ' clear the result of a previous run
    Set matches = regEx.Execute(c.Value)
    For Each m In matches
        ' Trim strips leading/trailing spaces from the match
        c.Offset(0, 7).Value = c.Offset(0, 7).Value & Trim(m.Value)
        cnt = cnt + 1
        If cnt < matches.Count Then c.Offset(0, 7).Value = c.Offset(0, 7).Value & ","
    Next m
  Next c
End If
End Sub

Note where the RegExp is initialized: outside the loop.
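As an alternative sketch, the trimmed matches can be collected into an array and joined once, which avoids counting separators by hand. The helper name JoinMatches and its parameters are hypothetical; it assumes it receives a RegExp object that is already configured with .Global = True.

```vba
Function JoinMatches(ByVal inputText As String, ByVal re As Object) As String
    ' Hypothetical helper: returns all matches of re in inputText,
    ' trimmed and separated by commas ("" when nothing matches).
    Dim matches As Object
    Dim parts() As String
    Dim i As Long

    Set matches = re.Execute(inputText)
    If matches.Count = 0 Then Exit Function

    ReDim parts(0 To matches.Count - 1)
    For i = 0 To matches.Count - 1
        parts(i) = Trim(matches.Item(i).Value)   ' drop surrounding spaces
    Next i
    JoinMatches = Join(parts, ",")
End Function
```

Inside the For Each c loop, the inner loop could then be replaced by a single assignment such as c.Offset(0, 7).Value = JoinMatches(CStr(c.Value), regEx).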