VBA RegEx第二次比赛从第一场比赛的第一索引开始

时间:2018-10-24 11:29:00

标签: regex excel vba

Good Day –我有一个例程,可以搜索成千上万个文本文件以捕获相关的文件名,从而剩下不到10,000个文件名。这些文件名为下一个例程打开所有剩余的文本文件以搜索特定数据提供了起点。每个文本文件可能包含50到50,000行之间的数据。例程循环遍历每个文件,首先找到一个序列号,然后找到所有出现的FAILED,然后它捕获该行文本开头附近的日期/时间戳,并填充另一个工作表。这对于发生各种故障都很有用。除了一个。

(请参见下面的代码示例中的关注区域的起点)我现在来到了我希望得到一些指导的地方。我需要验证一个FAILED实例,以确认PASSED是否在包含FAILED的行之后三行发生。如果“通过”在那里,它将总是是“失败”下方的第三行。我无法使用“失败”之后一定秒数内发生的日期/时间戳,例如“通过”,因为它变化太大,并且可能产生错误的结果。我认为捕获FAILED之后的第一个通过的最佳方法是捕获FAILED的FirstIndex位置,然后从那里开始搜索PASSED。但是,我不知道该怎么做或什至有可能。老实说,我不知道是否可以使用RegEx或VBA中我没有想到的方法来完成。我只是在学习RegEx,所以即使在阅读了所有内容之后,我在这一领域也非常虚弱。一个答案的解释将是最大的赞赏。 VBA我非常满意。我正在使用带有Microsoft VBScript正则表达式5.5的Excel 2010 Professional。

我认为这个答案可能会有所帮助,但是如果我不理解的话。 How to get the position of submatches in VBA?我们将不胜感激任何帮助或指导。预先谢谢你。

亲切的问候, 标记

样本经过清理的搜索数据

LOG:00 :: 01:11:03.129 [XXX _ ##] XXX:3390、3412、3401、3400、3401、3403、3402、3409 0090123101000172

LOG:00 :: 01:11:15.576 [XXX _ ###] XXX:3393、3399、3393、3395、3394、3396、3397、3395 0090123101000200

LOG:00 :: 01:11:23.568 [XXX _ ##] XXX:3390、3411、3401、3400、3401、3403、3402、3409 0090123101000173

LOG:00 :: 01:11:37.049 [XXX _ ###] XXX:3393、3400、3393、3394、3394、3396、3396、3395 0090123101000201

LOG:00 :: 01:11:53.265 [XXX _ ##] XXX:3388、3409、3399、3397、3399、3396、3400、3406 0090123101000129

LOG:00 :: 01:11:56.361 [XXX _ ###] XXX:3393、3399、3392、3394、3394、3396、3396、3395 0090123101000202

日志:00 :: 01:12:14.596 [XXX _ ##] XXXX Xxxxxxxxxxxxx Xxxxxxxxxxx 失败,Xxxxxxxxxxx:A:1、0090123101000130

LOG:00 :: 01:12:16.432 [XXX _ ##] XXXX ADC 3401、3402、3401、3399、3399、3401、3399、3401,

LOG:00 :: 01:12:16.502 [XXX _ ##] XXXX DAC 1477、1301、1405、1229、1406、1473、1770、1543,

日志:00 :: 01:12:16.581 [XXX _ ##] XXXX Xxxxxxxxxxxxx Xxxxxxxxxxx 已通过,Xxxxxxxxxxx:1

LOG:00 :: 01:12:16.846 [XXX _ ##] XXX:3407、3408、3406、3405、3405、3406、3404、3405 0090123101000130

LOG:00 :: 01:12:17.406 [XXX _ ###] XXX:3398、3403、3397、3400、3399、3401、3402、3399 0090123101000203

LOG:00 :: 01:12:37.508 [XXX _ ##] XXX:3402、3402、3400、3398、3400、3401、3400、3401 0090123101000131

LOG:00 :: 01:12:38.511 [XXX _ ###] XXX:3386、3393、3386、3386、3387、3387、3389、3389、3387 0090123101000204

LOG:00 :: 01:13:02.619 [XXX _ ##] XXX:3403、3402、3400、3397、3400、3401、3399、3401 0090123101000132

    Dim bFound              As Boolean          'Used to identify if sFile <> "".
    Dim dHr                 As Double    'Test  'Number of hours in dEndTime
    Dim dMin                As Double    'Test  'Number of minutes in dEndTime.
    Dim dSec                As Double    'Test  'Number of seconds in dEndTime.
    Dim dStartTime          As Double    'Test  'Time routine starts.
    Dim dEndTime            As Double    'Test  'Time routine completes.
    Dim i                   As Integer          'Array variable for rows.
    Dim iCurrentRow         As Integer          'Variable used in centering filename cells.
    Dim iNextRow            As Integer          'Used to find last row in column to add new data.
    Dim j                   As Integer          'Array variable for columns.
    Dim LastRow             As Integer          'Last row used by any column in current range.
    Dim NextRow             As Integer          'Last row of current column.
    Dim z                   As Integer          'Counter for files > 200 bytes.
    Dim lFileLen            As Long             'Length of text file.
    Dim oM                  As Object           'Single match.
    Dim oMtch               As Object           'Match collection.
    Dim oS                  As Object           'Number of matches found.
    Dim LastCol             As String           'Identify last column used.
    Dim LastColLetter       As String           'Last Column letter.
    Dim s1LastCol           As String           'Identify last column in Row 1 used.
    Dim s1LastColLetter     As String           'Last column in Row 1 letter.
    Dim sCurrCol            As String           'Numerical value of current column.
    Dim sCurrColLetter      As String           'Alphabetical value of current column.
    Dim sFile               As String           'File name to search in.
    Dim sFn                 As String           'Combined path and file to search in.
    Dim sPath               As String           'Path of file to search in.
    Dim sTxt                As String           'Variable to hold scripting.filesystemobject.
    Dim vArr                As Variant          'Array containing all finlenames.

'   Turn the following activity off to increase program speed.
    With Application
        .StatusBar = True
        .EnableEvents = False
        .DisplayAlerts = False
        .ScreenUpdating = False
    End With

    dStartTime = Now()                                   'For testing purposes ONLY.

    Sheets("Failures").Activate
    LastCol = ActiveSheet.UsedRange.SpecialCells(xlLastCell).Column
    If LastCol > 26 Then
        LastColLetter = Chr(Int((LastCol - 1) / 26) + 64) & Chr(((LastCol - 1) Mod 26) + 65)
    Else
        LastColLetter = Chr(LastCol + 64)
    End If

'   Get last row used by any column in current range.
    LastRow = ActiveSheet.UsedRange.Rows.Count

'   Set range values.
    vArr = Range("B1:" & LastColLetter & LastRow).Value
    Columns("B:" & LastColLetter).Delete Shift:=xlToLeft    'Delete previous data.

    sPath = "U:\Serial_Server_Data\"                        'Get path name.

    Sheets("Log Files").Activate

'   Will provide the last used column letter.
    LastCol = ActiveSheet.UsedRange.SpecialCells(xlLastCell).Column
    If LastCol > 26 Then
        LastColLetter = Chr(Int((LastCol - 1) / 26) + 64) & Chr(((LastCol - 1) Mod 26) + 65)
    Else
        LastColLetter = Chr(LastCol + 64)
    End If

'   Get last row used by any column in current range.
    LastRow = ActiveSheet.UsedRange.Rows.Count

'   Set range values.
    vArr = Range("C2:" & LastColLetter & LastRow).Value

'   Initialize variables.
    z = 1
    bFound = False

'   Step through files to apply Pattern to.
    For i = LBound(vArr, 1) To UBound(vArr, 1)          'Step through rows to apply Pattern to.
        For j = LBound(vArr, 2) To UBound(vArr, 2)      'Step through columns to apply Pattern to.

            If vArr(i, j) = "" Then GoTo SkipAll        'Skip cell if empty.

            sFile = vArr(i, j)                          'Get file name.
            lFileLen = GetDirOrFileSize(sPath, sFile)   'Get the file size for later use.

            If lFileLen > 200 Then          'Only search files that are over 200 bytes in length.
                Application.StatusBar = "Processing file " & z & " - " & sFile

'               Create full path with filename.
                sFn = sPath & sFile

'               Determine the next file number available for use by the FileOpen function
                sTxt = FreeFile

                sTxt = CreateObject("scripting.filesystemobject").OpenTextFile(sFn).ReadAll

'                i = 0
                With CreateObject("vbscript.regexp")    'Search for serial number.
                    .Global = False                     'Search for first instance.
                    .IgnoreCase = True                  'Select either upper or lowercase.
                    .Pattern = "Serial\sNo.\s\d{4}"
                    Set oMtch = .Execute(sTxt)
                    For Each oM In oMtch
                        For Each oS In .Execute(oM.Value)
'                            Debug.Print oS.Value

                            If oS <> vbNull Then    'Continue on only if serial number found.
                                Sheets("Failures").Activate
                                Range("A1").Activate
                                Do While ActiveCell.Value <> ""
                                    ActiveCell.Offset(0, 1).Activate

'                                   sFile already exists.
                                    If ActiveCell.Value = Right(oS.Value, 4) Then
                                        sCurrCol = ActiveCell.Column
                                        Do While ActiveCell.Value <> ""
                                            ActiveCell.Offset(1, 0).Activate
                                        Loop
                                        ActiveCell.Value = sFile

'                                       Get column letter from column number.
                                        If sCurrCol > 26 Then
                                            sCurrColLetter = Chr(Int((sCurrCol - 1) / 26) + 64) _
                                                & Chr(((sCurrCol - 1) Mod 26) + 65)
                                        Else
                                            sCurrColLetter = Chr(sCurrCol + 64)
                                        End If

'                                       Center cell.
                                        iCurrentRow = Application.WorksheetFunction.CountA(Range _
                                            (sCurrColLetter & ":" & sCurrColLetter))
                                        Range(sCurrColLetter & iCurrentRow).HorizontalAlignment _
                                            = xlCenter

'                                       Adjust the column to fit file name.
                                        Columns(sCurrColLetter & ":" & _
                                            sCurrColLetter).ColumnWidth = 35
                                        bFound = True
                                        z = z + 1
                                        Exit Do
                                    End If
                                Loop

'                               sFile doesn't exist.
                                If ActiveCell.Value = "" And bFound = False Then
                                    ActiveCell.Value = Right(oS.Value, 4)
                                    ActiveCell.Offset(1, 0).Value = sFile
                                    sCurrCol = ActiveCell.Column

'                                   Get column letter from column number.
                                    If sCurrCol > 26 Then
                                        sCurrColLetter = Chr(Int((sCurrCol - 1) / 26) + 64) _
                                            & Chr(((sCurrCol - 1) Mod 26) + 65)
                                    Else
                                        sCurrColLetter = Chr(sCurrCol + 64)
                                    End If

'                                   Center cell.
                                    iCurrentRow = Application.WorksheetFunction.CountA(Range _
                                        (sCurrColLetter & ":" & sCurrColLetter))
                                    Range(sCurrColLetter & iCurrentRow).HorizontalAlignment _
                                        = xlCenter

'                                   Adjust the column to fit file name.
                                    Columns(sCurrColLetter & ":" & sCurrColLetter).ColumnWidth _
                                        = 35
                                    z = z + 1
                                End If
                            End If
                        Next
                    Next

'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'   >>>>>>>>>>  Beginning of area of concern.
'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

                    .Global = True                      'Search for instance.
                    .IgnoreCase = True                  'Select either upper or lowercase.

'                   Search for 'failed' with any amount of text on either side.
                    .Pattern = ".*failed.*"
                    Set oMtch = .Execute(sTxt)
                    For Each oM In oMtch
                        For Each oS In .Execute(oM.Value)
                            iNextRow = Application.WorksheetFunction.CountA(Range _
                                (sCurrColLetter & ":" & sCurrColLetter)) + 1
                            If Left(oS.Value, 4) = "LOG:" Then

'                               Ignore FLR-x PeakDetector Dash failure.
                                If UCase(Mid(oS.Value, 32, 3)) <> "FLR" Then

'                                   Print all other "Failed" occurances.
                                    Range(sCurrColLetter & iNextRow).Activate
                                    ActiveCell.Value = Mid(oS.Value, 6, 16)
                                End If
                            End If
                            If Mid(oS.Value, 4, 4) = "LOG:" Then
                                Range(sCurrColLetter & iNextRow).Activate
                                ActiveCell.Value = Mid(oS.Value, 9, 16)
                            End If
                        Next
                    Next
                End With
            End If

'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'   >>>>>>>>>>  End of area of concern.
'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

SkipAll:
            bFound = False
            Sheets("Log Files").Activate
        Next j
    Next i

'   Cleanup: Add borders, heading background fill, remove gridlines.

1 个答案:

答案 0 :(得分:2)

这是对您问题的间接答案:冒着引发大战的危险,我不喜欢regex。我以前使用过它,主要是在Perl中的bash脚本中使用的,但是(几乎)在开发文本解析代码时总是能够解决它。以下是如何解决您的特定问题的示例。显然,采用我的方法会导致代码重构。请考虑将其作为替代方案。

我的方法将日志文件提取到VBA Collection中,其中集合中的每个Item都是单独的一行。我使用集合而不是String()数组,因为该集合很容易扩展为未知数量的行,而必须对数组进行ReDim编辑,并且要先知道确切的行数(可能导致两次循环,两次读取同一文件。

Private Function GetFileByLines(ByVal filePath As String) As Collection
    Dim fso As FileSystemObject
    Set fso = New FileSystemObject

    Dim txtStream As Object
    Set txtStream = fso.OpenTextFile(filePath, ForReading, False)

    Dim lines As Collection
    Set lines = New Collection

    Do While Not txtStream.AtEndOfStream
        Dim line As String
        lines.Add txtStream.ReadLine
    Loop
    txtStream.Close
    Set GetFileByLines = lines
End Function

一旦您从文本文件中获得了行的集合,就可以轻松地通过索引进行循环,以检查“通过”行之后是否存在三行“通过”。

Private Sub ScanInputFile(ByVal filename As String)
    Dim fileLines As Collection
    Set fileLines = GetFileByLines(filename)

    Dim i As Long
    For i = 1 To fileLines.Count
        If LCase(fileLines(i)) Like "*failed*" Then
            '--- check to make sure we're not near the end of the file
            If i + 3 < fileLines.Count Then
                If LCase(fileLines(i + 3)) Like "*passed*" Then
                    Debug.Print "found a PASSED line three lines after FAILED, " & _
                                "lines " & i & " and " & i + 3
                End If
            End If
        End If
    Next i
End Sub

很显然,这不能直接集成到您的整个解决方案中,因为我没有解决您已经编码和调试的部分。不过,作为快速的代码回顾,请阅读avoiding the use of Activate and Select

这是我使用的整个测试模块。 “ testlog.txt”文件是您上面数据中的直接副本。

Option Explicit

Public Sub test()
    ScanInputFile "C:\Temp\testlog.txt"
End Sub

Private Sub ScanInputFile(ByVal filename As String)
    Dim fileLines As Collection
    Set fileLines = GetFileByLines(filename)

    Dim i As Long
    For i = 1 To fileLines.Count
        If LCase(fileLines(i)) Like "*failed*" Then
            '--- check to make sure we're not near the end of the file
            If i + 3 < fileLines.Count Then
                If LCase(fileLines(i + 3)) Like "*passed*" Then
                    Debug.Print "found a PASSED line three lines after FAILED, " & _
                                "lines " & i & " and " & i + 3
                End If
            End If
        End If
    Next i
End Sub

Private Function GetFileByLines(ByVal filePath As String) As Collection
    Dim fso As FileSystemObject
    Set fso = New FileSystemObject

    Dim txtStream As Object
    Set txtStream = fso.OpenTextFile(filePath, ForReading, False)

    Dim lines As Collection
    Set lines = New Collection

    Do While Not txtStream.AtEndOfStream
        Dim line As String
        lines.Add txtStream.ReadLine
    Loop
    txtStream.Close
    Set GetFileByLines = lines
End Function