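Using a regular expression to pull multiple matches out of one line

Date: 2012-07-30 18:11:42

Tags: regex vb.net replace split matching

I seem to have wandered into fairly complicated territory (for me, anyway). Say I have a line like the following: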

1:11:39 "LOGIN ATTEMPT: "47576966" Arlond"

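What I want to do is separate out the time (1:11:39), the ID (47576966), and the name (Arlond). I came up with the regular expression below, but I am a bit lost about what to do next. I understand that my regular expression is not right for capturing everything I need, so that is one thing I need help with, along with getting my For loop to work properly. I have been looking into how to split and replace with regular expressions, but so far I have not had any luck.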

([""'])(?:(?=(\\?))\2.)*?\1


Using TestFile As New IO.StreamReader(My.Settings.cfgPath & "tempRPT.txt", System.Text.Encoding.Default, False, 4096)
    Do Until TestFile.EndOfStream
        ScriptLine = TestFile.ReadLine
        ScriptLine = LCase(ScriptLine)
        If InStr(ScriptLine, "login attempt:") Then
            Dim m As MatchCollection = Regex.Matches(ScriptLine, "([""'])(?:(?=(\\?))\2.)*?\1")
            For Each x As Match In m

            Next
            'builder.AppendLine(ScriptLine)
        End If
    Loop
End Using

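2 Answers:

Answer 0 (score: 1)

As for your regular expression, I have always found it best to be as explicit as possible (for example, by using anchors). Assuming your input data really looks the way it is shown, you could do the following: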

^(\d{1,2}:\d{2}:\d{2})\s""LOGIN\sATTEMPT:\s""(\d+)""\s([^""]+)""$

Breaking it down into its component parts:

^                       // Anchor: Start of string (or line).
(\d{1,2}:\d{2}:\d{2})   // Capture one or two digits, colon, two digits, colon, two digits.
\s""LOGIN\sATTEMPT:\s"" // Anchor: match (but don't capture) literal text.
(\d+)                   // Match/capture one or more digits. (maybe you could use \d{8} instead).
""\s                    // Anchor: literal text.
([^""]+)                // Match and capture everything that is not a quote.
""                      // Anchor: Literal quote.
$                       // Anchor: End of string (or line).
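
As a minimal sketch of how the pattern above could be applied in VB.NET, each parenthesised sub-expression becomes a numbered group on the resulting Match. The module name, the variable names, and the hard-coded sample line are illustrative assumptions and do not come from the original code.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LoginLineDemo
    Sub Main()
        ' Sample line from the question; inside a VB string literal each " is doubled.
        Dim line As String = "1:11:39 ""LOGIN ATTEMPT: ""47576966"" Arlond"""

        ' The pattern from this answer, written as a VB string literal.
        Dim pattern As String = "^(\d{1,2}:\d{2}:\d{2})\s""LOGIN\sATTEMPT:\s""(\d+)""\s([^""]+)""$"

        Dim m As Match = Regex.Match(line, pattern)
        If m.Success Then
            ' Groups(0) is the whole match; Groups(1) to Groups(3) are the captures.
            Console.WriteLine("Time: " & m.Groups(1).Value)
            Console.WriteLine("ID:   " & m.Groups(2).Value)
            Console.WriteLine("Name: " & m.Groups(3).Value)
        Else
            Console.WriteLine("Line did not match.")
        End If
    End Sub
End Module
```

Note that the code in the question lowercases each line with LCase before matching. The literal LOGIN\sATTEMPT in this pattern would then no longer match unless RegexOptions.IgnoreCase is passed as a third argument to Regex.Match or the lowercasing step is dropped.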

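This will break if the name field is allowed to contain a " (double quote) character. If that turns out to be the case, the last sub-expression would have to be modified to be more permissive.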
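
One possible way to relax the last sub-expression, sketched here under the assumption that the closing quote is always the final character of the line, is to replace ([^""]+) with a greedy (.+) and let the trailing ""$ claim the last quote. The sample line with a quote inside the name is hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RelaxedNameDemo
    Sub Main()
        ' Hypothetical line whose name field contains a double quote: Ar"lond
        Dim line As String = "1:11:39 ""LOGIN ATTEMPT: ""47576966"" Ar""lond"""

        ' Same pattern as above, except the last group is (.+) instead of ([^""]+).
        Dim pattern As String = "^(\d{1,2}:\d{2}:\d{2})\s""LOGIN\sATTEMPT:\s""(\d+)""\s(.+)""$"

        Dim m As Match = Regex.Match(line, pattern)
        If m.Success Then
            ' The greedy group backtracks by one character so that ""$ can match
            ' the final quote; everything before it is kept as the name.
            Console.WriteLine(m.Groups(3).Value)
        End If
    End Sub
End Module
```

The trade-off is that (.+) accepts any characters at all, so this variant depends entirely on the $ anchor to find where the name ends.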

Answer 1 (score: 1)

In response to DavidO's accepted answer, I just want to show how I broke it down to understand it better.

If InStr(ScriptLine, "login attempt:") > 0 Then
    ' Time stamp, e.g. 1:11:39
    Dim m As Match = Regex.Match(ScriptLine, "(\d{1,2}:\d{2}:\d{2})")
    hurrburr = m.Value
    'Regex.Replace(ScriptLine, "(\d{1,2}:\d{2}:\d{2})", "")
    ' ID: a run of 7 or 8 digits
    Dim mm As Match = Regex.Match(ScriptLine, "(\d{7,8})")
    'ScriptLine = ScriptLine & " " & mm.Value
    hurrburr = hurrburr & " " & mm.Value
    ' Name: the text after a quote and one whitespace character, up to the next quote
    Dim mmm As Match = Regex.Match(ScriptLine, """\s([^""]+)")
    ' Take the captured group so the leading quote and space are not included
    temp = mmm.Groups(1).Value
    hurrburr = hurrburr & " " & temp
    builder.AppendLine(hurrburr)
End If
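
For comparison, the three separate Regex.Match calls above could also be collapsed into one pass with named groups. The following is only a sketch under stated assumptions: the group names (time, id, name), the module name, and the in-memory array of lines are all hypothetical, and in the original code the lines come from a StreamReader instead.

```vbnet
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module LoginLogDemo
    ' One pattern with named groups; IgnoreCase lets it match lowercased lines too.
    Private ReadOnly LoginPattern As New Regex(
        "(?<time>\d{1,2}:\d{2}:\d{2})\s""LOGIN\sATTEMPT:\s""(?<id>\d+)""\s(?<name>[^""]+)""",
        RegexOptions.IgnoreCase)

    Sub Main()
        ' Hypothetical input standing in for the lines read from the log file.
        Dim lines As String() = {
            "1:11:39 ""LOGIN ATTEMPT: ""47576966"" Arlond""",
            "1:12:02 some unrelated log line"
        }

        Dim builder As New StringBuilder()
        For Each line As String In lines
            Dim m As Match = LoginPattern.Match(line)
            ' Lines that are not login attempts simply fail to match and are skipped.
            If m.Success Then
                builder.AppendLine(m.Groups("time").Value & " " &
                                   m.Groups("id").Value & " " &
                                   m.Groups("name").Value)
            End If
        Next

        Console.Write(builder.ToString())
    End Sub
End Module
```

Checking m.Success replaces the separate InStr test, and skipping the LCase step keeps the original capitalisation of the name.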