Visual Basic regular expression infinite loop

Asked: 2017-02-28 08:42:55

Tags: regex vb.net

Language: Visual Basic. I have a project that targets .NET Framework 4.

I have this regular expression:

Private Shared RegPattern As New Regex("\<base+.+?href\s*\=\s*(""(?<HREF>[^""]*)""|'(?<HREF>[^']*)')(\s*\w*\s*\=\s*(""[^""]*""|'[^']*')|[^>])*(\/>|>\<\/base\>)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)

I have this function that gets the link from an HTML page:

Private Sub GetAdress(ByVal HtmlPage As String)
    Base = ""
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)

    ' Keep the HREF captured by the last match found in the page.
    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
    Next
End Sub

The function works fine, but in some cases it goes into an infinite loop. The debugger says "Evaluation timed out" on this line:

Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)

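The exe does not continue, exit, or raise an exception that I can catch. How can I handle this? How can I exit the GetAdress method? I know Regex has a match timeout, but I cannot use it in .NET 4.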

1 Answer:

Answer 0: (score: 0)

If you want to keep the code but catch the exception so that it does nothing, try Try ... Catch.

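Try
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)
    Base = ""

    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
    Next
Catch ex As TimeoutException
    ' Ignore the timeout. Note: Regex only throws a timeout exception when a
    ' match timeout has been configured, which requires .NET Framework 4.5 or later.
End Try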

Since it looks like you just want to parse links, you could try something like:

Dim htmlBrowser As WebBrowser ' Assign a WebBrowser that already has HtmlPage loaded
Dim linkCollection As HtmlElementCollection = htmlBrowser.Document.GetElementsByTagName("a") ' Or another tag name

For Each elems As HtmlElement In linkCollection
    Base = ""
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)

    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
        ' Code to run
    Next
Next