Question

Language: Visual Basic. I have a project that uses .NET Framework 4.

I have this regular expression:

Private Shared RegPattern As New Regex("\<base+.+?href\s*\=\s*(""(?<HREF>[^""]*)""|'(?<HREF>[^']*)')(\s*\w*\s*\=\s*(""[^""]*""|'[^']*')|[^>])*(\/>|>\<\/base\>)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)

I have this function that gets the link from an HTML page:

Private Sub GetAdress(ByVal HtmlPage As String)
    Base = ""
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)

    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
    Next
End Sub

The function works fine, but in some cases it goes into an infinite loop. The debugger says "Evaluation timed out" on this line:

Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)

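The exe does not continue, does not exit, and does not throw an exception I could catch. How can I handle this? How can I get out of the GetAdress method? I know there is a timeout, but I cannot use it in .NET 4.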

Answer 1

If you want to keep the code but catch the exception and do nothing with it, try Try ... Catch.

Try
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)
    Base = ""

    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
    Next
Catch ex As TimeoutException
    ' Deliberately ignored: Base keeps whatever value it had
End Try
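One caveat: the Regex match timeout (and its RegexMatchTimeoutException) only arrived in .NET Framework 4.5, so on .NET 4 the Catch above may never fire. A possible workaround is to run the matching on a worker thread and give up after a deadline. The following is only a sketch under that assumption; `TryGetHrefs`, `RegexTimeoutSketch` and `timeoutMs` are made-up names, and `Thread.Abort` is a last resort that is only tolerable here because the worker touches no shared state.

```vb
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports System.Threading

Module RegexTimeoutSketch
    ' Hypothetical helper (not part of the original code): returns the HREF
    ' values found by pattern, or Nothing if matching takes longer than timeoutMs.
    Function TryGetHrefs(ByVal pattern As Regex, ByVal input As String,
                         ByVal timeoutMs As Integer) As List(Of String)
        Dim found As List(Of String) = Nothing

        Dim work As ThreadStart =
            Sub()
                Dim hrefs As New List(Of String)
                ' MatchCollection is evaluated lazily, so enumerate it here,
                ' inside the worker, where a runaway match can be contained.
                For Each m As Match In pattern.Matches(input)
                    hrefs.Add(m.Groups("HREF").Value)
                Next
                found = hrefs
            End Sub

        Dim worker As New Thread(work)
        worker.IsBackground = True   ' a stuck worker must not keep the process alive
        worker.Start()

        ' Join returns True if the worker finished within the deadline.
        If worker.Join(timeoutMs) Then
            Return found
        End If

        worker.Abort()   ' assumption: acceptable because the worker holds no locks or shared state
        Return Nothing
    End Function
End Module
```

Inside GetAdress this could be called as `Dim hrefs = RegexTimeoutSketch.TryGetHrefs(RegPattern, HtmlPage, 2000)`, treating a result of Nothing as "timed out" and leaving Base empty.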

Since it looks like you just want to parse links, you could try something like:

Dim htmlBrowser As WebBrowser = 'Browser with HtmlPage'
Dim linkCollection As HtmlElementCollection = htmlBrowser.Document.GetElementsByTagName("a") 'Or another tag name

For Each elems As HtmlElement In linkCollection
    Base = ""
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)

    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
        'Code to run'
    Next
Next
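If you would rather stay with a regex, the hang is most likely catastrophic backtracking: with Singleline, `.+?` can stretch across the whole page, and the two branches of `(...|[^>])*` can match the same text in many different ways. A simpler pattern that never looks past the end of the tag avoids that. This is a sketch with an assumed replacement pattern, not the original one: it accepts a plain `<base ...>` tag without requiring `/>` or `></base>`, and it returns the first match rather than the last. `BaseHrefSketch` and `GetBaseHref` are made-up names.

```vb
Imports System.Text.RegularExpressions

Module BaseHrefSketch
    ' Assumed replacement pattern (not the original): every quantifier stops
    ' at the closing ">" of the tag, so there is no nested, overlapping repetition.
    Private ReadOnly BasePattern As New Regex(
        "<base\b[^>]*?\bhref\s*=\s*(?:""(?<HREF>[^""]*)""|'(?<HREF>[^']*)')[^>]*>",
        RegexOptions.IgnoreCase)

    ' Hypothetical helper: returns the href of the first <base> tag, or "" if none is found.
    Function GetBaseHref(ByVal htmlPage As String) As String
        Dim m As Match = BasePattern.Match(htmlPage)
        If m.Success Then Return m.Groups("HREF").Value
        Return ""
    End Function
End Module
```

With this, the body of GetAdress could shrink to `Base = BaseHrefSketch.GetBaseHref(HtmlPage)`. Attribute values that themselves contain `>` would still confuse it, which is one more reason to prefer a real HTML parser where that matters.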
