Language: Visual Basic. I have a project that uses .NET Framework 4.
I have this regular expression:
Private Shared RegPattern As New Regex("\<base+.+?href\s*\=\s*(""(?<HREF>[^""]*)""|'(?<HREF>[^']*)')(\s*\w*\s*\=\s*(""[^""]*""|'[^']*')|[^>])*(\/>|>\<\/base\>)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
I have this function to get the link from an HTML page:
Private Sub GetAdress(ByVal HtmlPage As String)
    Base = ""
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)
    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
    Next
End Sub
The function works fine, but in some cases it goes into an infinite loop. The debugger says "Evaluation timed out" on this line:
Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)
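and the exe neither continues nor exits, and no exception is caught. How can I handle this? How can I exit the GetAdress method? I know there is a timeout option, but I cannot use it in .NET 4.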
Answer 0 (score: 0)
If you want to keep the code but catch the exception so that it does nothing, try Try ... Catch.
Try
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)
    Base = ""
    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
    Next
Catch ex As TimeoutException
    ' Swallow the exception and do nothing.
End Try
Since it looks like you only want to parse links, you could try something like:
Dim htmlBrowser As WebBrowser = Nothing ' Placeholder: use a WebBrowser that has loaded HtmlPage
Dim linkCollection As HtmlElementCollection = htmlBrowser.Document.GetElementsByTagName("a") ' Or another tag name
For Each elems As HtmlElement In linkCollection
    Base = ""
    Dim Matches As System.Text.RegularExpressions.MatchCollection = RegPattern.Matches(HtmlPage)
    For Each V_Found As System.Text.RegularExpressions.Match In Matches
        Base = V_Found.Groups("HREF").Value
        ' Code to run
    Next
Next
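As for the timeout the question mentions: the Regex match timeout was only added in .NET Framework 4.5, so on 4.0 the regex itself never throws a timeout exception. One common workaround is to run the match on a worker thread and give up after a deadline. The following is only a minimal sketch under that assumption. The module name `RegexTimeoutSketch`, the function `TryGetLastHref`, and the `timeoutMs` parameter are hypothetical names, not part of the original code. It relies on `Thread.Abort`, which is available on .NET Framework but is a blunt tool, so treat it as a last resort.

```vb
Imports System.Text.RegularExpressions
Imports System.Threading

Module RegexTimeoutSketch
    ' Hypothetical helper: returns the last HREF captured by pattern,
    ' or Nothing if matching does not finish within timeoutMs milliseconds.
    Function TryGetLastHref(ByVal pattern As Regex, ByVal input As String, ByVal timeoutMs As Integer) As String
        Dim result As String = ""

        ' MatchCollection is evaluated lazily, so the enumeration (where the
        ' actual matching happens) has to run on the worker thread as well.
        Dim work As ThreadStart =
            Sub()
                For Each m As Match In pattern.Matches(input)
                    result = m.Groups("HREF").Value
                Next
            End Sub

        Dim worker As New Thread(work)
        worker.IsBackground = True ' Do not keep the process alive for a stuck match.
        worker.Start()

        ' Wait up to timeoutMs; if the match is still running, abort it.
        If Not worker.Join(timeoutMs) Then
            worker.Abort()
            Return Nothing
        End If

        Return result
    End Function
End Module
```

With this in place, GetAdress could call something like `TryGetLastHref(RegPattern, HtmlPage, 2000)` and treat a return value of Nothing as "matching took too long". The 2000 ms value is an arbitrary example. Simplifying the pattern so it backtracks less would address the root cause more directly than a timeout.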