如何有效地处理错误以防止误导结果?

时间:2017-12-01 21:01:17

标签: vba excel-vba error-handling web-scraping excel

我在vba中编写了一些代码,以便在某些网站中找到某些名称的某些身份。如果所有内容都正确,那么代码运行良好,我的意思是如果链接有效,名称与a标签匹配,最后正则表达式可以找到标识。如果三个或所有三个中的任何一个都是错误的搜索,则脚本会抛出错误。我已经在下面的脚本中指定了发生错误的位置。

我希望专家为我提供任何解决方案,以便我能够处理错误并让我的脚本继续运行,直到所有链接都用完为止。

由于我对VBA知之甚少,所以我尝试使用On error resume next来跳过错误。然而,当我看一下结果时,结果却是一个明显的混乱。我正在粘贴一个粗略的例子,当我使用On error resume next时,我得到了什么。

Sub Identity_Finder()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim post As Object, link As Variant, refined_links As String
    Dim rxp As New RegExp, identity As Object

    For Each link In [{"http://spltech.in/","http://www.unifrostindia.com/","http://www.unitfrostindia.com/","http://www.greenplanet.in/"}]
        With http
            .Open "GET", link, False
            .send                    '''throws here the first error if the link is invalid
            html.body.innerHTML = .responseText
        End With

        For Each post In html.getElementsByTagName("a")
            If InStr(post.innerText, "certain_name") > 0 Then refined_links = post.href: Exit For
        Next post

        With http
            .Open "GET", refined_links, False
            .send                          ''throws another error here if no such link is found
        End With

        With rxp
            .Pattern = "some_regex"
            .Global = True
            Set identity = .Execute(http.responseText) 
        End With

        r = r + 1: Cells(r, 1) = link
        Cells(r, 2) = identity(0)    ''''throws another error here if no such identity is noticed

    Next link
End Sub

使用On error resume next我得到了什么:

John executive
Mac lawyer
lulu lawyer
Robin lawyer 
Cathy student

预期产出:

John executive
Mac lawyer
lulu 
Robin  
Cathy student

当我使用On error resume next时,空字段(当找不到它们时)会填充先前的值。我怎样才能解决这个误导性的结果?提前谢谢。

1 个答案:

答案 0 :(得分:0)

VBA中error trap的最有效方式是

1)实际测试输入/结果,然后通过定制函数或内置编码概念或两者兼而有之。

2)如果绝对需要,请使用VBA内置错误处理

示例1

例如。您可以使用自定义函数包装此语句,以测试URL是否有效。

With http
    .Open "GET", link, False
    .send                    '''throws here the first error if the link is invalid
    html.body.innerHTML = .responseText
End With

If ValidURL Then 

    With http
        .Open "GET", link, False
        .send     
         html.body.innerHTML = .responseText
    End With

End If

其中ValidURL是定义为:

的函数
Function ValidURL(URL as String) as Boolean

     Dim result as Boolean
    'I don't know how you would specify a valid link in your specific case
    'but that code goes here
    'a dummy example follows
     result = Left(URL,7) = "http://"
     ValidURL = result 'True or False

End Function

示例2

我在这个陈述中假设:

    With http
        .Open "GET", refined_links, False
        .send                          ''throws another error here if no such link is found
    End With

当找不到这样的链接时会产生特定的错误号(代码)。发现该号码并使用此代码绕过。

With http
    .Open "GET", refined_links, False
    On Error Resume Next
    .Send
    On Error GoTo 0
End With

If err.Number <> 9999 'replace with correct number

    'continue with regex test

End If

全力以赴

最后将所有内容放在一起,您可以这样构建,只需最少使用On Error Resume NextGoTo语句。

For Each link In [{"http://spltech.in/","http://www.unifrostindia.com/","http://www.unitfrostindia.com/","http://www.greenplanet.in/"}]

    If ValidURL(link) Then 

        With http
            .Open "GET", link, False
            .send     
             html.body.innerHTML = .responseText
        End With

        For Each post In html.getElementsByTagName("a")
            If InStr(post.innerText, "certain_name") > 0 Then refined_links = post.href: Exit For
        Next post

        With http
            .Open "GET", refined_links, False
            On Error Resume Next
            .Send
            On Error GoTo 0
        End With

        If err.Number <> 9999 'replace with correct number

            With rxp
                .Pattern = "some_regex"
                .Global = True
                Set identity = .Execute(http.responseText) 
            End With

            'i will leave it to you on how to account for no pattern match
            r = r + 1: Cells(r, 1) = link
            Cells(r, 2) = identity(0)    ''''throws another error here if no such identity is noticed

        End If

    End If

Next link