VBA InternetExplorer.Application gives different results on each function call

Time: 2016-09-30 17:03:51

Tags: vba excel-vba web-scraping excel

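I am trying to automate a task in Excel that requires opening a web page, navigating to a link on that page, and then clicking a button on the second page to download an .xlsx file.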

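I wrote a script that is supposed to do this. However, the response I get from the web page is not always the same. In particular, sometimes it triggers the download from the first page, sometimes it navigates to the second page without downloading anything, and once or twice it has done both.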

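My feeling is that this has to do with how long InternetExplorer.Application takes to complete a request. I can't figure out how to troubleshoot it, though, since I am already telling the script to wait for IE.Application to finish its requests.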

Sub DoBrowse2()

    'For Each lnk In Sheets("Sheet4").Hyperlinks
        'Range(lnk).Hy.Follow
        'Next

    Dim i As Long
    Dim URL As String
    Dim BaseURL As String
    Dim ToURL As String
    Dim IE As Object
    Dim objElement As Object
    Dim objCollection As Object
    Dim HWNDSrc As Long
    Dim html As IHTMLDocument

    Set IE = CreateObject("InternetExplorer.Application")

    URL = Range("B2").Hyperlinks(1).Address

    IE.Navigate URL

    IE.Visible = True

    Application.StatusBar = URL & " is loading. Please wait..."

    Do While IE.ReadyState = 4: DoEvents: Loop
    Do Until IE.ReadyState = 4: DoEvents: Loop

    Application.StatusBar = URL & " Loaded"

    'Set html = IE.Document
    'Dim elements As IHTMLElementCollection
    'Set elements = html.all

    For Each itm In IE.Document.all
        If itm.className = "datagrid" Then
            For Each el In itm.Document.all
                Debug.Print "hello"
                If el.className = "ujump" And Right(el.innerText, 12) = "Constituents" Then
                    'Debug.Print el.innerText
                    ToURL = el.getAttribute("data-subset")
                    BaseURL = "http://datastream.thomsonreuters.com/navigator/search.aspx?dsid=ZUCH002&AppGroup=DSAddin&host=Metadata&prev=scmTELCMBR&s=D&subset="
                    ToURL = BaseURL & ToURL
                    'Debug.Print ToURL

                    IE.Navigate ToURL
                    IE.Visible = True

                    Do While IE.Busy
                        Debug.Print "in busy loop"
                        Application.Wait DateAdd("s", 1, Now)
                    Loop

                    GoTo end_of_for
                End If
            Next
        End If
    Next

end_of_for:

    Debug.Print ("STOP STOP STOP STOP STOP")

    Dim Script As String

    For Each itm In IE.Document.all
        If itm.className = "lgc excel" Then
            Debug.Print "hello world"
            Debug.Print itm.getAttribute("onclick")
            itm.Click

            Do While IE.Busy
                Debug.Print "app busy"
                Application.Wait DateAdd("s", 1, Now)
            Loop

            Exit For

        End If
    Next

End Sub

Thanks in advance for your help.

1 Answer:

Answer 0: (score: 0)

Use this to determine whether the IE page has fully loaded. Both conditions must always be checked together:

Do Until ie.ReadyState = 4 And ie.Busy = False
    DoEvents
Loop
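
The loop above has no exit if the page never finishes loading. As a minimal sketch (not part of the original answer), the same check can be wrapped in a helper with a timeout. The name WaitForPage, the timeoutSeconds parameter, and the error number are hypothetical choices made for illustration:

```vba
' Hypothetical helper: waits until IE reports a fully loaded page,
' or raises an error once timeoutSeconds have elapsed.
Sub WaitForPage(ByVal ie As Object, Optional ByVal timeoutSeconds As Double = 30)
    Dim startTime As Double
    startTime = Timer   ' seconds since midnight; resets to 0 at midnight

    ' Both conditions are required: ReadyState = 4 (READYSTATE_COMPLETE) and not Busy
    Do Until ie.ReadyState = 4 And ie.Busy = False
        DoEvents        ' keep Excel responsive while waiting
        If Timer - startTime > timeoutSeconds Then
            Err.Raise vbObjectError + 513, "WaitForPage", _
                "Page did not finish loading within " & timeoutSeconds & " seconds."
        End If
    Loop
End Sub
```

It would be called right after each navigation or click, for example `IE.Navigate URL` followed by `WaitForPage IE`. Because Timer resets at midnight, a wait that spans midnight would run longer than intended; that edge case is ignored here.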

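Even with the code above, if the page runs scripts, some content may still load after the condition ie.ReadyState = 4 And ie.Busy = False has been met. In that case you can use Application.Wait, which is simple but inefficient and unreliable, or you can look for something on the site that indicates its loading state and determine the state from, for example, that element's visibility property.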
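
One way to check the page itself rather than wait a fixed time is to poll until a known element appears. The sketch below assumes that the presence of an element with a given class name (such as the "lgc excel" button from the question) is a good enough signal that the page is ready. The function name WaitForElementByClass and its parameters are hypothetical:

```vba
' Hypothetical helper: polls the document until an element with the given
' class name exists, then returns it. Returns Nothing if the timeout expires.
Function WaitForElementByClass(ByVal ie As Object, ByVal cls As String, _
                               Optional ByVal timeoutSeconds As Double = 30) As Object
    Dim startTime As Double
    Dim el As Object

    startTime = Timer
    Do
        ' Only inspect the document once IE reports the page as loaded
        If ie.ReadyState = 4 And ie.Busy = False Then
            For Each el In ie.Document.all
                If el.className = cls Then
                    Set WaitForElementByClass = el
                    Exit Function
                End If
            Next el
        End If
        DoEvents    ' yield so the browser can keep loading
    Loop While Timer - startTime < timeoutSeconds
End Function
```

A caller would test the result before using it, for example `Set btn = WaitForElementByClass(IE, "lgc excel")` followed by `If Not btn Is Nothing Then btn.Click`.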

Part of your code is wrong and can cause an infinite loop:

Do While IE.ReadyState = 4: DoEvents: Loop
Do Until IE.ReadyState = 4: DoEvents: Loop

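The first line keeps calling DoEvents for as long as the ready state is complete (4), so if the page has already finished loading by then, it never exits. Only after that does the second line wait until the ready state reaches complete.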

Instead of iterating over the collection of all elements:

For Each itm In IE.Document.all

narrow it down to a specific collection for better performance, for example:

For Each itm In IE.Document.GetElementsByTagName("div")
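
A minimal sketch of how the narrowed loop could be used as a lookup. It assumes the "datagrid" container from the question is a div, which the question does not confirm, and the function name FindByTagAndClass is hypothetical:

```vba
' Hypothetical helper: returns the first element with the given tag name
' and class name, or Nothing if no such element exists.
Function FindByTagAndClass(ByVal doc As Object, ByVal tagName As String, _
                           ByVal cls As String) As Object
    Dim itm As Object

    ' Iterate only over elements with the requested tag, not Document.all
    For Each itm In doc.getElementsByTagName(tagName)
        If itm.className = cls Then
            Set FindByTagAndClass = itm
            Exit Function
        End If
    Next itm
End Function
```

For example, `Set grid = FindByTagAndClass(IE.Document, "div", "datagrid")` would replace the outer loop in the question if that assumption about the tag holds.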