How do I exit the loop when scrollBy reaches the bottom of a web page?

Asked: 2018-05-01 12:38:55

Tags: vba excel-vba internet-explorer web-scraping lazy-loading

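I've written a script in VBA that uses IE to reach the bottom of a web page automatically. The page lazy-loads its content: more products become visible as I scroll down. I use .scrollBy in my script to handle the lazy loading.

I can't work out how to stop scrolling once there are no new products left to load. I call .scrollBy inside a Do loop. How can I exit the loop when the scrolling is done and the browser has reached the bottom of the page? Thanks in advance for any solution.

This is what I've tried so far: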

Sub HandleLazyload()
    Const URL As String = "https://www.inc.com/profile/sumup-payments-limited"
    Dim IE As New InternetExplorer, HTML As HTMLDocument, post As Object

    With IE
        .Visible = True
        .navigate URL
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document
    End With

    Do
        HTML.parentWindow.scrollBy 0, 99999
        Application.Wait Now + TimeValue("00:00:03")
        Set post = HTML.getElementsByTagName("article")
    Loop         ''I wish to break out of this loop when all the scrolling is done
    IE.Quit
End Sub

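1 Answer:

Answer 0 (score: 1)

Try the following. I use the rank to decide when to exit the loop: each pass scrolls, waits, and reads the next rank, and the loop ends as soon as no further rank can be read.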

Option Explicit

Public Sub HandleLazyload()
    Const URL As String = "https://www.inc.com/profile/sumup-payments-limited"
    Dim IE As New InternetExplorer, HTML As HTMLDocument
    With IE
        .Visible = True
        .navigate URL
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document
    End With

    Dim rank As Long, item As Long
    item = 1

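    ' Reading a rank that does not exist raises an error,
    ' which jumps to errhand and ends the loop.
    On Error GoTo errhand
    Do
        HTML.parentWindow.scrollBy 0, 99999
        Application.Wait Now + TimeSerial(0, 0, 1)   ' give the page time to load more items
        rank = Split(HTML.querySelectorAll(".rank dt ~ dd")(item).innerText, "#")(1)
        item = item + 1
    Loop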

errhand:
    Err.Clear
    Debug.Print "Stopped at rank " & rank

    'Your other code
    'IE.Quit
End Sub
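
As an alternative, here is a minimal sketch of a loop that stops once scrolling no longer adds elements. It assumes the products are rendered as article elements, as the getElementsByTagName("article") call in the question suggests. The name ScrollUntilCountStable, the MAX_IDLE_PASSES constant and the 3-second wait are hypothetical choices for illustration, not something taken from the page or the original answer.

```vba
Option Explicit

' Sketch only. Requires references to "Microsoft Internet Controls"
' and "Microsoft HTML Object Library" (early binding).
Public Sub ScrollUntilCountStable()
    Const URL As String = "https://www.inc.com/profile/sumup-payments-limited"
    ' Assumption: three passes in a row with no new elements means the page is done.
    Const MAX_IDLE_PASSES As Long = 3

    Dim IE As New InternetExplorer, HTML As HTMLDocument
    Dim previousCount As Long, currentCount As Long, idlePasses As Long

    With IE
        .Visible = True
        .navigate URL
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document
    End With

    ' Assumption: each product is an <article> element, as in the question.
    previousCount = HTML.getElementsByTagName("article").Length

    Do While idlePasses < MAX_IDLE_PASSES
        HTML.parentWindow.scrollBy 0, 99999
        Application.Wait Now + TimeSerial(0, 0, 3)   ' give the page time to load more items

        currentCount = HTML.getElementsByTagName("article").Length
        If currentCount > previousCount Then
            ' New content arrived: remember the new total and reset the idle counter.
            previousCount = currentCount
            idlePasses = 0
        Else
            ' Nothing new this pass: count it towards the exit condition.
            idlePasses = idlePasses + 1
        End If
    Loop

    Debug.Print "Stopped with " & previousCount & " article elements loaded"
    IE.Quit
End Sub
```

Counting elements avoids relying on an error to end the loop, at the cost of a few extra idle passes at the end. If the page loads slowly, a longer wait or a larger MAX_IDLE_PASSES may be needed.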

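Notes:

CSS selectors:

If you want to learn more about CSS selectors

The selector below targets all elements with the class name rank, then, inside them, the dd elements that follow a dt element as siblings.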

HTML.querySelectorAll(".rank dt ~ dd")(item)
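
To see what the selector matches, the whole result can be walked by index. This is a minimal sketch: the procedure name PrintAllRanks is hypothetical, and it assumes it is passed the HTMLDocument of a page that has already loaded. The collection is 0-based, so the first match is at index 0.

```vba
Option Explicit

' Sketch only. Requires a reference to "Microsoft HTML Object Library".
' PrintAllRanks is a hypothetical helper name, not part of the original answer.
Public Sub PrintAllRanks(ByVal HTML As HTMLDocument)
    Dim nodes As Object, i As Long

    ' All dd elements that follow a dt sibling inside an element with class "rank".
    Set nodes = HTML.querySelectorAll(".rank dt ~ dd")

    ' Loop by index; the collection is 0-based.
    For i = 0 To nodes.Length - 1
        Debug.Print i, nodes.item(i).innerText
    Next i
End Sub
```

It could be called as PrintAllRanks HTML once the document has loaded, with the output appearing in the Immediate window.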

Targeted HTML:

[Image: HTML element]