How do I click a specific page on a website that uses pagination?

Asked: 2019-06-04 03:37:04

Tags: excel vba web web-scraping pagination

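I have a procedure that opens a web page and counts how many pages it has. I then need to click a specific page, for example page 3. Does anyone know how to do this? My macro code is below. Thank you very much!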

Sub test()

    Dim element As IHTMLElement
    Dim elements As IHTMLElementCollection
    Dim ie As InternetExplorer
    Dim numberOfPages As Double
    Dim html As HTMLDocument

    Set ie = New InternetExplorer
    ie.Visible = True
    ie.navigate "https://cebra.com.ar/category/73/Juego-de-Construccion.html"
    Do While ie.readyState <> READYSTATE_COMPLETE
        Application.StatusBar = "Loading Web page …"
        DoEvents
    Loop
    Set html = ie.document
    Set elements = html.getElementsByClassName("container")
    Set ElementCol = html.getElementsByTagName("a")
    numberOfPages = ie.document.querySelectorAll(".setPage").Length
    'Here I want to click on a certain page, for example, number 3
    For Each ele In ie.document.getElementsByTagName("li")
        For Each element In elements
            If element.className = "container" Then
                'Do something
            End If
        Next element
    Next
    MsgBox "Done"

End Sub

2 Answers:

Answer 0 (score: 3)

Try the following code:

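Option Explicit

' Early binding: needs VBE > Tools > References:
' Microsoft Internet Controls and Microsoft HTML Object Library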

Sub test()

    Dim ie As InternetExplorer
    Dim html As HTMLDocument
    Dim element As IHTMLElement

    Set ie = New InternetExplorer
    ie.Visible = True
    ie.Navigate "https://cebra.com.ar/category/73/Juego-de-Construccion.html"
    WaitIE ie
    Set html = ie.Document
    ' Take the first pagination link and repoint it at the wanted page
    Set element = html.querySelector("a.setPage")
    element.setAttribute "data-value", "3" ' set the page number you want to open
    element.click
    WaitIE ie

    MsgBox "Done"

End Sub

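Sub WaitIE(ie As InternetExplorer)

    ' Block until IE is idle and reports the page as fully loaded
    Application.StatusBar = "Loading Web page …"
    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Application.StatusBar = False ' hand the status bar back to Excel

End Sub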

Answer 1 (score: 0)

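Assuming you want to loop over all pages (which also shows how to select a given page), you need to check which page is currently active and then loop over the others. A wait condition and error handling are needed to avoid the stale-element, permission-denied, and element-not-found errors that would otherwise be raised.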


Explanation:

currentPage = CLng(.querySelector(".active").innerText)

Finds the currently active page. The currently active page has a class attribute value of active; the leading "." is a CSS class selector.

If page <> currentPage Then

Skips the currently active page when looping over all pages.

cssSelector = ".setPage[data-value='" & page & "']"

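Defines a selector that picks any given page by concatenating the page number into the data-value attribute. It is combined with a class selector, the "." in front of setPage, to restrict matching to the elements relevant for page selection: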

[data-value='pageNumber'], where pageNumber is an example page such as 2, is a CSS attribute = value selector.

If you look at the following HTML:

[image: screenshot of the pagination HTML]

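You can see that when the currently active page is 1, its element has a class attribute value of active and no data-value attribute. The second page, the one you want to select, currently has a class value of setPage and a data-value of 2.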
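The selector construction described above can be sketched as a small helper. PageSelector is a hypothetical name introduced here for illustration; it is not part of the answer's code, and it assumes the site keeps using the setPage class and data-value attribute as described.

```vba
Option Explicit

' Hypothetical helper (not from the answer): builds the CSS selector
' for the pagination link of a given page number.
Public Function PageSelector(ByVal page As Long) As String
    ' Class selector (.setPage) combined with an attribute = value selector
    PageSelector = ".setPage[data-value='" & page & "']"
End Function

Public Sub DemoPageSelector()
    ' Prints: .setPage[data-value='3']
    Debug.Print PageSelector(3)
End Sub
```

Note that this selector does not match the currently active page, because that element has the class active and no data-value attribute.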

.querySelector(cssSelector).Click

Clicks the next page in the loop.

Do
    On Error Resume Next
    num = CLng(.querySelector(".active").innerText)
    On Error GoTo 0
Loop Until num = page

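Loops until the element for the selected page number has the class attribute value active and its innerText equals the desired page. For example, when page 2 is selected, the loop runs until num = 2, at which point the HTML changes to: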

[image: screenshot of the pagination HTML after the page change]

You could rewrite this loop in a number of ways with different conditions, but this approach works well.
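As one possible rewrite, the wait loop could be bounded by a timeout so it cannot spin forever if the page never changes. This is only a sketch: WaitForActivePage, its parameters, and the 10-second default are assumptions introduced here, not part of the answer.

```vba
Option Explicit

' Hypothetical helper (not from the answer): waits until the element with
' class "active" shows targetPage, or gives up after timeoutSeconds.
' doc is the page's document object, e.g. ie.document.
Public Function WaitForActivePage(ByVal doc As Object, ByVal targetPage As Long, _
                                  Optional ByVal timeoutSeconds As Double = 10) As Boolean
    Dim started As Double, elapsed As Double, num As Long
    started = Timer
    Do
        DoEvents                 ' keep Excel responsive while waiting
        num = 0
        On Error Resume Next     ' the element may be missing or stale mid-refresh
        num = CLng(doc.querySelector(".active").innerText)
        On Error GoTo 0
        If num = targetPage Then
            WaitForActivePage = True
            Exit Function
        End If
        elapsed = Timer - started
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer resets at midnight
    Loop While elapsed < timeoutSeconds
    ' Falls through with the default return value False on timeout
End Function
```

A caller could then write `If Not WaitForActivePage(ie.document, page) Then Exit For` after the click, so that a page that never becomes active ends the loop instead of hanging it.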


VBA:

Option Explicit

'VBE > Tools > References: Microsoft Internet Controls
Public Sub GetData()
    Dim ie As Object, numberOfPages As Long, currentPage As Long, page As Long
    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = True
        .Navigate2 "https://cebra.com.ar/category/73/Juego-de-Construccion.html"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document
            numberOfPages = .querySelectorAll(".setPage").Length
            currentPage = CLng(.querySelector(".active").innerText)
            Dim cssSelector As String, num As Long
            For page = 1 To numberOfPages
                If page <> currentPage Then
                    cssSelector = ".setPage[data-value='" & page & "']"
                    .querySelector(cssSelector).Click
                    Do
                        On Error Resume Next
                        num = CLng(.querySelector(".active").innerText)
                        On Error GoTo 0
                    Loop Until num = page
                    While ie.Busy Or ie.readyState < 4: DoEvents: Wend
                End If
            Next
        End With
        .Quit
    End With
End Sub