VBA Web Scraping of Names and Addresses Using getElementsByClassName

Date: 2019-05-31 21:15:20

Tags: html vba excel-vba web-scraping

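I am trying to extract the clinic name and corresponding address for every clinic on the following page: https://medimap.ca/Location/Calgary,%20AB,%20Canada

I am having trouble working out exactly which element I should drill into. All of the clinic names share the class name "_1FLG5", and all of the addresses share the class name "_1-Gov". However, when I run the code below, nothing happens: no error, and nothing is written to the sheet.

I am also unsure whether the indexes I use after .getElementsByClassName are correct. I used (0) because I want the inner text from the same line where I reference "_1FLG5", and I used (2) because I want the text from two lines below "_1-Gov".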

Option Explicit

Sub GetClinicData()

    Dim objIE As InternetExplorer
    Dim clinicEle As Object
    Dim clinicAdd As Object

    Dim clinicName As String
    Dim address As String
    Dim y As Integer
    Dim x As Integer

    Set objIE = New InternetExplorer
    objIE.Visible = False

    objIE.navigate "https://medimap.ca/Location/Calgary,%20AB,%20Canada"
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

    y = 1

    For Each clinicEle In objIE.document.getElementsByClassName("_1FLG5")
        clinicName = clinicEle.getElementsByClassName("_1FLG5")(0).innerText
        Sheets("Sheet1").Range("A" & y).Value = clinicName
        y = y + 1
    Next

    x = 1

    For Each clinicAdd In objIE.document.getElementsByClassName("_1-Gov")
        clinicAdd = clinicAdd.getElementsByClassName("_1-Gov")(2).innerText
        Sheets("Sheet1").Range("B" & x).Value = clinicAdd
        x = x + 1
    Next


End Sub

1 Answer:

Answer 0 (score: 0)

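The content is loaded dynamically, so you need a wait condition to make sure it has loaded; otherwise your collection ends up with a length of 0. I use querySelectorAll to apply the class names, which returns a nodeList whose .Length can drive a For loop. Ideally, you would add a timeout condition to the wait loop. I show a timed loop here.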

Option Explicit

'Late bound via CreateObject, so the reference below is optional:
'VBE > Tools > References: Microsoft Internet Controls
Public Sub GetData()
    Dim ie As Object
    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = True
        .Navigate2 "https://medimap.ca/Location/Calgary,%20AB,%20Canada"

        While .Busy Or .readyState < 4: DoEvents: Wend

        Dim clinics As Object, addresses As Object, i As Long
        With .document

            Do
                Set clinics = .querySelectorAll("._1FLG5")
                Set addresses = .querySelectorAll("._1-Gov")
            Loop While clinics.Length = 0

            For i = 0 To clinics.Length - 1
                With ThisWorkbook.Worksheets("Sheet1")
                    .Cells(i + 1, 1) = Trim$(clinics.item(i).innerText)
                    .Cells(i + 1, 2) = Trim$(addresses.item(i).innerText)
                End With
            Next
        End With
        .Quit
    End With
End Sub
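
The wait loop above keeps polling until at least one clinic node exists, so it would never exit if the class names change or the page fails to load. A minimal sketch of the timeout idea follows. The helper name WaitForNodes and its maxWaitSec parameter are hypothetical (they are not part of the original answer), and the 10-second default is an arbitrary assumption.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): polls the document
' until the CSS selector matches at least one node or the timeout elapses.
Public Function WaitForNodes(ByVal doc As Object, ByVal selector As String, _
                             Optional ByVal maxWaitSec As Double = 10) As Object
    Dim nodes As Object
    Dim startTime As Double
    Dim elapsed As Double

    startTime = Timer                      ' seconds since midnight
    Do
        DoEvents                           ' keep Excel responsive while polling
        Set nodes = doc.querySelectorAll(selector)

        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer wrapped past midnight
        If elapsed > maxWaitSec Then Exit Do            ' give up; nodes may be empty
    Loop While nodes.Length = 0

    Set WaitForNodes = nodes
End Function
```

Inside the With ie block, the Do...Loop could then be replaced by something like Set clinics = WaitForNodes(.document, "._1FLG5") followed by a check of clinics.Length, so that an empty result is reported instead of looping forever. It may also be worth confirming that addresses.Length matches clinics.Length before indexing both collections with the same i.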