将InternetExplorer转换为XML

时间:2019-04-07 17:13:56

标签: excel xml vba

我刚刚发现了xmlxmlhttp,这对我来说是全新的。 我正在尝试创建一个宏,该宏将遍历J列中所有从第2行(第1行的标题)开始的网站列表。从每个网站获取我想要的信息,然后将其显示在获取信息的网站旁边的K列中。

J列具有从J2开始的网站列表。假设它将一直下降到J10。从每个网站上,都有一些我想获取的信息,因此宏将访问位于J2的网站,获取该信息并将其粘贴到K2中,然后访问位于J3的网站,将该信息粘贴到K3中,依此类推。我已经在J列中有了一个现有的网站列表,该列表也很动态。

这是我要转换为xml / xmlhttp的IE浏览器的当前代码。

Sub CommandButton1_Click()
    Dim ie As Object
    Dim lastrow As Integer
    Dim i As Integer
    Dim myURL As String
    Dim sdd As String
    Dim add As Variant
    Dim html As Object
    Dim mylinks As Object
    Dim mylink As Object
    Dim result As String

    ' Create InternetExplorer Object
    Set ie = CreateObject("InternetExplorer.Application")

    lastrow = Sheet1.Cells(Rows.Count, "J").End(xlUp).Row
    For i = 2 To lastrow
    myURL = Sheet1.Cells(i, "J").Value

    ' Hide InternetExplorer
    ie.Visible = False

    ' URL to get data from
    ie.navigate myURL

    ' Loop until page fully loads
    Do While ie.readystate <> READYSTATE_COMPLETE
    Loop

    ' Information i want to get from the URLs
    sdd = ie.document.getelementsbyclassname("timeline-text")(0).innerText

    ' Format the result
    add = Split(sdd, "$")
    Range("K3") = add(1)

    ' Close InternetExplorer
    ie.Quit

    'Return to Normal?
    ie.Visible = True
    End
    Next
    ' Clean up
    Set ie = Nothing

    Application.StatusBar = ""

End Sub

我试图获得“ 85100”,而不是$ 85,100

<span class="font-size-base font-normal">Est.</span>
<span itemprop="price" content="85100">
$85,100
</span>

我希望你能帮助我解决这个问题。

谢谢。

1 个答案:

答案 0 :(得分:0)

我将构建如下所示的结构,其中IE对象是在循环外部创建的。您始终使用css选择器。您可能需要定时循环以确保元素出现在页面上。如图所示,使用适当的页面加载等待时间。

使用显式工作表名称将工作表放入要使用的变量中。

您可能要添加一个myURL包含http / https的测试,因为您的范围可能为空白单元格,并且只希望使用可能的url值。

Option Explicit   
Public Sub CommandButton1_Click()
    Dim ie As Object, lastrow As Long, i As Long
    Dim myURL As String, sdd As String, ws As Worksheet

    Set ws = ThisWorkbook.Worksheets("Sheet1")   ' <change as required  
    Set ie = CreateObject("InternetExplorer.Application")
    lastrow = ws.Cells(Rows.Count, "J").End(xlUp).Row

    With ie
        .Visible = False

        For i = 2 To lastrow

            myURL = ws.Cells(i, "J").Value

            .navigate2 myURL

            While .Busy Or .readyState < 4: DoEvents: Wend

            sdd = .document.querySelector(".price").getAttribute("content")
            ws.Cells(i, "K") = sdd

        Next
        .Quit
    End With
    'Application.StatusBar = ""
End Sub

定时循环:

Public Sub CommandButton1_Click()
    Dim ie As Object, lastrow As Long, i As Long, t As Date, ele As Object
    Const MAX_WAIT_SEC As Long = 10
    Dim myURL As String, sdd As String, ws As Worksheet

    Set ws = ThisWorkbook.Worksheets("Sheet1")   ' <change as required
    Set ie = CreateObject("InternetExplorer.Application")
    lastrow = ws.Cells(rows.Count, "J").End(xlUp).Row

    With ie
        .Visible = False

        For i = 2 To lastrow

            myURL = ws.Cells(i, "J").Value

            .Navigate2 myURL

            While .Busy Or .readyState < 4: DoEvents: Wend
            t = Timer
            Do
                On Error Resume Next
                Set ele = HTMLDoc.querySelector(".price")
                On Error GoTo 0
                If Timer - t > MAX_WAIT_SEC Then Exit Do
            Loop While ele Is Nothing

            If Not ele Is Nothing Then
                sdd = ele.getAttribute("content")
                ws.Cells(i, "K") = sdd
            End If

        Next
        .Quit
    End With
    'Application.StatusBar = vbnullstring
End Sub