VBA - 点击网站上的按钮并将数据文件下载到Excel中

时间:2017-01-17 04:23:41

标签: excel vba excel-vba internet-explorer web-scraping

我正在尝试从网站下载数据并将其放入我的Excel文件中。

以下是我正在尝试做的步骤:

1)访问网站:http://www.housepriceindex.ca/default.aspx

2)点击“下载历史数据(.xls)”(我被困在这里)

3)输入底部的电子邮件地址(abc@abc.com)

4)点击接受

5)将刚刚下载的.xls文件中的数据传输到我的文件中。

目前为止的代码

Sub GetData()


Dim i As Long
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object

Set IE = CreateObject("InternetExplorer.Application")

IE.Visible = True
IE.Navigate "http://www.housepriceindex.ca/default.aspx" 

Do While IE.Busy: DoEvents: Loop
Do Until IE.ReadyState = READYSTATE_COMPLETE: DoEvents: Loop

IE.Document.GetElementByID(lnkTelecharger2).Click


End Sub

感谢任何帮助!

2 个答案:

答案 0 :(得分:1)

所以这就是我要做的事情:

Sub GetData()
Dim IE As InternetExplorer
Dim HTMLDoc As HTMLDocument
Dim objElement As HTMLObjectElement

Set IE = New InternetExplorer
With IE
    .Visible = True
    .Navigate "http://www.housepriceindex.ca/default.aspx"
    While .Busy Or .ReadyState <> READYSTATE_COMPLETE: Wend
    .Document.getElementById("lnkTelecharger2").Click
    While .Busy Or .ReadyState <> READYSTATE_COMPLETE: Wend
    Set HTMLDoc = .Document
    Set objElement = HTMLDoc.getElementById("txtEmailDisclaimerEN")
    objElement.Value = "Email Address"
    Set objElement = HTMLDoc.getElementById("lnkAcceptDisclaimerEN")        
    objElement.Click

    ' ... Get CSV somehow ...

    .Quit
End With

Set IE = Nothing
End Sub

现在显然仍然存在捕获CSV的问题 - 我尝试了许多不同的方法将工作簿变量设置为CSV而没有运气...问题是您单击的按钮不会出现问题; t包含您要下载的CSV的URL,因此除非其他人知道如何捕获提示打开或保存的文件,否则我可以帮助您..

此致 TheSilkRoad

答案 1 :(得分:1)

您可以通过按钮的类名(button small download)来定位按钮:

IE.document.querySelector(".button.small.download").Click

IE.document.getElementsByClassName("button small download")(0).Click

但是:

如果您了解使用JSON的方式,则可以完全避免登录。您要单击的按钮上方有一个JSON链接:

data-data_url:

data

您可以将其与XMLHTTPRequest结合使用以获取JSON数据,然后使用诸如JSONConverter.之类的工具解析响应。将.bas添加到项目中之后,您需要转到VBE>工具>参考,并添加对Microsoft脚本运行时的引用。

这里只是一个概述,显示了设置初始JSON对象并提取一些信息的过程。

Option Explicit
Public Sub GetInfo()
    Dim strURL As String, strJSON As String, Http As Object, json As Object
    Application.ScreenUpdating = False
    strURL = "https://housepriceindex.ca/_data/indx_data.json?d=4dfb05da"

    Set Http = CreateObject("MSXML2.XMLHTTP")
    With Http
        .Open "GET", strURL, False
        .send
        strJSON = .responseText
    End With
    Set json = JsonConverter.ParseJson(strJSON)

    Dim key As Variant, dictKeys As Variant
    '****************************************
    ' Set json = json("data") ' Array("indx", "spc", "indx_ch", "spc_ch", "Meta", "Data") '<== These are the keys in that dict.

    Set json = json("profiles")                  ' Array("c11", "mc","ab_calgary","ab_edmonton","bc_abbotsford","bc_kelowna" , _
                                                 "bc_vancouver","bc_victoria","mb_winnipeg","ns_halifax","on_barrie" , _
                                                 "on_brantford","on_guelph","on_hamilton","on_kingston","on_kitchener", _
                                                 "on_london","on_oshawa","on_ottawa","on_peterborough","on_st_catharines" , _
                                                 "on_sudbury","on_thunder_bay","on_toronto","on_windsor","qc_montreal","qc_quebec_city") '<==Keys in profile dict

    Dim dict As Object, rowNumber As Long
    Set dict = json("qc_montreal")

    With ThisWorkbook.Worksheets("Sheet1")
        For Each key In dict
             rowNumber = rowNumber + 1
            .Cells(rowNumber, 1) = key
            .Cells(rowNumber, 2) = dict(key)
        Next key
    End With
    Application.ScreenUpdating = True
End Sub

样本工作表输出:

Data sample


示例源JSON:

Sample

第二级字典中的C11信息很多,位于顶级字典("JSON")键"data"下。

data