Excel Web Query Submit Issues

Time: 2017-08-30 20:38:07

Tags: html excel vba excel-vba web-scraping

I'm trying to get data (the dollar exchange rate) from http://www4.bcb.gov.br/pec/taxas/port/ptaxnpesq.asp?id=txcotacao into an Excel spreadsheet.

I've tried pasting it as a refreshable web query. However, the page always opens one step earlier, on a form that already has default inputs (which work for me), and the query then copies the content of that form page instead of the results.

I also tried to write code to submit the form, but I didn't get it right. I tried .submit, .Click, .FireEvent and many other things I found on the internet.

I tried to refer to the button by its name, class, tag, ...


I also tried to trigger the form directly, or to bypass it, but that didn't work either.


Can anyone help me with this piece of code?

1 answer:

Answer 0: (score: 1)

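You can use the bcb.gov.br Open Data Portal.

Send a request for a JSON response with the conversion rates, using the Exchange rates – daily bulletins service.

With the response received you could, among other methods, either: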

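  1. Use JSON Converter to convert the response into a JSON object and work with that;
  2. Parse the response as a string with a regular expression to get the values.

View from the site of today's results:

Input:

[Image: Input]

Output:

[Image: Rates]

Result:

You can see that 1 USD = 3,7048 BRL.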

Using a JSON object:

Example string to make the request with:

    "https://olinda.bcb.gov.br/olinda/service/PTAX/version/v1/odata/ExchangeRatePeriod(moeda=@moeda,dataInicial=@dataInicial,dataFinalCotacao=@dataFinalCotacao)?%40moeda=%27" & TARGET_CURRENCY & "%27&%40dataInicial=%27" & START_DATE & "%27&%40dataFinalCotacao=%27" & END_DATE & "%27&%24format=json"
    

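I have included the start date, end date and currency in the string, and specified JSON as the response format. The dates I picked match the view of the site shown in the image above.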
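As a small illustration of how that string is assembled, the concatenation can be wrapped in a helper. This is only a sketch: `BuildPtaxUrl` is a hypothetical name, not something from the service or the original answer, and it assumes the dates must be written month-day-year as in the example string.

```vba
' Hypothetical helper (not part of the original answer): builds the request URL
' for one currency and a date range. Dates are formatted as MM-DD-YYYY,
' which is an assumption based on the "06-13-2018" form used in the example.
Public Function BuildPtaxUrl(ByVal currencyCode As String, _
                             ByVal startDate As Date, _
                             ByVal endDate As Date) As String
    Const BASE_URL As String = "https://olinda.bcb.gov.br/olinda/service/PTAX/version/v1/odata/" & _
        "ExchangeRatePeriod(moeda=@moeda,dataInicial=@dataInicial,dataFinalCotacao=@dataFinalCotacao)"

    ' %40 is "@", %27 is "'" and %24 is "$" in URL encoding.
    BuildPtaxUrl = BASE_URL & _
        "?%40moeda=%27" & currencyCode & "%27" & _
        "&%40dataInicial=%27" & Format$(startDate, "mm-dd-yyyy") & "%27" & _
        "&%40dataFinalCotacao=%27" & Format$(endDate, "mm-dd-yyyy") & "%27" & _
        "&%24format=json"
End Function
```

Calling `BuildPtaxUrl("USD", DateSerial(2018, 6, 13), DateSerial(2018, 6, 13))` should give the same kind of string as the example above, with the dates taken from real `Date` values instead of hand-typed text.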

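The JSON response is as follows:

[Image: JSON response]

I read the response into a string variable and then convert it with JsonConverter.ParseJson(strJSON) into a JSON object stored in the json variable. A quick check of the structure:

[Image: JSON structure]

The leading "{" tells me json is a dictionary.

[Image: dictionary]

I can also see that json("value") is a collection of dictionaries, and that the value I am interested in, 3,7048 (remember the image of the site above), is stored as "cotacaoCompra".

So I can use the following script to access the value. The JSON response actually gives the rates at 5 different times on that date, and all of them are printed. We can see that the Fechamento (closing) bulletin rate matches 3,7048.

Code: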

    Option Explicit
    Public Sub GetInfo()
        Dim strURL As String, strJSON As String, item As Variant, http As Object, json As Object
        Const TARGET_CURRENCY As String = "USD"
        Const START_DATE As String = "06-13-2018"
        Const END_DATE As String = "06-13-2018"
    
        strURL = "https://olinda.bcb.gov.br/olinda/service/PTAX/version/v1/odata/ExchangeRatePeriod(moeda=@moeda,dataInicial=@dataInicial,dataFinalCotacao=@dataFinalCotacao)?%40moeda=%27" & TARGET_CURRENCY & "%27&%40dataInicial=%27" & START_DATE & "%27&%40dataFinalCotacao=%27" & END_DATE & "%27&%24format=json"
    
        Set http = CreateObject("MSXML2.XMLHTTP")
        http.Open "GET", strURL, False
        http.send
        strJSON = http.responseText
        Set json = JsonConverter.ParseJson(strJSON)
    
        For Each item In json("value")
            Debug.Print "rate " & item("cotacaoCompra") & " at " & item("dataHoraCotacao")
        Next item
    End Sub
    

Script output:

[Image: Script output]

Notes:

Requires JsonConverter.bas to be added to the project, and a reference set via VBE > Tools > References > Microsoft Scripting Runtime.
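Since the question asks for the rates in a spreadsheet rather than the Immediate window, the same loop can write to cells instead. The following is a minimal sketch, not part of the original answer: `WriteRatesToSheet` is a hypothetical name, writing to the active sheet starting at A1 is an arbitrary choice, and it assumes JsonConverter.bas is imported as described in the notes.

```vba
' Sketch only: writes each bulletin's timestamp and buy rate to the active sheet.
' Assumes JsonConverter.bas is in the project and strURL is a PTAX request URL
' like the one used in GetInfo.
Public Sub WriteRatesToSheet(ByVal strURL As String)
    Dim http As Object, json As Object, item As Variant
    Dim r As Long

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", strURL, False
    http.send

    ' Stop early if the server did not answer with HTTP 200.
    If http.Status <> 200 Then
        Debug.Print "Request failed: " & http.Status & " " & http.statusText
        Exit Sub
    End If

    Set json = JsonConverter.ParseJson(http.responseText)

    With ActiveSheet
        ' Header row, then one row per entry in json("value").
        .Cells(1, 1).Value = "dataHoraCotacao"
        .Cells(1, 2).Value = "cotacaoCompra"
        r = 2
        For Each item In json("value")
            .Cells(r, 1).Value = item("dataHoraCotacao")
            .Cells(r, 2).Value = item("cotacaoCompra")
            r = r + 1
        Next item
    End With
End Sub
```

The status check is there so that a failed request is reported instead of being handed to the JSON parser.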

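Using a regex to parse the responseText to get the rates:

The regex I will use is

    "cotacaoCompra":\d{1,}\.\d{1,}
    

This looks for the literal string "cotacaoCompra":, followed by one or more digits, then a literal "." (escaped as \. so that it does not match any character), then one or more digits.

[Image: Example matches]

I then have to remove the string "cotacaoCompra": with a direct replace. Ideally I would extract just the number with "(?<=""cotacaoCompra"":)\d{1,}\.\d{1,}", a lookbehind that basically says "after, but not including, "cotacaoCompra":". But VBScript.RegExp does not appear to support this.

With that in mind, here is the script that gets the rates using a regex:

Code: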

    Public Sub GetInfo2()
        ' Same request as GetInfo, but the rates are pulled out of the raw text with a regex.
        Dim strURL As String, strJSON As String, http As Object
        Const TARGET_CURRENCY As String = "USD"
        Const START_DATE As String = "06-13-2018"
        Const END_DATE As String = "06-13-2018"
    
        strURL = "https://olinda.bcb.gov.br/olinda/service/PTAX/version/v1/odata/ExchangeRatePeriod(moeda=@moeda,dataInicial=@dataInicial,dataFinalCotacao=@dataFinalCotacao)?%40moeda=%27" & TARGET_CURRENCY & "%27&%40dataInicial=%27" & START_DATE & "%27&%40dataFinalCotacao=%27" & END_DATE & "%27&%24format=json"
    
        Set http = CreateObject("MSXML2.XMLHTTP")
        http.Open "GET", strURL, False
        http.send
        strJSON = http.responseText
        Dim Matches As Object
        With CreateObject("VBScript.RegExp")
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            ' \. matches a literal decimal point; an unescaped . would match any character.
            .Pattern = """cotacaoCompra"":\d{1,}\.\d{1,}"  'The pattern I really wanted, "(?<=""cotacaoCompra"":)\d{1,}\.\d{1,}", doesn't appear to be supported
    
            If Not .test(strJSON) Then Exit Sub
            Set Matches = .Execute(strJSON)
    
            Dim match As Object
            For Each match In Matches
                ' Strip the key so that only the number is printed.
                Debug.Print Replace(match.Value, """cotacaoCompra"":", vbNullString)
            Next
        End With
    End Sub
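
Since lookbehind is not available, one alternative to stripping the key with Replace is a capture group, read back through the match's SubMatches collection. The sketch below is an addition, not part of the original answer: `PrintRatesWithSubMatches` is a hypothetical name, and it takes the JSON text (for example http.responseText) as its argument.

```vba
' Sketch: extracts the rates with a capture group instead of Replace.
' strJSON is the raw response text, e.g. http.responseText from the request above.
Public Sub PrintRatesWithSubMatches(ByVal strJSON As String)
    Dim re As Object, matches As Object, m As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' The parentheses capture only the number; \. matches a literal decimal point.
    re.Pattern = """cotacaoCompra"":(\d+\.\d+)"

    Set matches = re.Execute(strJSON)
    For Each m In matches
        ' SubMatches(0) holds the text matched by the first capture group.
        Debug.Print m.SubMatches(0)
    Next m
End Sub
```

If the pattern finds nothing, Execute returns an empty collection and the loop simply does not run, so no separate test call is needed here.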