单击提交按钮后,VBA从旧页面读取HTML

时间:2018-04-13 09:32:57

标签: html excel vba excel-vba web-scraping

我不是程序员,但我已经设法在VBA中学到了一些东西,但现在在某个网站上我遇到了一个其他问题不存在的问题。

应该发生的是,页面表单应该包含数据,点击提交按钮然后我想从结果页面获取一些数据。

第一阶段工作正常,但似乎无论我做什么,VBA仍然会在点击提交之前从页面读取数据。

代码是:

Sub VIES2()

'Uruchomienie Internet Explorera i wstrzymanie dalszej akcji aż uzyska stan gotowości
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.Navigate "http://ec.europa.eu/taxation_customs/vies/?locale=pl"

Do While IE.ReadyState <> 4: DoEvents: Loop

'Wypełnienie formularza odpowiednimi wartościami i kliknięcie przycisku sprawdzenia
IE.document.getElementbyId("countryCombobox").Value = "IT"
IE.document.getElementbyId("number").Value = "01802840023"
IE.document.getElementbyId("requesterCountryCombobox").Value = "IT"
IE.document.getElementbyId("requesterNumber").Value = "01802840023"
IE.document.getElementbyId("submit").Click

'Test uzyskiwania opisu i identyfikatora zapytania

For t = 1 To 999999
Next t

Application.Wait Now + TimeValue("00:00:10")

Do While IE.ReadyState <> 4: DoEvents: Loop

For t = 1 To 999999
Next t

Application.Wait Now + TimeValue("00:00:10")

MsgBox IE.LocationURL

Set Text = IE.document.getElementsbyClassName("layout-content")

For Each Element In Text
MsgBox Element.innerText
Next

Set Test = IE.document.getElementsbyTagName("TABLE")

For Each Element In Test
MsgBox Element.innerText
Next

End Sub

我已经尝试过将break,各种等待循环和Application.Wait建议在类似的问题中建议它似乎有效。在这里,即使在完全加载后页面很长,代码仍然会读取旧页面 - 至少拉动URL,一些数据似乎表明情况就是这样。

更新:我还应该补充一点,我试图让宏刷新页面,但它会清除输入内容。有趣的是,目标网址是:

http://ec.europa.eu/taxation_customs/vies/vatResponse.html

如果我将初始页面更改为此页面,浏览器会立即重定向到原始页面,并通知需要初始数据。宏然后完成数据并单击提交按钮。在这种情况下,IE.LocationURL表示此URL:

http://ec.europa.eu/taxation_customs/vies/vatResponse.html

但根据我使用getElementsbyClassName获取的内容仍然从初始页面读取元素:

http://ec.europa.eu/taxation_customs/vies/?locale=pl

3 个答案:

答案 0 :(得分:2)

这可以打印出增值税回复表

注意:

如果在32位上删除PtrSafe

<强>代码:

Option Explicit
Declare PtrSafe Sub sleep Lib "kernel32" Alias "Sleep" (ByVal dwmilliseconds As Long)

Public Sub VIES2()
    Application.ScreenUpdating = False
    Dim IE As Object

    'Uruchomienie Internet Explorera i wstrzymanie dalszej akcji az uzyska stan gotowosci
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate "http://ec.europa.eu/taxation_customs/vies/?locale=pl"

    Do While IE.ReadyState <> 4: DoEvents: Loop

    'Wypelnienie formularza odpowiednimi wartosciami i klikniecie przycisku sprawdzenia
    IE.document.getElementById("countryCombobox").Value = "IT"
    IE.document.getElementById("number").Value = "01802840023"
    IE.document.getElementById("requesterCountryCombobox").Value = "IT"
    IE.document.getElementById("requesterNumber").Value = "01802840023"
    IE.document.getElementById("submit").Click

    sleep (5000) 'or increase to 10000
    Dim tbl  As Object

    Set tbl = IE.document.getElementById("vatResponseFormTable")

    Dim ws As Worksheet
    Set ws = ActiveWorkbook.Worksheets.Add
    ws.Name = "Results"
    Dim rng As Range, currentRow As Object, currentColumn As Object, i As Long, outputRow As Long

        outputRow = outputRow + 1
        Set rng = ws.Range("B" & outputRow)

        For Each currentRow In tbl.Rows
            For Each currentColumn In currentRow.Cells
                rng.Value = currentColumn.outerText
                Set rng = rng.Offset(, 1)
                i = i + 1
            Next currentColumn
            outputRow = outputRow + 1
            Set rng = rng.Offset(1, -i)
            i = 0
        Next currentRow
        Application.ScreenUpdating = True
End Sub

<强>输出:

Output

答案 1 :(得分:1)

虽然QHarr的解决方案在我的最终工作,但我提供了另一个没有硬编码延迟的脚本。

使用IE作为您的问题是:

Sub Get_Data()
    Dim HTML As HTMLDocument, post As Object, elems As Object
    Dim elem As Object, r&, c&

    With New InternetExplorer
        .Visible = False
        .navigate "http://ec.europa.eu/taxation_customs/vies/?locale=pl"
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document

        With HTML
            .getElementById("countryCombobox").Value = "IT"
            .getElementById("number").Value = "01802840023"
            .getElementById("requesterCountryCombobox").Value = "IT"
            .getElementById("requesterNumber").Value = "01802840023"
            .getElementById("submit").Click

            Do: Set post = .getElementById("vatResponseFormTable"): DoEvents: Loop While post Is Nothing

            For Each elems In post.Rows
                For Each elem In elems.Cells
                    c = c + 1: Cells(r + 1, c) = elem.innerText
                Next elem
                c = 0: r = r + 1
            Next elems
        End With
        .Quit
    End With
End Sub

参考添加到库:

1. Microsoft Internet Controls
2. Microsoft HTML Object Library

使用xmlhttp请求(它比IE快):

Sub Get_Data()
    Dim elems, elem As Object
    Dim QueryString$, S$, r&, c&

    QueryString = "memberStateCode=IT&number=01802840023&traderName=&traderStreet=&traderPostalCode=&traderCity=&requesterMemberStateCode=IT&requesterNumber=01802840023&action=check&check=Weryfikuj"

    With New XMLHTTP
        .Open "POST", "http://ec.europa.eu/taxation_customs/vies/vatResponse.html", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
        .send QueryString
        S = .responseText
    End With

    With New HTMLDocument
        .body.innerHTML = S

        For Each elems In .getElementById("vatResponseFormTable").Rows
            For Each elem In elems.Cells
                c = c + 1: Cells(r + 1, c) = elem.innerText
            Next elem
            c = 0: r = r + 1
        Next elems
    End With
End Sub

参考添加到库:

1. Microsoft XML, V6
2. Microsoft HTML Object Library

答案 2 :(得分:0)

大多数情况下,您应该搜索是否有可用于实现此类任务的REST / SOAP。 使用SOAP实例是完全矫枉过正的。

尝试使用Function IsVatValid(country_code, vat_number) Dim objHTTP As Object Dim xmlDoc As Object Set objHTTP = CreateObject("WinHttp.WinHttpRequest.5.1") sURL = "http://ec.europa.eu/taxation_customs/vies/services/checkVatService" sEnv = "<s11:Envelope xmlns:s11='http://schemas.xmlsoap.org/soap/envelope/'>" & _ "<s11:Body>" & _ "<tns1:checkVat xmlns:tns1='urn:ec.europa.eu:taxud:vies:services:checkVat:types'>" & _ "<tns1:countryCode>" & country_code & "</tns1:countryCode>" & _ "<tns1:vatNumber>" & vat_number & "</tns1:vatNumber>" & _ "</tns1:checkVat>" & _ "</s11:Body>" & _ "</s11:Envelope>" objHTTP.Open "Post", sURL, False objHTTP.setRequestHeader "Content-Type", "text/xml" objHTTP.setRequestHeader "SOAPAction", "checkVatService" objHTTP.send (sEnv) objHTTP.waitForResponse Set xmlDoc = CreateObject("HTMLFile") xmlDoc.body.innerHTML = objHTTP.responsetext IsVatValid = CBool(xmlDoc.getElementsByTagName("valid")(0).innerHTML) Set xmlDoc = Nothing Set objHTTP = Nothing End Function 服务验证增值税号码的简单功能:

Debug.Print IsVatValid("IT", "01802840023")
>>> True

然后您可以简单地验证所有增值税号码:

{{1}}