Excel VBA - 通过MSXML2.XMLHTTP的HttpReq - 加载页面后获取页面

时间:2013-06-21 08:13:58

标签: vba excel-vba httpwebrequest xmlhttprequest excel

我从基于内部网络的Dataservice(cognos)获取数据时遇到问题。 基本上我把一个GET请求放在一起,比如“blah.com/cognosapi.dll?product=xxx&date=yyy ...”,将它发送到服务器并接收一个我可以存储为的网页HTML并稍后解析为我的excel表单。

我构建了一个过去运行良好的VBA程序,但是web服务现在改变了,它们正在显示“你的报告正在运行”页面,介于1秒到30秒之间。因此,当我调用我的功能时,我总是下载这个“你的报告正在运行”的数据页面。如何在“报表运行”页面后捕获自动加载的页面?

这是具有GETstring和目标路径作为参数的DownloadFile函数。

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean


Dim HttpReq As Object
Set HttpReq = CreateObject("MSXML2.XMLHTTP")

Dim HtmlDoc As New MSHTML.HTMLDocument


HttpReq.Open "GET", sSourceUrl, False
HttpReq.send


If HttpReq.Status = 200 Then
    HttpReq.getAllResponseHeaders
    HtmlDoc.body.innerHTML = HttpReq.responseText
    Debug.Print HtmlDoc.body.innerHTML

End If

  'Download the file. BINDF_GETNEWESTVERSION forces
  'the API to download from the specified source.
  'Passing 0& as dwReserved causes the locally-cached
  'copy to be downloaded, if available. If the API
  'returns ERROR_SUCCESS (0), DownloadFile returns True.

  DownloadFile = URLDownloadToFile(0&, _
                                    sSourceUrl, _
                                    sLocalFile, _
                                    BINDF_GETNEWESTVERSION, _
                                    0&) = ERROR_SUCCESS

End Function

由于 大卫

2 个答案:

答案 0 :(得分:1)

最后你给了我解决问题的最后一个链接。我将代码烘焙到我的DownloadFile函数中以保持IE对象直到结束然后关闭它。

我发现的一个错误是在使用HTMLObject完成任何操作之前应该轮询readystate。

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean

Dim IE As InternetExplorer
Set IE = New InternetExplorer



Dim HtmlDoc As New MSHTML.HTMLDocument
Dim collTables As MSHTML.IHTMLElementCollection
Dim collSpans As MSHTML.IHTMLElementCollection
Dim objSpanElem As MSHTML.IHTMLSpanElement

Dim fnum As Integer

With IE
    'May changed to "false if you don't want to see browser window"
    .Visible = True   
    .Navigate (sSourceUrl)
    'this waits for the page to be loaded
     Do Until .readyState = 4: DoEvents: Loop 
End With

'Set HtmlDoc = wait_for_html(sSourceUrl, "text/css")
Do
    Set HtmlDoc = IE.Document

    'searching for the "Span" tag
    Set collSpans = HtmlDoc.getElementsByTagName("span") 

   'first Span element cotains...
    Set objSpanElem = collSpans(0) 

'... this if loading screen is display
Loop Until Not objSpanElem.innerHTML = "Your report is running." 

'just grab the tables and leave the rest    
Set collTables = HtmlDoc.getElementsByTagName("table") 

fnum = FreeFile()
Open sLocalFile For Output As fnum ' save the file and add html and body tags
Print #fnum, "<html>"
Print #fnum, "<body>"

Print #fnum, collTables(15).outerHTML 'title
Print #fnum, collTables(17).outerHTML 'Date
Print #fnum, collTables(18).outerHTML 'Part, Operation etc.
Print #fnum, collTables(19).outerHTML 'Measuerements

Print #fnum, "</body>"
Print #fnum, "</html>"

Close #fnum
IE.Quit 'close Explorer

DownloadFile = True

End Function

答案 1 :(得分:0)

由于您正在使用GET请求,我假设可以在URL字符串中提供任何必需的参数。在这种情况下,您可以使用InternetExplorer.Application,它会在页面刷新时自动更新其Document属性。然后,您可以设置一个循环,定期检查所需页面的某些值(标记文本,URL等)。

以下是加载网址的示例,然后等待网页的<title>标记为所需的值。

Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)

Function wait_for_html(strURL as String, strDesiredText as String) as String

    Dim IE As InternetExplorer
    Set IE = New InternetExplorer

    IE.Navigate (strURL)

    While IE.ReadyState <> 4
        Sleep 10
    Wend

    Dim objHtml As MSHTML.HTMLDocument
    Dim collTitle As MSHTML.IHTMLElementCollection
    Dim objTitleElem As MSHTML.IHTMLTitleElement

    Do
        Sleep 1000
        Set objHtml = IE.Document
        Set collTitle = objHtml.getElementsByTagName("title")
        Set objTitleElem = collTitle(0)

    Loop Until objTitleElem.Text = strDesiredText

    wait_for_html = objHtml.body.innerHTML

End Function

以上需要引用Microsoft Internet Controls和Microsoft HTML Object Library。