Applying XPath to parse an XML document using VBA

Time: 2017-09-30 14:32:31

Tags: vba xpath web-scraping xml-parsing xmldom

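Today I came across a blog with a demo showing how to parse items from an XML document in VBA by applying XPath. It would be great to be able to do the same with XML served from a website.

Here is how it is done from a locally saved file: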

Sub XML_Parsing()
    Dim xml As Object, post As Object
    Dim x As Long   ' output row counter

    ' Late binding: no library reference is required
    Set xml = CreateObject("MSXML2.DOMDocument")
    xml.async = False: xml.validateOnParse = False
    xml.Load ThisWorkbook.Path & "\htdocs.txt"

    ' Write one row per <List> element to the active sheet
    For Each post In xml.SelectNodes("//DistributionLists/List")
        x = x + 1
        Cells(x, 1) = post.SelectNodes(".//Name")(0).Text
        Cells(x, 2) = post.SelectNodes(".//TO")(0).Text
        Cells(x, 3) = post.SelectNodes(".//CC")(0).Text
        Cells(x, 4) = post.SelectNodes(".//BCC")(0).Text
    Next post
End Sub

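The code above reads the text file below, named "htdocs.txt". Because it builds the path from ThisWorkbook.Path, the file must sit in the same folder as the workbook (the Desktop, in my case).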

<?xml version="1.0" encoding="utf-8"?>
<DistributionLists>
    <List>
        <Name>Recon</Name>
        <TO>John;Bob;Rob;Chris</TO>
        <CC>Jane;Ashley</CC>
        <BCC>Brent</BCC>
    </List>
    <List>
        <Name>Safety Metrics</Name>
        <TO>Tom;Casper</TO>
        <CC>Ashley</CC>
        <BCC>John</BCC>
    </List>
    <List>
        <Name>Performance Report</Name>
        <TO>Huck;Ashley</TO>
        <CC>Tom;Andrew</CC>
        <BCC>John;Seema</BCC>
    </List>
</DistributionLists>

Extracted results:

Recon   John;Bob;Rob;Chris  Jane;Ashley Brent
Safety Metrics  Tom;Casper  Ashley  John
Performance Report  Huck;Ashley Tom;Andrew  John;Seema
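
The macro above assumes that the file parses cleanly and that every List element has all four children. A minimal sketch of a more defensive variant follows. It loads the XML from a string with LoadXML, checks parseError, and writes an empty cell when a child element is missing. NodeText and ParseListsFromString are hypothetical helper names introduced only for this illustration; they are not part of MSXML.

```vba
Option Explicit

' Hypothetical helper: text of the first node matching xpath, or "" if there is none.
Function NodeText(ByVal parent As Object, ByVal xpath As String) As String
    Dim node As Object
    Set node = parent.SelectSingleNode(xpath)
    If Not node Is Nothing Then NodeText = node.Text
End Function

' Hypothetical helper: parses xmlText, writes one row per <List> element
' to the active sheet, and returns the number of rows written.
Function ParseListsFromString(ByVal xmlText As String) As Long
    Dim xml As Object, post As Object
    Dim r As Long

    Set xml = CreateObject("MSXML2.DOMDocument.6.0")
    xml.async = False
    xml.validateOnParse = False

    ' LoadXML returns False on malformed input; parseError explains why.
    If Not xml.LoadXML(xmlText) Then
        Err.Raise vbObjectError + 1, , "XML parse error: " & xml.parseError.reason
    End If

    For Each post In xml.SelectNodes("//DistributionLists/List")
        r = r + 1
        Cells(r, 1).Value = NodeText(post, ".//Name")
        Cells(r, 2).Value = NodeText(post, ".//TO")
        Cells(r, 3).Value = NodeText(post, ".//CC")
        Cells(r, 4).Value = NodeText(post, ".//BCC")
    Next post

    ParseListsFromString = r
End Function
```

Calling ParseListsFromString with the contents of "htdocs.txt" should fill the same cells as the original macro, while a List without, say, a BCC element leaves that cell blank instead of raising a run-time error.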

Now, I have two questions:

1. How can I parse the same kind of XML from a website, such as "example.com", instead of a local file? For an HTML document I could load it with "html.body.innerHTML = http.responseText", but what is the process in this case?
2. If I do the above using early binding, which library reference should I add?

1 Answer:

Answer 0 (score: 1)

It seems I have already found the solution. Here it is:

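Sub XML_Parsing()
    ' Early binding: requires a reference to "Microsoft XML, v6.0" (Tools > References)
    Dim http As New XMLHTTP60
    Dim xmldoc As Object, post As Object
    Dim x As Long   ' output row counter

    With http
        .Open "GET", "http://wservice.viabicing.cat/v1/getstations.php?v=1", False
        .send
        ' responseXML is the response body already parsed into a DOM document,
        ' so it does not need to be reloaded with LoadXML
        Set xmldoc = .responseXML
    End With

    ' Write the latitude and longitude of each <station> to the active sheet
    For Each post In xmldoc.SelectNodes("//station")
        x = x + 1
        Cells(x, 1) = post.SelectNodes(".//lat")(0).Text
        Cells(x, 2) = post.SelectNodes(".//long")(0).Text
    Next post
End Sub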

Partial results:

41.40  2.180042
41.39553    2.17706
41.393699   2.181137
41.39347    2.18149
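
For comparison, here is a hedged sketch of the same request with late binding, which needs no library reference. It parses the response text explicitly with LoadXML, which is the closest XML counterpart to assigning responseText to html.body.innerHTML in the HTML case. ParseXmlText, FetchXml, and Demo_FetchStations are hypothetical names introduced for this illustration, and the sketch assumes the service still responds at the URL used in the answer.

```vba
Option Explicit

' Hypothetical helper: parses xmlText into a DOM document or raises an error.
Function ParseXmlText(ByVal xmlText As String) As Object
    Dim doc As Object
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.LoadXML(xmlText) Then
        Err.Raise vbObjectError + 1, , "XML parse error: " & doc.parseError.reason
    End If
    Set ParseXmlText = doc
End Function

' Hypothetical helper: downloads url synchronously and returns the parsed document.
Function FetchXml(ByVal url As String) As Object
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")   ' late binding
    http.Open "GET", url, False
    http.send
    If http.Status <> 200 Then
        Err.Raise vbObjectError + 2, , "HTTP status " & http.Status & " for " & url
    End If
    Set FetchXml = ParseXmlText(http.responseText)
End Function

' Usage: same output as the answer's macro, one row per <station>.
Sub Demo_FetchStations()
    Dim doc As Object, station As Object
    Dim r As Long
    Set doc = FetchXml("http://wservice.viabicing.cat/v1/getstations.php?v=1")
    For Each station In doc.SelectNodes("//station")
        r = r + 1
        Cells(r, 1).Value = station.SelectSingleNode(".//lat").Text
        Cells(r, 2).Value = station.SelectSingleNode(".//long").Text
    Next station
End Sub
```

When the server already labels the body as XML, reading .responseXML as in the answer is simpler. Parsing responseText yourself mainly helps when you want an explicit parseError message or control over how the document object is created.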