Unable to extract links from XML content

Time: 2018-07-17 12:17:25

Tags: xml vba excel-vba web-scraping

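I've written a script in VBA to pull all the links held in the different nodes of a sitemap, but I can't get it to work.

How can I get those links?

This is what I've tried so far: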

Sub TestXML()
    Dim Http As New XMLHTTP60, Xmldoc As Object
    Dim post As Object, R&

    With Http
        .Open "GET", "https://www.klerenmakendebaby.nl/product-sitemap.xml", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        Set Xmldoc = CreateObject("MSXML2.DOMDocument")
        Xmldoc.LoadXML .responseXML.xml
    End With

    For Each post In Xmldoc.SelectNodes("//url")
        R = R + 1: Cells(R, 1) = post.SelectNodes(".//loc")(0).Text
    Next post
End Sub

When executed, it neither fetches anything nor throws any error.

2 answers:

Answer 0: (score: 1)

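Something like the following? Add a reference to the Microsoft XML library for your version (Tools > References in the VBA editor). I am on Excel 2016, so I use Microsoft XML, v6.0 and DOMDocument60.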

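Option Explicit

Public Sub TestXML()
    ' Requires a reference to Microsoft XML, v6.0
    Dim Http As New XMLHTTP60, Xmldoc As New MSXML2.DOMDocument60
    Dim R&, aNodeList As Object, bNode As IXMLDOMNode

    Application.ScreenUpdating = False
    With Http
        .Open "GET", "https://www.klerenmakendebaby.nl/product-sitemap.xml", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        ' Parse the raw response text rather than .responseXML
        Xmldoc.LoadXML .responseText
    End With

    Set aNodeList = Xmldoc.DocumentElement.SelectNodes("//loc")
    ' Context is the node the query was run from (the document element), so
    ' this loop walks its child <url> nodes, not the nodes matched by "//loc"
    For Each bNode In aNodeList.Context.ChildNodes
        ' The first child of each <url> node is expected to be <loc>
        R = R + 1: Cells(R, 1) = bNode.FirstChild.Text
    Next bNode
    Application.ScreenUpdating = True
End Sub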
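
The loop above reads the children of the selection's context node instead of the nodes matched by "//loc". A likely reason is that sitemap files normally declare a default namespace, and in XPath an unprefixed name such as loc does not match elements in a namespace. A minimal sketch of a namespace-aware query follows. It assumes the sitemap uses the standard http://www.sitemaps.org/schemas/sitemap/0.9 namespace. The prefix "sm" and the procedure name are arbitrary choices for this example, and late binding is used so no library reference is needed.

```vba
Option Explicit

' Sketch only: assumes the root element declares the standard sitemap
' namespace. The prefix "sm" is a made-up name used only in the XPath query.
Public Sub TestXMLWithNamespace()
    Dim Http As Object, Xmldoc As Object
    Dim locNode As Object, R As Long

    ' Late binding: no reference to the Microsoft XML library is required
    Set Http = CreateObject("MSXML2.XMLHTTP.6.0")
    Set Xmldoc = CreateObject("MSXML2.DOMDocument.6.0")

    With Http
        .Open "GET", "https://www.klerenmakendebaby.nl/product-sitemap.xml", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        ' LoadXML returns False when the text cannot be parsed
        If Not Xmldoc.LoadXML(.responseText) Then
            MsgBox "Parse error: " & Xmldoc.parseError.reason
            Exit Sub
        End If
    End With

    ' Bind the prefix "sm" to the sitemap namespace for XPath queries
    Xmldoc.setProperty "SelectionNamespaces", _
        "xmlns:sm='http://www.sitemaps.org/schemas/sitemap/0.9'"

    ' Select each <loc> that is a direct child of a <url> element
    For Each locNode In Xmldoc.SelectNodes("//sm:url/sm:loc")
        R = R + 1
        Cells(R, 1).Value = locNode.Text
    Next locNode
End Sub
```

If the file turns out to use a different namespace URI, the string passed to SelectionNamespaces has to be changed to match it.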

Answer 1: (score: 1)

Another way to achieve this:

Sub TestXML()
    Dim Http As New XMLHTTP60
    Dim Xdoc As New DOMDocument, post As Object, R&

    With Http
        .Open "GET", "https://www.klerenmakendebaby.nl/product-sitemap.xml", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        Xdoc.LoadXML .responseText
    End With

    For Each post In Xdoc.getElementsByTagName("url")
        R = R + 1: Cells(R, 1) = post.getElementsByTagName("loc")(0).Text
    Next post
End Sub
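
If a variant still writes nothing to the sheet, printing the HTTP status and the parser state to the Immediate window may help narrow down the cause. The sketch below is a hypothetical diagnostic helper, not part of either answer. The procedure name is made up, and it only reports values without changing the worksheet.

```vba
Option Explicit

' Hypothetical helper: prints what was received and what each lookup finds.
Public Sub DiagnoseSitemap()
    Dim Http As Object, Xdoc As Object

    Set Http = CreateObject("MSXML2.XMLHTTP.6.0")
    Set Xdoc = CreateObject("MSXML2.DOMDocument.6.0")

    Http.Open "GET", "https://www.klerenmakendebaby.nl/product-sitemap.xml", False
    Http.setRequestHeader "User-Agent", "Mozilla/5.0"
    Http.send

    Debug.Print "HTTP status: "; Http.Status
    Debug.Print "responseText length: "; Len(Http.responseText)
    ' An empty string here means .responseXML was not populated
    Debug.Print "responseXML length: "; Len(Http.responseXML.XML)

    If Xdoc.LoadXML(Http.responseText) Then
        Debug.Print "Root element: "; Xdoc.DocumentElement.nodeName
        Debug.Print "Root namespace: "; Xdoc.DocumentElement.namespaceURI
        ' Compare an unprefixed XPath query with a lookup by tag name
        Debug.Print "SelectNodes(//url): "; Xdoc.SelectNodes("//url").Length
        Debug.Print "getElementsByTagName(url): "; Xdoc.getElementsByTagName("url").Length
    Else
        Debug.Print "Parse error: "; Xdoc.parseError.reason
    End If
End Sub
```

A non-empty root namespace combined with zero matches for the unprefixed query would point to a namespace mismatch, while an empty responseXML would point to the response not having been parsed as XML in the first place.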