VBA解析XML嵌套节点

时间:2018-03-16 16:43:33

标签: xml excel-vba vba excel

我不熟悉在Excel VBA中解析XML文件。来自其他论坛&我已经能够将以下VBA代码放在一起解析XML文件中的元素并将它们放在Excel工作表上。但是,在XML文件的各个节点中,文本字段被分成多个部分并分配序列号。

以下代码解析XML字段(每个记录一行)。有人可以告诉我如何检索所有文本部分而不仅仅是第一个文本部分?

其次,文本字段的各个部分并不总是按顺序排列。零件是否可以按顺序编号重新组装?

这是我的名义XML文件:

    <?xml version="1.0" encoding="UTF-8"?>
    -<Station>
    -<Action>
    <ID>A001</ID>
    <Type>AS</Type>
    <ActionDescInfo>
      <Sequence>01</Sequence>
      <Desc>Text block one.</Desc>
    </ActionDescInfo>
    </Action>
    -<Action>
    <ID>A002</ID>
    <Type>AP</Type>
    <ActionDescInfo>
      <Sequence>01</Sequence>
      <Desc>Another text block</Desc>
      <Sequence>02</Sequence>
      <Desc>Text following first one</Desc>
    </ActionDescInfo>
    </Action>
    -<Action>
    <ID>A003</ID>
    <Type>AS</Type>
    <ActionDescInfo>
      <Sequence>03</Sequence>
      <Desc>This comes third and is out of place</Desc>
      <Sequence>01</Sequence>
      <Desc>this one is first</Desc>
      <Sequence>02</Sequence>
      <Desc>This text is second. But all the same paragraph</Desc>
    </ActionDescInfo>
    </Action>
    </Station>

到目前为止,这是守则:

    Sub XMLShare()
    'Public Sub Xml_To_Excel()
    Dim osh As Worksheet
    Dim WinHttpReq As Object
    Dim xmlDoc As Object
    Dim nodeXML1 As Object
    Dim nodeXML2 As Object
    Dim nodeXML3 As Object
    Dim nodeXML4 As Object
    Dim i As Integer, oRow As Integer
    ' Clear Sheet
     Worksheets("Sheet1").Select
     Cells.Select
     Selection.Clear
    ' Variables
     oRow = 1
    ' Sets
    Set osh = ThisWorkbook.Sheets("Sheet1")
    Set xmlDoc = CreateObject("Microsoft.XMLDOM")
    xmlDoc.setProperty "SelectionLanguage", "XPath"
    xmlDoc.async = False
    xmlDoc.Load ("filename.xml")
    Set nodeXML1 = xmlDoc.getElementsByTagName("ID")
    Set nodeXML2 = xmlDoc.getElementsByTagName("Type")
    Set nodeXML3 = xmlDoc.getElementsByTagName("Sequence")
    Set nodeXML4 = xmlDoc.getElementsByTagName("Desc")
    ' Processing Loop
      For i =0 To nodeXML1.Length - 1
        oRow = oRow + 1
       osh.Range("A" & oRow) = nodeXML1(i).Text        ' ID
       osh.Range("B" & oRow) = nodeXML2(i).Text    ' Type
       osh.Range("C" & oRow) = nodeXML3(i).Text    ' Sequence
       osh.Range("D" & oRow) = nodeXML4(i).Text    ' Desc
      Next
    ' Finish
        MsgBox "Process Completed"
    End Sub

这是我期望的结果:

    ID  Type    Desc
    001 AS  Text block one.
    002 AP  Another text block that follows the first one.
    003 AS  This one is first. This text is second. But all the
                    same paragraph. This comes third and is out of place.

1 个答案:

答案 0 :(得分:0)

这可能是一个解决方案。首先,您需要将ActionDescInfo用作节点。在该节点中,您可以使用新循环来遍历所有序列和desc元素。使用数组,您可以在其中设置正确的序列:

Sub XMLShare()
    'Public Sub Xml_To_Excel()'
    Dim osh As Worksheet
    Dim WinHttpReq As Object
    Dim xmlDoc As Object
    Dim nodeXML1 As Object
    Dim nodeXML2 As Object
    Dim nodeXML3 As Object
    Dim nodeXML4 As Object
    Dim i As Integer, oRow As Integer
    Dim SequenceArr() As String
    Dim DescArr() As String
    ' Clear Sheet'
ActiveSheet.UsedRange.Clear
    ' Variables'
     oRow = 1
    ' Sets'
    Set osh = ThisWorkbook.Sheets(1)
    Set xmlDoc = CreateObject("Microsoft.XMLDOM")
    xmlDoc.SetProperty "SelectionLanguage", "XPath"
    xmlDoc.async = False
    xmlDoc.Load ("C:\Users\Alex\Desktop\test.xml")
    Set nodeXML1 = xmlDoc.getElementsByTagName("ID")
    Set nodeXML2 = xmlDoc.getElementsByTagName("Type")
    Set nodeXML3 = xmlDoc.getElementsByTagName("ActionDescInfo")
    Set nodeXML4 = xmlDoc.getElementsByTagName("Desc")
    ' Processing Loop'
    For i = 0 To nodeXML1.Length - 1
        x = 0
        oRow = oRow + 1
        osh.Range("A" & oRow) = nodeXML1(i).Text        ' ID'
        osh.Range("B" & oRow) = nodeXML2(i).Text    ' Type'
        'Loop through the childnodes of ActionDescInfo'
        For j = 0 To nodeXML3(i).ChildNodes.Length - 1
            'If the childnode is sequence, find out the order and put the values in an array'
            If nodeXML3(i).ChildNodes(j).BaseName = "Sequence" Then
                'if x = 0, the cell is empty and just fill the first element of the array'
                'Else, loop through each element in the array to check if the number is larger or smaller'
                If x > 0 Then
                    For k = 0 To UBound(SequenceArr)
                        'If the value in the array is larger, this is the element where the new sequence number needs to be filled in'
                        If SequenceArr(k) > nodeXML3(i).ChildNodes(j).Text Then
                            Exit For
                        End If
                    Next k
                Else
                    k = 0
                End If
                'Move all elements one up and put the sequence number on the right place'
                ReDim Preserve SequenceArr(x)
                For l = k To UBound(SequenceArr) - 1
                    SequenceArr(l + 1) = SequenceArr(l)
                Next l
                SequenceArr(k) = nodeXML3(i).ChildNodes(j).Text
                x = x + 1
            'If the childnode is the description, do the same as for the sequence: Put the value in the array on the same place as the previous sequence number'
            ElseIf nodeXML3(i).ChildNodes(j).BaseName = "Desc" Then
                ReDim Preserve DescArr(x)
                For l = k To UBound(DescArr) - 1
                    DescArr(l + 1) = DescArr(l)
                Next l
                DescArr(k) = nodeXML3(i).ChildNodes(j).Text
            End If
        Next j
        'Put the array values in the right cells'
        For j = 0 To UBound(SequenceArr)
            osh.Range("C" & oRow).Value = osh.Range("C" & oRow).Value & " " & SequenceArr(j)
            osh.Range("D" & oRow).Value = osh.Range("D" & oRow).Value & " " & DescArr(j)
        Next j
        'Erase the arrays again for the next iteration'
        Erase SequenceArr
        Erase DescArr
    Next i
' Finish'
MsgBox "Process Completed"
End Sub

我不知道这是否是最快的方式,但我检查了它的确有效。我也希望你能理解我的意见。英语不是我的第一语言。