使用VBA从XML获取属性名称

时间:2018-08-17 03:01:21

标签: xml excel-vba xmldom

我需要使用VBA从XML中获得不同的属性名称。

这是我的代码。

 sub test() 
 Dim XMLFile As Object
Dim XMLFileName As String
Set XMLFile = CreateObject("Microsoft.XMLDOM")

XMLFileName = "C:\Users\Input.xml"
XMLFile.async = False
XMLFile.Load (XMLFileName)
XMLFile.validateOnParse = False

Dim mainnode As Object
Dim node As Object

Set mainnode = XMLFile.SelectNodes("//Elements")

For Each node In mainnode
    For Each child In node.ChildNodes
    Debug.Print child.BaseName
    Dim kiddo As Object
    For Each kiddo In child.ChildNodes
        Debug.Print kiddo.BaseName
    Next kiddo
Next child
Next node
End sub

这是示例XML。我需要从XML获取属性名称num

<Elements>
<Details>
    <Name>ABC</Name>
    <Address>123ABC</Address>
    <College>
        <collname>safasf</collname>
         <collnumber/>
    </College>
</Details>  
<Dept num="123">
    <Deptname>IT</Deptname>
    <ID>A123</ID>
 </Dept>            
</Elements>

预期结果:

 Elements
 Details
 Name 
 Address
 College
 collname
 collnumber
 Dept
 num
 Deptname
 ID

以上代码的实际结果:

 Elements
 Details
 Name 
 Address
 College
 collname
 Dept
 Deptname
 ID

我的代码未获取“ num”属性和<collnumber/>标记。有人可以让我知道如何使用VBA从XML中获取属性名称和标记名称

1 个答案:

答案 0 :(得分:3)

通过递归函数调用显示包含属性的XML结构

我的示例代码演示了一种方法

  • [1]使用XMLDOM方法和
  • 将整个XML结构分配给2维数组
  • [2](可选)将其写回到工作表。

放大提示:

我添加了这些►结构化提示,以提供比仅显示代码更多的帮助,因为我说过,这些要点中的很多也导致其他用户重复提问:

  • 尝试列出XML结构,但随着节点元素的层次深度增加(类型常量1 NODE_ELEMENT),您会失去良好的视野,因此,我紧急建议使用►递归调用
  • 此外,您可能还没有考虑节点文本(类型常量3 NODE_TEXT)的特殊构造,它是名称为第一个孩子父元素-cf主功能listChildNodes中A.和B.部分。 您遍历子节点的循环不会区分所提到的类型。只需研究引用函数中的注释以了解详细信息即可。
  • 我想您的XML文件以所需的处理指令开头,例如<?xml version="1.0" encoding="utf-8"?>,以便可以将其实际识别为XML文件。
  • 调用过程DisplayXML()使用 late绑定,而不是类似于您的帖子的早期绑定对MS XML的引用,但使用推荐的 MSXML2版本6.0 。它通过其DocumentElement <Elements> BTW是单个节点元素)和引用预定义2维数组v的第二个参数来调用main函数。 / li>
  • 版本::如果您将XMLFILE对象设置为使用Set XDoc = CreateObject("MSXML2.DOMDocument")进行存储,则通常会获得较旧的默认版本(3.0),因此在大多数情况下,最好使用改为显式使用Set XDoc = CreateObject("MSXML2.DOMDocument.6.0")(自动包括XPath)。
  • 如果您不使用 Load 函数取回True(文件成功加载)或False(加载错误),则返回不需要将文件名放在方括号()中。
  • 搜索字符串中的XPath运算符//将返回 any 级别的所有匹配项(参见OP中的XMLFile.SelectNodes("//Elements"))。
  • 还考虑使用 XSLT ,这是一种专用语言,旨在将XML文件转换为各种最终用途格式。

呼叫过程DisplayXML

提示:只需在调用过程中用估计的项数(例如1000)来估计数组的行数就足够了,因为主函数执行了ReDim (包括双重换位)(如果需要)。尽管如此,我还是从一开始就通过XPath / XMLDOM表达式XMLFile.SelectNodes("//*").Length在整个文件中添加了所有项目。

Option Explicit          ' declaration head of your code module

Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM")   ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")

XMLFileName = "C:\Users\Input.xml"                             ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML

If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
  Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
  listChildNodes XMLFile.DocumentElement, v                 ' call helper function

' [2] write results to sheet "Dump"                         ' change to your sheet name
  With ThisWorkbook.Worksheets("Dump")
       .Range("A:B") = ""                                   ' clear result range
       .Range("A1:B1") = Split("XML Tag,Node Value", ",")   ' titles
       .Range("A2").Resize(UBound(v), UBound(v, 2)) = v     ' get  2-dim info array
  End With
Else
       MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub

结构化结果显示在工作表中

提示:如果您不希望级别缩进或枚举级别层次结构,则可以轻松地调整下面的主要功能listChildNodes()

+----+---------------------+-----------------+
|    |         A           |       B         |
+----+---------------------+-----------------+
|1   | XML Tag             | Node Value      |
+----+---------------------+-----------------+
|2   | 0 Elements          |                 |
+----+---------------------+-----------------+
|3   |   1 Details         |                 |
+----+---------------------+-----------------+
|4   |     2 Name          | ABC             |
+----+---------------------+-----------------+
|5   |     2 Address       | 123ABC          |
+----+---------------------+-----------------+
|6   |     2 College       |                 |
+----+---------------------+-----------------+
|7   |       3 collname    | safasf          |
+----+---------------------+-----------------+
|8   |       3 collnumber  |                 |
+----+---------------------+-----------------+
|9   |   1 Dept[@num="123"]|                 |
+----+---------------------+-----------------+
|10  |     2 Deptname      | IT              |
+----+---------------------+-----------------+
|11  |     2 ID            | A123            |
+----+---------------------+-----------------+

还可以引用精确的节点元素,例如通过

listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1       ' starting from item no 1 and Level no 1

这将单独列出指示的节点集:

+----+---------------------+-----------------+
|    |         A           |       B         |
+----+---------------------+-----------------+
|1   | XML Tag             | Node Value      |
+----+---------------------+-----------------+
|2   |   1 Dept[@num="123"]|                 |
+----+---------------------+-----------------+
|3   |     2 Deptname      | IT              |
+----+---------------------+-----------------+
|4   |     2 ID            | A123            |
+----+---------------------+-----------------+

递归主要功能listChildNodes()

遍历子节点集合,此函数反复(“递归”)调用自身(即当前节点对象),并将整个XML结构分配给给定的2维数组(第2个参数)。此外,它还允许使用并指示层次结构级别。 请注意,此示例中的数组必须基于1。

如果项目计数器Edit 20/8 2018超出当前数组的上限(i,即在其第一维=,则

UBound(v)包括自动增加数组大小项目计数)。 技术说明:由于ReDim在较小(此处为第1维)上是不可能的,因此需要将“行”(第1维)更改为“列”(第2维)的中间换位。 / em>

Function listChildNodes(oCurrNode As Object, _
                        ByRef v As Variant, _
                        Optional ByRef i As Long = 1, _
                        Optional iLvl As Integer = 0 _
                        ) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author:  T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
'       (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
  If oCurrNode Is Nothing Then Exit Function
  If i < 1 Then i = 1                                       ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed 
  If i >= UBound(v) Then                                    ' change array size if needed
     Dim tmp As Variant
     tmp = Application.Transpose(v)                         ' change rows to columns
     ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000)      ' increase row numbers
     v = Application.Transpose(tmp)                         ' transpose back
     Erase tmp
  End If
  Const NAMEColumn& = 1, VALUEColumn& = 2                   ' constants for column 1 and 2
' Declare variables
  Dim oChildNode As Object                                  ' late bound node object
  Dim bDisplay   As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then                                 ' 3 ... NODE_TEXT
  ' display pure text content (NODE_TEXT) of parent elements
    v(i, VALUEColumn) = oCurrNode.Text                           ' nodeValue of text node
  ' return
    listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then                                ' 1 ... NODE_ELEMENT
   ' --------------------------------------------------------------
   ' B.1 NODE_ELEMENT WITHOUT text node immediately below,
   '     a) e.g. <Details> followed by node element <NAME>,
   '        (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
   '     b) e.g. <College> node element without any child node
   '     Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
   '           (see section A. getting the FirstChild of a NODE_ELEMENT)
   ' --------------------------------------------------------------
   ' a) display parent elements of other element nodes
     If oCurrNode.HasChildNodes Then
         If Not oCurrNode.FirstChild.NodeType = 3 Then             ' <>3 ... not a NODE_TEXT
            bDisplay = True
         End If
   ' b) always display empty node elements
     Else                                                           ' empty NODE_ELEMENT
            bDisplay = True
     End If
     If bDisplay Then
            v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                               iLvl & " " & _
                               oCurrNode.nodename & getAtts(oCurrNode)
            i = i + 1
     End If

   ' --------------------------------------------------------------
   ' B.2 check child nodes
   ' --------------------------------------------------------------
     For Each oChildNode In oCurrNode.ChildNodes
      ' ~~~~~~~~~~~~~~~~~
      ' recursive call <<
      ' ~~~~~~~~~~~~~~~~~
        bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)

        If bDisplay Then
            v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                               iLvl & " " & _
                               oCurrNode.nodename & getAtts(oCurrNode)
            i = i + 1
        End If
     Next oChildNode
   ' return
     listChildNodes = False

Else    ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
     If oCurrNode.NodeType = 8 Then   ' 8 ... NODE_COMMENT
        v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
        i = i + 1
     End If
   ' return
     listChildNodes = False
End If

End Function

'助手功能getAtts()

由上述函数调用的此辅助函数返回一个字符串,该字符串枚举给定节点的所有属性名称和值,类似于XPath表示法;该代码可以轻松地适应您的需求。

Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note:    called by above function listChildNodes()
' Author:  T.M.
  Dim sAtts$, ii&
  If node.Attributes.Length > 0 Then
      ii = 0: sAtts = ""
      For ii = 0 To node.Attributes.Length - 1
        sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
      Next ii
  End If
' return
  getAtts = sAtts
End Function