XML file output shows only the byte order mark

Time: 2014-05-23 19:24:15

Tags: xml vba dom byte-order-mark

I have an XML file that I am trying to parse. Its contents are exactly the XML below:

<Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Reference>{REFERENCE-HERE}</Reference>
  <FillerTags>Filler</FillerTags>
  <entity>
    <entityName>ABC</entityName>
    <entityId>012345</entityId>
  </entity>
  <Items>
     <Item>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </Item>
     <AnotherItem> 
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </AnotherItem>
   </Items>
</Results>

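I have been trying to get the code below (originally from my question here) to work. Several other users (including the code's author) have been able to use it successfully, but when I run it the output file contains only ÿþ<. I made sure to encode the file as ANSI and saved the Notepad file directly as .xml, but the output is still just the byte order mark.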

Option Explicit
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ParseResults()
'Requires reference to Microsoft XML, v6.0
'Requires reference to Microsoft Scripting Runtime
Dim xmlFilePath$, newFilePath$
Dim DOM As MSXML2.DOMDocument
Dim entity As IXMLDOMNode
Dim fso As Scripting.FileSystemObject

'# Define the file you are going to load as XML
xmlFilePath = "PATH"

'# Define an output path for where to put the modified XML
newFilePath = "NEWPATH"

'# Create our DOM object
Set DOM = CreateObject("MSXML2.DOMDocument")

'# Load the XML file
DOM.Load xmlFilePath

'# Wait until the Document has loaded
Do
    Sleep 250
Loop Until DOM.readyState = 4

'# Get the entityID node
Set entity = DOM.DocumentElement.getElementsByTagName("entityId")(0)

'# Call a subroutine to append the entity to "Item" tags
AppendEntity DOM, "Item", entity
'# Call a subroutine to append the entity to "AnotherItem" tags
AppendEntity DOM, "AnotherItem", entity

'## Create an FSO to write the new file
Set fso = CreateObject("Scripting.FileSystemObject")

'## Attempt to write the new/modified XML to file
On Error Resume Next
fso.CreateTextFile(newFilePath, True, True).Write DOM.XML
If Err Then
    '## Print the new XML in the Immediate window
    Debug.Print DOM.XML
    MsgBox "Unable to write to " & newFilePath & " please review XML in the Immediate window in VBE.", vbInformation
    Err.Clear
End If
On Error GoTo 0

'Cleanup
Set DOM = Nothing
Set fso = Nothing
Set entity = Nothing

End Sub

Sub AppendEntity(DOM As Object, tagName As String, copyNode As Object)
'## This subroutine will append child node to ALL XML Nodes matching specific string tag.
Dim itemColl As IXMLDOMNodeList
Dim itm As IXMLDOMNode

'# Get a collection of all elements matching the tagName
Set itemColl = DOM.DocumentElement.getElementsByTagName(tagName)

'# Iterate over the collection, appending the copied node
For Each itm In itemColl
    If itm.HasChildNodes Then
        '# Insert this node before the first child node of Item
        itm.InsertBefore copyNode.CloneNode(True), itm.FirstChild
    Else
        '# Append this node to the Item
        itm.appendChild copyNode.CloneNode(True)
    End If
Next

Set itm = Nothing
Set itemColl = Nothing

End Sub

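To be clear, the code does not raise any errors: it creates a new file, but the file it creates is incorrect. The correct output (which some others who tried this code did get) should be: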

<Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Reference>{REFERENCE-HERE}</Reference>
  <FillerTags>Filler</FillerTags>
  <entity>
    <entityName>ABC</entityName>
    <entityId>012345</entityId>
  </entity>
  <Items>
    <Item>
      <entityId>012345</entityId>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </Item>
     <AnotherItem> 
       <entityId>012345</entityId>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </AnotherItem>
   </Items>

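That is, the code inserts the entityId tag as a child node of each Item and AnotherItem element. Once I get it working on this sample, I want to apply the code to the real XML document, which is roughly the same but contains multiple entities. For example: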

   <Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Reference>{REFERENCE-HERE}</Reference>
      <FillerTags>Filler</FillerTags>
      <entity>
        <entityName>ABC</entityName>
        <entityId>012345</entityId>
      </entity>
      <Items>
        <Item>
          <entityId>012345</entityId>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </Item>
         <AnotherItem> 
           <entityId>012345</entityId>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </AnotherItem>
       </Items>
     <entity>
      <entityName>DEF</entityName>
        <entityId>678910</entityId>
      </entity>
      <Items>
        <Item>
          <entityId>678910</entityId>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </Item>
         <AnotherItem> 
           <entityId>678910</entityId>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </AnotherItem>
       </Items>   

Any help in resolving this would be greatly appreciated.

Update:

The code above now works after changing the line fso.CreateTextFile(newFilePath, True, True).Write DOM.XML to fso.CreateTextFile(newFilePath, True, False).Write DOM.XML
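As a side note, the FileSystemObject step can be avoided altogether. The sketch below is not part of the original code: it assumes the document loads successfully and uses the DOM object's own Save method, which writes the file using the document's own encoding. The procedure name SaveWithDom is hypothetical, and "PATH" and "NEWPATH" are the same placeholders used in the question.

```vba
Option Explicit

'Hypothetical alternative: let MSXML write the file itself.
Sub SaveWithDom()
    Dim DOM As Object
    Set DOM = CreateObject("MSXML2.DOMDocument.6.0")

    'Load synchronously so the document is complete when Load returns
    DOM.async = False

    If DOM.Load("PATH") Then
        '... modify the document here ...
        'Save serializes the document to the given path
        DOM.Save "NEWPATH"
    Else
        Debug.Print "Load failed: " & DOM.parseError.reason
    End If
End Sub
```

With this approach there is no Unicode/ASCII flag to choose, which removes one place where the encoding can go wrong.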

I am now trying to run this on a larger XML data set, but I get an error on the line Set entity = DOM.DocumentElement.getElementsByTagName("entityId")(0)

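I hit this error several times on the sample file and realized I had forgotten to set the correct directory. This time the directory is correct, but the error still occurs.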
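The post does not say which error is raised, so the following is only a diagnostic sketch. One possible cause of a failure on that line is that DOM.Load did not succeed (wrong path or malformed XML), in which case DOM.DocumentElement is Nothing and any call on it fails. Checking the return value of Load and the parseError object can confirm or rule that out. The procedure name LoadAndCheck is hypothetical and "PATH" is the placeholder from the question.

```vba
Option Explicit

'Hypothetical diagnostic: report why a load failed before touching the nodes.
Sub LoadAndCheck()
    Dim DOM As Object
    Dim nodes As Object

    Set DOM = CreateObject("MSXML2.DOMDocument.6.0")
    DOM.async = False   'synchronous load, so no readyState polling is needed

    If Not DOM.Load("PATH") Then
        'parseError describes what went wrong and where
        Debug.Print "Load failed: " & DOM.parseError.reason & _
                    " (line " & DOM.parseError.Line & ")"
        Exit Sub
    End If

    Set nodes = DOM.DocumentElement.getElementsByTagName("entityId")
    If nodes.Length = 0 Then
        Debug.Print "No entityId elements found; check the tag name and its case."
        Exit Sub
    End If

    Debug.Print "First entityId: " & nodes.Item(0).Text
End Sub
```

If the load succeeds and the element is found, the problem lies elsewhere, for example in a tag name that differs between the sample and the real file.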

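Update 2: The code that gives me this error has been modified as follows. If I'm not mistaken, all I did was rename things, but I could be wrong.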

Option Explicit
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ParseResults()
'Requires reference to Microsoft XML, v6.0
'Requires reference to Microsoft Scripting Runtime
Dim xmlFilePath$, newFilePath$
Dim DOM As MSXML2.DOMDocument
Dim Customer As IXMLDOMNode
Dim fso As Scripting.FileSystemObject

'# Define the file you are going to load as XML
xmlFilePath = "C:\FAKEPATH\Final_Test.xml"

'# Define an output path for where to put the modified XML
newFilePath = "C:\FAKEPATH\Final_Test1.xml"

'# Create our DOM object
Set DOM = CreateObject("MSXML2.DOMDocument.6.0")

'# Load the XML file
DOM.Load xmlFilePath

'# Wait until the Document has loaded
Do
    Sleep 250
Loop Until DOM.readyState = 4

'# Get the entityID node
Set Customer = DOM.DocumentElement.getElementsByTagName("CustomerId")(0)

'# Call a subroutine to append the entity to "Item" tags
AppendCustomer DOM, "Transaction", Customer

'## Create an FSO to write the new file
Set fso = CreateObject("Scripting.FileSystemObject")

'## Attempt to write the new/modified XML to file
On Error Resume Next
'MsgBox DOM.XML
fso.CreateTextFile(newFilePath, True, False).Write DOM.XML
If Err Then
    '## Print the new XML in the Immediate window
    Debug.Print DOM.XML
    MsgBox "Unable to write to " & newFilePath & " please review XML in the Immediate window in VBE.", vbInformation
    Err.Clear
End If
On Error GoTo 0

'Cleanup
Set DOM = Nothing
Set fso = Nothing
Set Customer = Nothing

End Sub

Sub AppendCustomer(DOM As Object, Transaction As String, copyNode As Object)
'## This subroutine will append child node to ALL XML Nodes matching specific string tag.
Dim itemColl As IXMLDOMNodeList
Dim itm As IXMLDOMNode

'# Get a collection of all elements matching the tagName
Set itemColl = DOM.DocumentElement.getElementsByTagName(Transaction)

'# Iterate over the collection, appending the copied node
For Each itm In itemColl
    If itm.HasChildNodes Then
        '# Insert this node before the first child node of Item
        itm.InsertBefore copyNode.CloneNode(True), itm.FirstChild
    Else
        '# Append this node to the Item
        itm.appendChild copyNode.CloneNode(True)
    End If
Next

Set itm = Nothing
Set itemColl = Nothing

End Sub

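Update 3: Everything now works perfectly. The only remaining issue is the actual procedure the code above is meant to perform. Because there are multiple entities, and each set of items belongs to one entity, the code needs to find an entityId and apply it to all items that appear before the next occurrence of an entityId tag. After that, the process repeats.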

1 Answer:

Answer 0: (score: 3)

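I'm initially posting this as an answer so I can show my code clearly. I will delete it if this also fails. Try this syntax, which uses an alternative method of writing the file. Notepad++ tells me the result is ANSI: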

'## Create an FSO to write the new file'
Set fso = CreateObject("Scripting.FileSystemObject")


Dim FF As Integer
FF = FreeFile
'## Attempt to write the new/modified XML to file'
fso.CreateTextFile newFilePath
Open newFilePath For Output As FF
Print #FF, dom.XML
Close #FF

Alternatively

(Again, just covering the bases; I will update or delete this if needed.)

Try:

fso.CreateTextFile(newFilePath, True, False).Write DOM.XML

The difference is that the third argument of the CreateTextFile method specifies whether the file is created as Unicode (True) or ASCII (False).

Notepad++ confirms that this method produces an ANSI file, whereas if I pass True to create a Unicode file, I get a UCS-2 Little Endian file.

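Personally, I don't notice any difference between ASCII and Unicode: I can open both files in Notepad or Notepad++ and they look the same to me. But since this looks like a character encoding problem, it's worth a try. I'm suggesting it only as the first (and simplest) option to implement; there are other options to explore if needed.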
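To see the difference at the byte level rather than in an editor, the two settings can be compared directly. The sketch below is an illustration, not part of the original answer: the helper FirstBytesHex, the procedure CompareTextFileEncodings, and the scratch file names in the TEMP folder are all hypothetical. It writes the same short string with both values of the third argument and prints the first bytes of each file in hex. The characters ÿþ from the question are the bytes FF FE when viewed as Windows-1252, which is the UTF-16 little-endian byte order mark.

```vba
Option Explicit

'Hypothetical helper: returns the first bytes of a file as a hex string.
Function FirstBytesHex(ByVal filePath As String, Optional ByVal count As Long = 4) As String
    Dim ff As Integer, buf() As Byte
    Dim i As Long, n As Long, result As String

    ff = FreeFile
    Open filePath For Binary Access Read As #ff
    n = LOF(ff)
    If n > count Then n = count
    If n > 0 Then
        ReDim buf(0 To n - 1)
        Get #ff, 1, buf                 'reads exactly n bytes into the array
        For i = 0 To n - 1
            result = result & Right$("0" & Hex$(buf(i)), 2) & " "
        Next i
    End If
    Close #ff
    FirstBytesHex = Trim$(result)
End Function

Sub CompareTextFileEncodings()
    Dim fso As Object, ts As Object
    Dim ansiPath As String, unicodePath As String

    Set fso = CreateObject("Scripting.FileSystemObject")
    ansiPath = Environ$("TEMP") & "\bom_test_ansi.xml"        'assumed scratch files
    unicodePath = Environ$("TEMP") & "\bom_test_unicode.xml"

    'Third argument False: ASCII/ANSI output
    Set ts = fso.CreateTextFile(ansiPath, True, False)
    ts.Write "<a/>"
    ts.Close

    'Third argument True: Unicode output
    Set ts = fso.CreateTextFile(unicodePath, True, True)
    ts.Write "<a/>"
    ts.Close

    Debug.Print "ASCII:   " & FirstBytesHex(ansiPath)
    Debug.Print "Unicode: " & FirstBytesHex(unicodePath)
End Sub
```

If the Unicode file starts with FF FE and the ASCII file starts directly with 3C (the "<" character), that matches what Notepad++ reports for the two files.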

Update #3

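To address the nested nature of the file: essentially you have sibling XML elements ("entity" and "Items"), and you need to modify the "Items" node (and its child nodes) to include "entityId" (which is a child node of "entity"). I'm explaining this relationship in the hope that the modification makes sense!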

'##### NO LONGER USED:'
'# Get the entityID node'
'Set Customer = DOM.DocumentElement.getElementsByTagName("CustomerId")(0)'

Dim itm As IXMLDOMNode

'# Instead of getting the first item like we did before, we can iterate the collection'
' of nodes with the entityID tag like so:'
For Each Customer In DOM.DocumentElement.getElementsByTagName("entityId")
   'Since Items is the nextSibling of Entity, and Entity is the parent of entityId,'
   ' we can iterate the collection of its childNodes like this:'
    For Each itm In Customer.ParentNode.NextSibling.ChildNodes
        If itm.HasChildNodes Then
            '# Insert this node before the first child node of Item'
            itm.InsertBefore Customer.CloneNode(True), itm.FirstChild
        Else
            '# Append this node to the Item'
            itm.appendChild Customer.CloneNode(True)
        End If
    Next
Next

'##### This function call is no longer needed
'AppendCustomer DOM, "Transaction", Customer'

This produces XML like the following:

<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <Results>
        <Reference>{REFERENCE-HERE}</Reference>
        <FillerTags>Filler</FillerTags>
        <entity>
            <entityName>ABC</entityName>
            <entityId>012345</entityId>
        </entity>
        <Items>
            <Item>
                <entityId>012345</entityId>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
            </Item>
            <AnotherItem>
                <entityId>012345</entityId>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
            </AnotherItem>
        </Items>
    </Results>
    <Results>
        <Reference>{REFERENCE-HERE}</Reference>
        <FillerTags>Filler</FillerTags>
        <entity>
            <entityName>DEF</entityName>
            <entityId>54321</entityId>
        </entity>
        <Items>
            <Item>
                <entityId>54321</entityId>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
            </Item>
            <AnotherItem>
                <entityId>54321</entityId>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
            </AnotherItem>
        </Items>
    </Results>
</root>
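The same sibling relationship can also be expressed with XPath, which may be easier to adapt if the real file is laid out slightly differently. This is a sketch under stated assumptions, not the answer's method: the procedure name AppendEntityIdPerEntity is hypothetical, and it assumes each "entity" element is followed by its own "Items" sibling, as in the sample. Selecting the "entity" elements instead of the "entityId" elements means the copies inserted into the items are never part of the set being iterated.

```vba
Option Explicit

'Hypothetical variant: copy each entity's entityId into the children of
'the next Items sibling, using XPath to navigate.
Sub AppendEntityIdPerEntity(DOM As Object)
    Dim entityNode As Object, idNode As Object
    Dim itemsNode As Object, child As Object

    'XPath is the default in MSXML 6.0; set explicitly for older versions
    DOM.setProperty "SelectionLanguage", "XPath"

    For Each entityNode In DOM.DocumentElement.selectNodes("//entity")
        Set idNode = entityNode.selectSingleNode("entityId")
        Set itemsNode = entityNode.selectSingleNode("following-sibling::Items[1]")

        If Not idNode Is Nothing And Not itemsNode Is Nothing Then
            '"*" selects element children only (Item, AnotherItem, ...)
            For Each child In itemsNode.selectNodes("*")
                If child.hasChildNodes Then
                    child.insertBefore idNode.cloneNode(True), child.firstChild
                Else
                    child.appendChild idNode.cloneNode(True)
                End If
            Next child
        End If
    Next entityNode
End Sub
```

It would be called in place of the nested loop, for example AppendEntityIdPerEntity DOM, after the document has loaded and before it is written out.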