如何使用VBA将大型XML文件拆分为较小的文件

时间:2017-08-22 06:23:14

标签: xml vba excel-vba xml-parsing xmldom

您好以下是我正在使用的XML文件格式示例,我想将文件拆分为较小的部分。问题是有2个子节点" Header"和"记录"并且我无法使用" Header"保存XML文件。 &安培; "页脚"节点

<PortfolioBulk2_0_RES>
  <Header>
    <ProviderCode>123</ProviderCode>
    <FileID>20170817</FileID>
    <NumInputSubjects>23123</NumInputSubjects>
    <ChunkID>1</ChunkID>
    <RecordMin>1</RecordMin>
    <RecordMax>23123</RecordMax>
  </Header>
  <Record>
    <RecordId>1</RecordId>
    <ProviderSubjectNo>123456789</ProviderSubjectNo>
    <PackageLabel>GOLD</PackageLabel>
    <Error>
      <No>811</No>
      <Description>Subject not found</Description>
    </Error>
  </Record>
  <Record>
    <RecordId>2</RecordId>
    <ProviderSubjectNo>654789321</ProviderSubjectNo>
    <PackageLabel>GOLD</PackageLabel>
    <Error>
      <No>811</No>
      <Description>Subject not found</Description>
    </Error>
  </Record>
  <Footer>
    <StartDateTime>2008201712:18:06</StartDateTime>
    <StopDateTime>2008201717:19:00</StopDateTime>
    <NoIndividualsOK>13185</NoIndividualsOK>
    <NoCompaniesOK>546</NoCompaniesOK>
    <NoIndividualsError>282</NoIndividualsError>
    <NoCompaniesError>20</NoCompaniesError>
    <NoUnknownsError>9090</NoUnknownsError>
  </Footer>
</PortfolioBulk2_0_RES>

这是我在VBA中的代码。我希望是否有人可以建议如何循环以XML格式保存的每个文件中的页眉和页脚节点。感谢

Sub SPLIT()
numFiles = 2  'number of output files

Set src = CreateObject("Msxml2.DOMDocument.6.0")
src.async = False
src.Load "C:\Users\104704\Documents\Office 1\04_Raw Data\09_AECB Bulk\CLI working\01.xml"

Set Nodes = src.SelectNodes("//Record")
numnodes = Nodes.Length \ numFiles  'number of nodes per output file
'MsgBox (numnodes)

Set XML = Nothing
For i = 0 To Nodes.Length - 1
  'create a new XML object on the first iteration and every time numNodes
  'nodes have been added to the current object
  If i Mod numnodes = 0 Then
    If Not XML Is Nothing Then
      'if we already have an XML object: save it to a file
      Set prolog = XML.createProcessingInstruction("xml", "version='1.0'")
      XML.InsertBefore prolog, XML.ChildNodes(0)
      XML.Save "C:\Users\104704\Documents\Office 1\04_Raw Data\09_AECB Bulk\CLI working\" & (i \ numnodes - 1) & ".xml"
    End If
  Set XML = CreateObject("Msxml2.DOMDocument.6.0")
  Set root = XML.createElement("PortfolioBulk2_0_RES")

    XML.appendChild root

  End If

  root.appendChild Nodes.Item(i)
Next
'save unsaved XML object
If Not XML Is Nothing Then
  Set prolog = XML.createProcessingInstruction("xml", "version='1.0'")
  XML.InsertBefore prolog, XML.ChildNodes(0)
  XML.Save "C:\Users\104704\Documents\Office 1\04_Raw Data\09_AECB Bulk\CLI working\" & (i \ numnodes - 1) & ".xml"
End If
End Sub

2 个答案:

答案 0 :(得分:0)

@ O.Cheema,据我所知,您希望将现有的记录节点导出到新的xml中。下面的代码是纯VBA并使用xmldom:

>>> re.findall(r'\b[^\d\W]\w*\b', 'a+b*c=d', re.U) 
['a', 'b', 'c', 'd']

答案 1 :(得分:-2)

尝试使用xml linq

Imports System.Xml
Imports System.Xml.Linq
Imports System.IO
Module Module1
    Const INPUT_FILE As String = "c:\temp\test.xml"
    Const OUTPUT_FILE As String = "c:\temp\test1.xml"
    Sub Main()
        Dim doc As XDocument = XDocument.Load(INPUT_FILE)
        Dim headersAndFooters As List(Of XElement) = doc.Descendants().Where(Function(x) (x.Name.LocalName = "Header") Or (x.Name.LocalName = "Footer")).ToList()

        Dim writer As New StreamWriter(OUTPUT_FILE)

        For Each headerAndFooter In headersAndFooters
            writer.WriteLine(headerAndFooter.ToString())
        Next
        writer.Flush()
        writer.Close()
    End Sub

End Module