Modify/fix an XML file or its tags using VBScript?

Asked: 2015-03-17 07:54:43

Tags: xml vbscript xsd xml-parsing

I have an XML file in the following format:

<payments/>
    <payment>
        <payment_type>
        </payment_type>
        <dataforpay>
        </dataforpay>
        <allocations/>
            <allocation>
                <id>
                </id>
                <notfind>
                </notfind>
                <amount>
                </amount>
            </allocation>
    </payment>

Because this format looks rather unusual, I would like to use VBScript to convert it to the format below. Any suggestions?

<payments>
    <payment>
        <payment_type>
        </payment_type>
        <dataforpay>
        </dataforpay>
        <allocations>
            <allocation>
                <id>
                </id>
                <notfind>
                </notfind>
                <amount>
                </amount>
            </allocation>
        </allocations>
    </payment>
</payments>

2 answers:

Answer 0 (score: 1)

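Try the code below. It finds each self-closing tag whose name is in the plural and that is followed by one or more open/close tag pairs with the same name in the singular. It replaces each such self-closing tag with an open/close pair and moves all the matching tags that follow into it.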

sCont = ReadTextFile("C:\Test\src.xml", -2)

With CreateObject("VBScript.RegExp")
    .Global = False
    .MultiLine = True
    .IgnoreCase = True
    Do
        ' pattern to match a self-closing tag with a plural name, followed by one or more open-close tag pairs with the same name in the singular
        .Pattern = "^[\r\n]*(\s*)<(\w+)s(\s+[^>]*)*/\s*>(\s*[\r\n]+)(\s*<\2(?:\s+[^>]*)*>[\s\S]*?</\2>)"
        If Not .Test(sCont) Then Exit Do
        ' replace the matched self-closing tag with an open-close pair, and move the first matched following tag into it
        sCont = .Replace(sCont, "$1<$2s$3>$4$5$4$1</$2s>")
        ' pattern to match an open-close pair with a plural name that already contains one or more singular-named pairs, and is followed by another open-close pair with the same singular name
        .Pattern = "((?:^\s*)<(\w+)s(?:\s+[^>]*)*>\s*[\r\n]+\s*<\2(?:\s+[^>]*)*>[\s\S]*?</\2>\s*[\r\n]+)(^\s*</\2s>\s*[\r\n]+)(\s*<\2(?:\s+[^>]*)*>[\s\S]*?</\2>\s*[\r\n]+)"
        Do While .Test(sCont)
            ' move the matched singular-named tag into the plural-named pair
            sCont = .Replace(sCont, "$1$4$3")
        Loop
    Loop
End With

WriteTextFile sCont, "C:\Test\dst.xml", -2

Function ReadTextFile(sPath, iFormat)
    With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 1, False, iFormat)
        ReadTextFile = ""
        If Not .AtEndOfStream Then ReadTextFile = .ReadAll
        .Close
    End With
End Function

Sub WriteTextFile(sCont, sPath, iFormat)
    With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 2, True, iFormat)
        .Write(sCont)
        .Close
    End With
End Sub

See the well-known disclaimer about parsing XHTML with RegExp.
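Because a regex-based rewrite can leave unbalanced tags without raising any error, it may be worth checking that the result is well-formed before using it. The following is a minimal sketch, assuming the output path `C:\Test\dst.xml` from the code above and that the version-independent `MSXML2.DOMDocument` ProgID is available on the machine; it is a sanity check, not part of the original answer.

```vbscript
Option Explicit

' Assumption: this is the file written by WriteTextFile in the script above.
Const DST_PATH = "C:\Test\dst.xml"

Dim doc: Set doc = CreateObject("MSXML2.DOMDocument")
doc.async = False   ' load synchronously so the result is available immediately

If doc.load(DST_PATH) Then
    ' The parser accepted the file, so it has exactly one root and balanced tags.
    WScript.Echo "Well-formed. Root element: " & doc.documentElement.nodeName
Else
    ' parseError describes the first problem the parser encountered.
    With doc.parseError
        WScript.Echo "Parse error at line " & .line & ", column " & .linepos & ": " & .reason
    End With
End If
```

Run it with `cscript` after the conversion script. A reported parse error suggests the patterns did not cover some construct in the input.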

Answer 1 (score: 0)

The original format is not valid XML because it has two root nodes, so I load the file as plain text with the standard Scripting.FileSystemObject to avoid XML parse errors.

Option Explicit

dim fso: set fso = CreateObject("Scripting.FileSystemObject")
dim stream: set stream = fso.OpenTextFile("input.xml")
dim xml: xml = stream.ReadAll()
stream.close

To manipulate the XML, I load it into an MSXML2.DomDocument wrapped in a dummy root node so that it is well-formed.

dim xmldoc: set xmldoc = CreateObject("MSXML2.DomDocument")
xmldoc.setProperty "SelectionLanguage", "XPath"
xmldoc.async = false
if not xmldoc.loadXML("<root>" & xml & "</root>") then
    WScript.Echo xmldoc.parseError.reason
    WScript.Quit
end if

Then I use XPath to query the payments node (assuming there is only one) and the payment nodes (assuming there is more than one).

dim paymentsNode: set paymentsNode = xmldoc.selectSingleNode("//payments")
dim paymentNodes: set paymentNodes = xmldoc.selectNodes("//payment")

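Then I iterate over each payment node and query its allocations node (assuming there is only one) and its allocation nodes (assuming there are several). Each allocation node is removed from its parent node and appended to the allocations node. The same is then done with the payment node itself, which is moved under the payments node.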

dim p
for p = 0 to paymentNodes.length - 1
    dim payment: set payment = paymentNodes.Item(p)
    dim allocationsNode: set allocationsNode = payment.selectSingleNode("./allocations")
    dim allocationNodes: set allocationNodes = payment.selectNodes("./allocation")

    dim a
    for a = 0 to allocationNodes.length - 1
        dim allocation: set allocation = allocationNodes.Item(a)
        allocation.parentNode.removeChild allocation
        allocationsNode.appendChild allocation
    next

    payment.parentNode.removeChild payment
    paymentsNode.appendChild payment
next

Because the payments node is now a valid root node, I reload the XML at the payments level into the xmldoc object, which discards our temporary root node before saving to disk.

xmldoc.loadXML xmldoc.selectSingleNode("/root/payments").xml
xmldoc.save "output.xml"

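An alternative to direct node manipulation is to use an XSL Transform, but again, you would need to correct the root node first. This may be the better option if your input XML file is large.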
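The XSL Transform alternative could look roughly like the sketch below. It wraps the text in the same dummy `root` node and then lets a stylesheet do the restructuring. The stylesheet is an illustrative assumption, not something from the original answer. It also assumes the element names from the question (`payments`, `payment`, `allocations`, `allocation`) and reuses the file names `input.xml` and `output.xml` from the code above.

```vbscript
Option Explicit

' Read the malformed file as plain text and wrap it in a dummy root node.
Dim fso: Set fso = CreateObject("Scripting.FileSystemObject")
Dim stream: Set stream = fso.OpenTextFile("input.xml")
Dim xml: xml = stream.ReadAll()
stream.Close

Dim src: Set src = CreateObject("MSXML2.DOMDocument")
src.async = False
If Not src.loadXML("<root>" & xml & "</root>") Then
    WScript.Echo src.parseError.reason
    WScript.Quit 1
End If

' Illustrative XSLT 1.0 stylesheet (an assumption, adjust to the real data):
' - /root becomes <payments> containing every <payment>
' - each <payment> copies its children except the stray <allocation> siblings
' - the empty <allocations/> is rebuilt around those <allocation> siblings
Dim xslText: xslText = Join(Array( _
    "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>", _
    "<xsl:template match='@*|node()'>", _
    "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>", _
    "</xsl:template>", _
    "<xsl:template match='/root'>", _
    "<payments><xsl:apply-templates select='payment'/></payments>", _
    "</xsl:template>", _
    "<xsl:template match='payment'>", _
    "<payment><xsl:apply-templates select='*[not(self::allocation)]'/></payment>", _
    "</xsl:template>", _
    "<xsl:template match='payment/allocations'>", _
    "<allocations><xsl:apply-templates select='../allocation'/></allocations>", _
    "</xsl:template>", _
    "</xsl:stylesheet>"), vbCrLf)

Dim xsl: Set xsl = CreateObject("MSXML2.DOMDocument")
xsl.async = False
If Not xsl.loadXML(xslText) Then
    WScript.Echo xsl.parseError.reason
    WScript.Quit 1
End If

' Transform into a new document and save it.
Dim result: Set result = CreateObject("MSXML2.DOMDocument")
src.transformNodeToObject xsl, result
result.save "output.xml"
```

The stylesheet is kept inline only to make the sketch self-contained. In practice it would probably live in its own `.xsl` file and be loaded with `xsl.load`.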