Remove the root tag from an XML document before adding it to another file

Asked: 2013-05-02 09:33:17

Tags: groovy xml-parsing

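I am trying to add several blocks of XML from one file to another. The problem is that some of these blocks have a root tag that must not be copied into the target XML file (this is the case when the root tag equals a predefined parent tag). This is the code I currently use to insert the snippets (written in Groovy):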

if (addCode.nodeName == parentTags) { //meaning the root tags shouldn't be included
    for (org.w3c.dom.Node n : addCode.childNodes) {
        //parent is a NodeList
        parent.item(parent.length - 1).appendChild(document.importNode(n, true))
    }
} else {
    parent.item(parent.length - 1).appendChild(document.importNode(addCode, true))
}

Parsing the XML:

Document parseWithoutDTD(Reader r, boolean validating = false, boolean namespaceAware = true) {
    FactorySupport.createDocumentBuilderFactory().with { f ->
        f.namespaceAware = namespaceAware
        f.validating = validating
        f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        f.newDocumentBuilder().with { db ->
            db.parse(new InputSource(r))
        }
    }
}

This is a sample XML file whose root tag should not be included:

<catalogue> <!-- shouldn't be included -->
    <message key='type_issuedate'>Date Issued</message>
    <message key='type_accessioneddate'>Date Accesioned</message>
</catalogue>

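You may already have spotted the problem: if I strip the root tags from the XML files that are to be copied into another XML file, parsing them throws an exception.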
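For reference, the exception comes from the XML well-formedness rule that a document has exactly one root element. Below is a minimal sketch of one possible workaround, assuming it is acceptable to wrap a rootless fragment in a throwaway root before parsing. The helpers parseString and parseFragment and the element name wrapper are hypothetical names used only for illustration; they are not part of the code above.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Document
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException

// Plain JAXP parse of a string; throws SAXParseException on malformed input.
Document parseString(String xml) {
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    builder.parse(new InputSource(new StringReader(xml)))
}

// Hypothetical helper: wrap a rootless fragment in a temporary <wrapper>
// element, parse it, and return only the element children.
// Assumes the fragment carries no XML declaration of its own.
List parseFragment(String fragment) {
    def kids = parseString('<wrapper>' + fragment + '</wrapper>').documentElement.childNodes
    (0..<kids.length).collect { kids.item(it) }
                     .findAll { it.nodeType == org.w3c.dom.Node.ELEMENT_NODE }
}

def fragment = '''<message key='type_issuedate'>Date Issued</message>
<message key='type_accessioneddate'>Date Accesioned</message>'''

def rejected = false
try {
    parseString(fragment)      // two top-level elements: not well-formed
} catch (SAXParseException e) {
    rejected = true
}

def messages = parseFragment(fragment)
println "rejected=${rejected}, elements=${messages*.nodeName}"
```

The returned elements could then be passed one by one to document.importNode, just like the child nodes in the first branch of the insertion code.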

Edit: this is a (shortened) example of the file to insert into:

<catalogue xml:lang="en" xmlns:i18n="http://apache.org/cocoon/i18n/2.1">
    ...
    <message key="column4">Date</message>
    <message key="column5">Summary</message>
    <message key="column6">Actions</message>
    <message key="restore">Restore</message>
    <message key="update">Update</message>
    <!-- INSERT XML HERE -->
    ...
</catalogue>

An example of XML whose root tag should be included (and the corresponding file to insert into):

XML to insert

<dependency>
    <groupId>grID</groupId>
    <artifactId>artID</artifactId>
    <version>${version.number}</version>
</dependency>

XML file to insert into

<?xml version="1.0" encoding="UTF-8"?>
<project>
  <dependencies>
    <dependency>
        <groupId>grID1</groupId>
        <artifactId>artID1</artifactId>
        <type>jar</type>
        <classifier>classes</classifier>
    </dependency>
    <!-- INSERT XML HERE -->
  </dependencies>
</project>

At the moment, none of this code works the way I want it to. Can anyone help me?

Thanks a lot!

1 answer:

Answer 0 (score: 0)

I think (if I understand correctly) you need something like this:

def insert( parent, data ) {
  if( parent.name() == data.name() ) {
    data.children().each {
      parent.append it
    }
  }
  else {
    parent.append data
  }
}

So, given

def newdoc = '''<dependency>
               |    <groupId>grID</groupId>
               |    <artifactId>artID</artifactId>
               |    <version>${version.number}</version>
               |</dependency>'''.stripMargin()

def doc = '''<?xml version="1.0" encoding="UTF-8"?>
            |<project>
            |  <dependencies>
            |    <dependency>
            |        <groupId>grID1</groupId>
            |        <artifactId>artID1</artifactId>
            |        <type>jar</type>
            |        <classifier>classes</classifier>
            |    </dependency>
            |  </dependencies>
            |</project>'''.stripMargin()

def docnode = new XmlParser().parseText( doc )
def newnode = new XmlParser().parseText( newdoc )

// use head() as I want to add to the first dependencies node
insert( docnode.dependencies.head(), newnode )
println groovy.xml.XmlUtil.serialize( docnode )

you get the output:

<?xml version="1.0" encoding="UTF-8"?><project>
  <dependencies>
    <dependency>
      <groupId>grID1</groupId>
      <artifactId>artID1</artifactId>
      <type>jar</type>
      <classifier>classes</classifier>
    </dependency>
    <dependency>
      <groupId>grID</groupId>
      <artifactId>artID</artifactId>
      <version>${version.number}</version>
    </dependency>
  </dependencies>
</project>

And given:

def newdoc = '''<catalogue>
               |    <message key='type_issuedate'>Date Issued</message>
               |    <message key='type_accessioneddate'>Date Accesioned</message>
               |</catalogue>'''.stripMargin()

def doc = '''<catalogue xml:lang="en" xmlns:i18n="http://apache.org/cocoon/i18n/2.1">
            |    <message key="column4">Date</message>
            |    <message key="column5">Summary</message>
            |    <message key="column6">Actions</message>
            |    <message key="restore">Restore</message>
            |    <message key="update">Update</message>
            |</catalogue>'''.stripMargin()

def docnode = new XmlParser().parseText( doc )
def newnode = new XmlParser().parseText( newdoc )

insert( docnode, newnode )
println groovy.xml.XmlUtil.serialize( docnode )

you get:

<?xml version="1.0" encoding="UTF-8"?><catalogue xml:lang="en" xmlns:xml="http://www.w3.org/XML/1998/namespace">
  <message key="column4">Date</message>
  <message key="column5">Summary</message>
  <message key="column6">Actions</message>
  <message key="restore">Restore</message>
  <message key="update">Update</message>
  <message key="type_issuedate">Date Issued</message>
  <message key="type_accessioneddate">Date Accesioned</message>
</catalogue>

Edit

OK, given the extra information, does this help? Given the same newdoc and doc strings as above, this script seems to do what you want...

import groovy.xml.*
import groovy.xml.dom.*
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

Document parseWithoutDTD(Reader r, boolean validating = false, boolean namespaceAware = true) {
    FactorySupport.createDocumentBuilderFactory().with { f ->
        f.namespaceAware = namespaceAware
        f.validating = validating
        f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        f.newDocumentBuilder().with { db ->
            db.parse(new InputSource(r))
        }
    }
}

def addCode  = parseWithoutDTD( new StringReader( newdoc ) ).documentElement
def document = parseWithoutDTD( new StringReader( doc ) )
def parent   = document.documentElement
def parentTags = 'catalogue'

use( DOMCategory ) {
  if( addCode.nodeName == parentTags ) {
    addCode.childNodes.each { node ->
      parent.appendChild( document.importNode( node, true ) )
    }
  }
  else {
    parent.appendChild( document.importNode( addCode, true ) )
  }
}
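
To check the result, you can serialize the DOM element back to text. Below is a minimal self-contained sketch, assuming plain JAXP parsing in place of parseWithoutDTD and shortened single-line sample strings (the parse closure is a hypothetical stand-in, not part of the script above):

```groovy
import groovy.xml.XmlUtil
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

// Hypothetical stand-in for parseWithoutDTD: plain JAXP parsing of a string.
def parse = { String xml ->
    DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
}

def document = parse('<catalogue><message key="update">Update</message></catalogue>')
def addCode  = parse("<catalogue><message key='type_issuedate'>Date Issued</message></catalogue>").documentElement
def parent   = document.documentElement

if (addCode.nodeName == parent.nodeName) {
    // Same root name: copy only the children, dropping the duplicate root.
    def kids = addCode.childNodes
    for (int i = 0; i < kids.length; i++) {
        parent.appendChild(document.importNode(kids.item(i), true))
    }
} else {
    // Different root name: copy the whole element.
    parent.appendChild(document.importNode(addCode, true))
}

// XmlUtil.serialize accepts an org.w3c.dom.Element and returns a String.
def result = XmlUtil.serialize(parent)
println result
```

Note that with real, indented input the copied child nodes also include whitespace text nodes, since importNode( node, true ) copies every kind of child, not only elements.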