Perl XML::XPath adds a bunch of junk to the document

Date: 2014-05-07 17:08:40

Tags: xml perl xpath

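I have a web.xml that I want to update via XPath. The target element is modified correctly, but a bunch of junk is added at the beginning of the document. I get the same junk even when I modify nothing and simply parse the file and print it back out.

Code: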

require Cwd;
use File::Temp qw/ tempfile tempdir/;
use lib 'menu/perl-modules/lib/site_perl';
use XML::XPath;
use XML::XPath::NodeSet;
#use strict;

$file = "/tmp/web.xml";
my $xp   = XML::XPath->new( filename => $file );
my $root = $xp->find('/')->get_nodelist;
#$xp->setNodeText( $xpath, $newValue );

open( XPATH_FILE, "> $file" );
foreach my $nodes ( $xp->find('/')->get_nodelist ) {
  print XPATH_FILE $nodes->toString;
}
close(XPATH_FILE);

Input file:

<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
   <filter>
      <filter-name>LocaleFilter</filter-name>
      ....
</web-app>

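Output: there are about 700 lines of comments at the start of the document, which look like some kind of expansion of the referenced DTD. I have included only the first few lines for readability: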

<!--
DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.

Copyright 2000-2007 Sun Microsystems, Inc. All rights reserved.

The contents of this file are subject to the terms of either the GNU
General Public License Version 2 only ("GPL") or the Common Development
and Distribution License("CDDL") (collectively, the "License").  You
may not use this file except in compliance with the License. You can obtain
a copy of the License at https://glassfish.dev.java.net/public/CDDL+GPL.html
or glassfish/bootstrap/legal/LICENSE.txt.  See the License for the specific
language governing permissions and limitations under the License.

When distributing the software, include this License Header Notice in each
file and include the License file at glassfish/bootstrap/legal/LICENSE.txt.
Sun designates this particular file as subject to the "Classpath" exception
as provided by Sun in the GPL Version 2 section of the License file that
accompanied this code.  If applicable, add the following below the License
Header, with the fields enclosed by brackets [] replaced by your own
identifying information: "Portions Copyrighted [year]
[name of copyright owner]"

Contributor(s):

If you wish your version of this file to be governed by only the CDDL or
only the GPL Version 2, indicate your decision by adding "[Contributor]
elects to include this software in this distribution under the [CDDL or GPL
Version 2] license."  If you don't indicate a single choice of license, a
recipient has the option to distribute your version of this file under
either the CDDL, the GPL Version 2 or to extend the choice of license to
its licensees as provided above.  However, if you add GPL Version 2 code
and therefore, elected the GPL Version 2 license, then the option applies
only if the new code is made subject to such option by the copyright
holder.
--><!--
This is the XML DTD for the Servlet 2.3 deployment descriptor.

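1 answer:

Answer 0: (score: 2)

I don't understand why this module takes any account of the linked DTD at all, since as far as I can see it does no validity checking.

Also, although the module lets you change nodes and add nodes to the document, there is no obvious way to delete nodes.

However, the comments you want to exclude are children of the root node, so you can effectively remove them by regenerating the document from the root node's only element child.

This code demonstrates: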

use strict;
use warnings;
use autodie;
use 5.010;

use XML::XPath;

my $xp   = XML::XPath->new( ioref => *DATA );
my ($new_root) = $xp->findnodes('/*');

print $new_root->toString, "\n";

__DATA__
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>

Output

<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>
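
Note that the output above no longer carries the DOCTYPE line either. If the rewritten file still needs it, one option is to print the declaration yourself ahead of the serialized root element. The following is only a sketch of that idea: the helper name rewrite_root_only, the XPath /web-app/filter/filter-name and the replacement value EncodingFilter are made up for illustration, and the demo input deliberately leaves the DOCTYPE out so that it runs without fetching the DTD. In the real case $xml would be the contents of the web.xml from the question.

```perl
use strict;
use warnings;

use XML::XPath;

# Hypothetical helper (not part of XML::XPath): parse $xml, set the text
# of the node(s) matched by $path, and return $doctype followed by the
# serialization of the root element only. Comments and other children of
# the document root are left out because only '/*' is serialized.
sub rewrite_root_only {
    my ( $xml, $doctype, $path, $value ) = @_;

    my $xp = XML::XPath->new( xml => $xml );
    $xp->setNodeText( $path, $value );

    my ($root) = $xp->findnodes('/*');
    return $doctype . $root->toString . "\n";
}

# DOCTYPE copied verbatim from the input file in the question.
my $doctype = <<'END_DOCTYPE';
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
END_DOCTYPE

# Demo input without a DOCTYPE (an assumption made to keep this offline).
my $xml = <<'END_XML';
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>
END_XML

print rewrite_root_only( $xml, $doctype,
    '/web-app/filter/filter-name', 'EncodingFilter' );
```

To update the file in place, write the returned string back to it with a lexical filehandle and a three-argument open instead of printing it to STDOUT.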