Question

I need to add a new tag and write the XML back to the file. This is my XML file.

<?xml version="1.0" encoding="UTF-8"?>
    <!--Arbortext, Inc., 1988-2011, v.4002-->
    <!DOCTYPE reference-configuration-statement PUBLIC "-//Juniper Networks//DTD Jbook Software Guide//EN"
     "file:////cmsxml/IWServer/default/main/TechPubsWorkInProgress/STAGING/bin/dtds/jbook-sw/jbook-sw.dtd">
    <?Pub UDT _nopagebreak _touchup KeepsKeep="yes" KeepsPrev="no" KeepsNext="no" KeepsBoundary="page"?>
    <?Pub UDT _bookmark _target?>
    <?Pub UDT instructions _comment FontColor="red"?>
    <?Pub UDT instructions-DUPLICATE1 _comment FontColor="red"?>
    <?Pub UDT __target_1 _target?>
    <?Pub UDT __target_3 _target?>
    <?Pub UDT __target_2 _target?>
    <?Pub UDT _bookmark-DUPLICATE1 _target?>
    <?Pub UDT __target_4 _target?>
    <?Pub EntList copy trade micro reg plusmn deg middot mdash ndash nbsp
    caret cent check acute frac12 frac13 frac14 frac15 frac16 frac18 frac23
    frac25 frac34 frac35 frac38 frac45 frac56 frac58 frac78 ohm pi sup sup1
    sup2 sup3 rsquo?>
    <?Pub Inc?>
    <root topic-id="25775"

I was able to get this done with etree.

path="C:/Users/pshahul/Desktop/Official/Automation/Write_XMl_files/Source/"
            add=(path, Filename)
            myfile=s.join(add)
            try:
                et = xml.etree.ElementTree.parse(myfile)
                tree=etree.parse(myfile)
                docinfo=tree.docinfo.encoding
                root=et.getroot()
                elem = root.find('cli-help')
                if elem is None:
                    new_tag=ET.Element("cli-help")
                    new_tag.text=final
                    root.insert(2,new_tag)
                    et.write(myfile,encoding=docinfo, xml_declaration=True)
                else:
                    elem.text=final
                    et.write(myfile,encoding=docinfo, xml_declaration=True)
            except OSError:
                pass
        else:
            raise TypeError
    except TypeError:
        continue

Now the DOCTYPE and the XML declaration are written out, but the following lines are skipped.

<!--Arbortext, Inc., 1988-2011, v.4002-->
     <?Pub UDT _nopagebreak _touchup KeepsKeep="yes" KeepsPrev="no" KeepsNext="no" KeepsBoundary="page"?>
    <?Pub UDT _bookmark _target?>
    <?Pub UDT instructions _comment FontColor="red"?>
    <?Pub UDT instructions-DUPLICATE1 _comment FontColor="red"?>
    <?Pub UDT __target_1 _target?>
    <?Pub UDT __target_3 _target?>
    <?Pub UDT __target_2 _target?>
    <?Pub UDT _bookmark-DUPLICATE1 _target?>
    <?Pub UDT __target_4 _target?>
    <?Pub EntList copy trade micro reg plusmn deg middot mdash ndash nbsp
    caret cent check acute frac12 frac13 frac14 frac15 frac16 frac18 frac23
    frac25 frac34 frac35 frac38 frac45 frac56 frac58 frac78 ohm pi sup sup1
    sup2 sup3 rsquo?>
    <?Pub Inc?>

How can I preserve these? I need these lines back in my XML file, along with the comments. I found that the comments are gone as well.

Answer 1

As the OP suggested, the (or a) solution here is to use lxml as follows, which preserves comments as well as processing instructions:

import lxml.etree as ET
tree = ET.parse(filename)

Answer 2

The documentation of ElementTree states explicitly that this is not possible:

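Note: Not all elements of the XML input will end up as elements of the parsed tree. Currently, this module skips over any XML comments, processing instructions, and document type declarations in the input.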
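For nodes inside the root element there is a partial workaround in the standard library. The sketch below assumes Python 3.8 or later, where `TreeBuilder` accepts `insert_comments` and `insert_pis`; the sample string and the help text are made up for illustration. As far as I can tell, this only captures comments and processing instructions inside the root element, so the prolog lines from the question would still be dropped.

```python
import xml.etree.ElementTree as ET

# Illustrative input: a comment and a processing instruction inside the root.
xml_string = "<root><!--inside comment--><?Pub Inc?><child/></root>"

# Assumption: Python 3.8+, where TreeBuilder takes insert_comments / insert_pis.
builder = ET.TreeBuilder(insert_comments=True, insert_pis=True)
parser = ET.XMLParser(target=builder)
root = ET.fromstring(xml_string, parser=parser)

# Add the new element from the question ("cli-help") and serialize the tree.
new_tag = ET.SubElement(root, "cli-help")
new_tag.text = "some help text"  # placeholder text
result = ET.tostring(root, encoding="unicode")
print(result)
```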

Out of the box, minidom does this for me. Unless the document is huge, it is fine to hold it in memory.

from xml.dom import minidom
from xml.dom import Node

xml_string = "<?xml version='1.0'?><!--comment--><root><!--inside comment--><child/></root>"
xml_doc = minidom.parseString(xml_string)
# childNodes is an attribute of the document, not a method
for node in xml_doc.childNodes:
    if node.nodeType == Node.COMMENT_NODE:
        print("Comment", node.data)

The comment at the level of the root element is preserved.
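To tie this back to the question, here is a minimal sketch of the same find-or-create logic done with minidom. The input string and the help text are placeholders, and note that `getElementsByTagName` searches all descendants, not only direct children, so it is a slightly looser match than `root.find('cli-help')`.

```python
from xml.dom import minidom

# Illustrative input with a comment and a processing instruction before the root.
xml_string = (
    "<?xml version='1.0' encoding='UTF-8'?>"
    "<!--comment--><?Pub Inc?>"
    "<root><!--inside comment--><child/></root>"
)
doc = minidom.parseString(xml_string)
root = doc.documentElement

# Reuse an existing <cli-help> element, or create one if none exists.
matches = root.getElementsByTagName("cli-help")
if matches:
    elem = matches[0]
    while elem.firstChild:  # drop the old text before setting the new one
        elem.removeChild(elem.firstChild)
else:
    elem = doc.createElement("cli-help")
    root.appendChild(elem)
elem.appendChild(doc.createTextNode("some help text"))  # placeholder text

# toxml() with an encoding returns bytes that include the XML declaration.
output = doc.toxml(encoding="UTF-8").decode("UTF-8")
print(output)
```

To write the result back, the bytes returned by `doc.toxml(encoding="UTF-8")` can be written to the file opened in binary mode.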
