Commenting out XML sections in Python

Time: 2015-10-20 17:32:24

Tags: python xml

I have an XML file with several sections, and I need to comment out two of them. The file looks like this:

<web-app>
  <display-name>Web Application</display-name>
  <context-param>
      <param-name>defaultContext</param-name>
      <param-value>true</param-value>
  </context-param>
  <listener>
      <listener-class>MyListener</listener-class>
  </listener>
  <filter>
      <filter-name>Filter1</filter-name>
      <filter-class>filter.Filter1</filter-class>
      <init-param>
        <param-name>type</param-name>
        <param-value>JSP</param-value>
      </init-param>
  </filter>
  <filter>
      <filter-name>Filter2</filter-name>
      <filter-class>filter.Filter2</filter-class>
      <init-param>
        <param-name>type</param-name>
        <param-value>HTM</param-value>
      </init-param>
  </filter>
  <filter>
      <filter-name>Filter3</filter-name>
      <filter-class>filter.Filter3</filter-class>
  </filter>
</web-app>

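In this example I need to comment out the Filter1 and Filter3 sections. It could be any of them, in no particular order, so I need to match the sections to comment out by filter name. The updated file would be: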

<web-app>
  <display-name>Web Application</display-name>
  <context-param>
      <param-name>defaultContext</param-name>
      <param-value>true</param-value>
  </context-param>
  <listener>
      <listener-class>MyListener</listener-class>
  </listener>
  <!--filter>
      <filter-name>Filter1</filter-name>
      <filter-class>filter.Filter1</filter-class>
      <init-param>
        <param-name>type</param-name>
        <param-value>JSP</param-value>
      </init-param>
  </filter-->
  <filter>
      <filter-name>Filter2</filter-name>
      <filter-class>filter.Filter2</filter-class>
      <init-param>
        <param-name>type</param-name>
        <param-value>HTM</param-value>
      </init-param>
  </filter>
  <!--filter>
      <filter-name>Filter3</filter-name>
      <filter-class>filter.Filter3</filter-class>
  </filter-->
</web-app>

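I started looking at xml.dom.minidom to do this, but I don't know how to locate Filter1 and Filter3 precisely, or how to comment out the whole section containing each of these elements. So far I have this code: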

from os import getcwd
from os.path import normpath, join
from xml.dom import minidom

#Method to comment a node
def comment_node(node):
    comment = node.ownerDocument.createComment(node.toxml())
    node.parentNode.replaceChild(comment, node)
    return comment

#Parse the web.xml file
current_path = getcwd()
relative_file_path = r"webapp\WEB-INF\web.xml"
file_path = normpath(join(current_path, relative_file_path))
dom = minidom.parse(file_path)

#Search for filter sections
itemlist = dom.getElementsByTagName('filter-name')
for item in itemlist:
    if "Filter1" == item.nodeValue:
        #need to comment the whole node containing the filter-name
        pass

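This is where I'm stuck. Should I instead search for all "filter" nodes and check whether each one contains the right filter-name?

Please note that I'm a Python beginner, so I don't even know whether I chose the right library here...

Can someone help me work out a good strategy for applying these changes?

Thanks!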

2 answers:

Answer 0: (score: 1)

Only a small modification is needed:

itemlist = dom.getElementsByTagName('filter-name')
for item in itemlist:
    if "Filter1" == item.childNodes[0].nodeValue:
        #need to comment the whole node containing the filter-name
        comment_node(item.parentNode)
print(dom.toxml())  # verify the result
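
The reason for going through `childNodes[0]`: in minidom, the `nodeValue` of an element node is always `None`, and the text is held by a child text node. A minimal sketch of the difference, using an inline XML string (an assumption made here so the snippet runs without a web.xml file):

```python
from xml.dom import minidom

# Inline sample instead of the real web.xml (assumption for illustration)
doc = minidom.parseString(
    "<filter><filter-name>Filter1</filter-name></filter>"
)
name_element = doc.getElementsByTagName("filter-name")[0]

# An element node carries no value of its own
element_value = name_element.nodeValue
# The text is stored in the first child, which is a text node
text_value = name_element.childNodes[0].nodeValue

print(element_value)
print(text_value)
```

This is why the comparison in the question never matched: it compared "Filter1" with the element's own `nodeValue`.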

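Answer 1: (score: 0)

Just in case, here is the final version of my code. I added the step that writes the XML file, since that has to be done explicitly (at first I thought the API's methods held some kind of pointer to the file, so that it would be updated automatically!):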

from os import getcwd
from os.path import normpath, join
from xml.dom import minidom

#Script explanation to the user
print("This script updates the web.xml file")
print()

#Method to comment a node
def comment_node(node):
    comment = node.ownerDocument.createComment(node.toxml())
    node.parentNode.replaceChild(comment, node)

#Parse the web.xml file
current_path = getcwd()
relative_file_path = r"webapp\WEB-INF\web.xml"
file_path = normpath(join(current_path, relative_file_path))
dom = minidom.parse(file_path)

#Search for filter sections
itemlist = dom.getElementsByTagName('filter')
for item in itemlist:
    for sub_item in item.childNodes:
        if "filter-name" == sub_item.nodeName:
            if "Filter1" == sub_item.childNodes[0].nodeValue or "Filter3" == sub_item.childNodes[0].nodeValue:
                #Need to comment the whole node containing the filter-name
                comment_node(item)
                #Stop looping on all the sub items as we already found the filter-name node
                break

# Should you want to see the result
print("Resulting file:")
print(dom.toxml())

#Writing to the file (the with block closes it even if writexml fails)
with open(file_path, 'w') as output_file:
    dom.writexml(output_file)
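
Since the parsed DOM lives only in memory, one way to convince yourself that the write step did its job is to re-parse the file afterwards. The following is a rough sketch under stated assumptions: the inline `SAMPLE` document and the temporary file are hypothetical stand-ins for the real web.xml.

```python
import os
import tempfile
from xml.dom import minidom

# Reduced sample document (assumption: stands in for the real web.xml)
SAMPLE = (
    "<web-app>"
    "<filter><filter-name>Filter1</filter-name></filter>"
    "<filter><filter-name>Filter2</filter-name></filter>"
    "</web-app>"
)

def comment_node(node):
    # Replace the node with a comment holding its serialized XML
    comment = node.ownerDocument.createComment(node.toxml())
    node.parentNode.replaceChild(comment, node)

dom = minidom.parseString(SAMPLE)
for name in dom.getElementsByTagName("filter-name"):
    if name.childNodes[0].nodeValue == "Filter1":
        comment_node(name.parentNode)

# Write to a temporary file (hypothetical stand-in for web.xml)
fd, tmp_path = tempfile.mkstemp(suffix=".xml")
os.close(fd)
with open(tmp_path, "w", encoding="utf-8") as out:
    dom.writexml(out)

# Re-parse the file to see what a later reader of it would see
reloaded = minidom.parse(tmp_path)
remaining = [
    n.childNodes[0].nodeValue
    for n in reloaded.getElementsByTagName("filter-name")
]
os.remove(tmp_path)
print(remaining)
```

Note that `createComment(node.toxml())` produces `<!--<filter>...</filter>-->`, which differs slightly from the `<!--filter>...</filter-->` form shown in the question but comments out the section just the same.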

Many thanks to @David Zemens and @djangoliv for their valuable help!

Update

Update suggested by @djangoliv, thanks!:

#itemlist = dom.getElementsByTagName('filter')
#for item in itemlist:
#   for sub_item in item.childNodes:
#       if "filter-name" == sub_item.nodeName:
#           if "Filter1" == sub_item.childNodes[0].nodeValue or "Filter3" == sub_item.childNodes[0].nodeValue:
#               #Need to comment the whole node containing the filter-name
#               comment_node(item)
#               #Stop looping on all the sub items as we already found the filter-name node
#               break
# simpler
itemlist = dom.getElementsByTagName('filter-name')
for item in itemlist:
    if item.childNodes[0].nodeValue in ["Filter1", "Filter3"]:
        #No break here: every matching filter must be commented, not only the first
        comment_node(item.parentNode)
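
For reuse, the same idea can be wrapped in a function that takes the names to disable. This is only a sketch: `comment_filters` and the inline `SAMPLE` document are hypothetical names introduced for illustration, and the check on the parent's tag name is an extra precaution that is not part of the code above.

```python
from xml.dom import minidom

# Reduced sample document (assumption: stands in for the real web.xml)
SAMPLE = """<web-app>
  <filter><filter-name>Filter1</filter-name></filter>
  <filter><filter-name>Filter2</filter-name></filter>
  <filter><filter-name>Filter3</filter-name></filter>
</web-app>"""

def comment_filters(dom, names):
    """Comment out every <filter> whose <filter-name> text is in names.

    Returns the number of sections that were commented out.
    """
    count = 0
    for name_element in dom.getElementsByTagName("filter-name"):
        text_node = name_element.firstChild
        # Skip empty <filter-name/> elements, which have no text child
        if text_node is None:
            continue
        section = name_element.parentNode
        # Only act when the parent really is a <filter> element
        if section.nodeName != "filter":
            continue
        if text_node.nodeValue.strip() in names:
            comment = dom.createComment(section.toxml())
            section.parentNode.replaceChild(comment, section)
            count += 1
    return count

dom = minidom.parseString(SAMPLE)
commented = comment_filters(dom, {"Filter1", "Filter3"})
still_active = [
    n.firstChild.nodeValue for n in dom.getElementsByTagName("filter-name")
]
print(commented, still_active)
```

The loop deliberately runs to the end instead of stopping at the first match, so all listed filters are handled in one pass.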