Question

I have some XML that looks like this:

<topic>
    <restrictions>
        <restriction id="US"/>
        <restriction id="CA"/>
        <restriction id="EU"/>
    </restrictions>
</topic>
<topic>
    <restrictions>
        <restriction id="JP"/>
        <restriction id="AU"/>
        <restriction id="EU"/>
        <restriction id="US"/>
    </restrictions>
</topic>

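There are more topics like these, all following the same schema. I am already using minidom in my script to do some other things with the XML. For the example above, I need to get the following result: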

[['US','CA','EU'],['JP','AU','EU','US']]

I have tried several ways of iterating, but the results are wrong. This is my code:

from xml.dom import minidom

xmldoc = minidom.parse(path_to_file)
itemlist = xmldoc.getElementsByTagName('restrictions')
itemlist2 = xmldoc.getElementsByTagName('restriction')


restrictions=[]

for x in itemlist:
    res=[]
    for s in itemlist2:
        res.append(s.attributes['id'].value)

    restrictions.append(res)

print(restrictions)

Can you help me do the iteration correctly? Any help is appreciated. Thanks!

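EDIT: I just realized there is another case I need to handle. A topic element may have no restrictions element at all; when that happens, the value appended to the list should just be 0. What is a simple way to add this condition?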

Answer 1

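getElementsByTagName returns every descendant element with the given tag name. Called on xmldoc, it searches the whole document, so itemlist2 contains all the restriction nodes in the XML. Your inner loop therefore appends all of them, ['US','CA','EU','JP','AU','EU','US'], once for each restrictions node. Instead, call getElementsByTagName on each restrictions node inside the loop, so that you only get the restriction nodes that belong to it.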

from xml.dom import minidom

xmldoc = minidom.parse(path_to_file)
restrictions = []
topic_nodes = xmldoc.getElementsByTagName('topic')
for topic_node in topic_nodes:
    # Search only inside this topic, not the whole document.
    restrictions_nodes = topic_node.getElementsByTagName('restrictions')
    if not restrictions_nodes:
        # This topic has no restrictions element: record 0, as asked in the edit.
        restrictions.append(0)
        continue

    result = []
    for restrictions_node in restrictions_nodes:
        restriction_nodes = restrictions_node.getElementsByTagName('restriction')
        for restriction_node in restriction_nodes:
            result.append(restriction_node.attributes['id'].value)

    restrictions.append(result)

print(restrictions)
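
If you want to try this without a file on disk, here is a minimal self-contained sketch of the same idea. It makes a few assumptions that are not in the question: the sample topics are wrapped in a hypothetical <topics> root element so that the string parses as one document, an empty third topic is added to exercise the 0 case, and collect_restrictions is just an illustrative helper name. It also reads the ids with getAttribute and searches for restriction directly under each topic, which gives the same result as the nested loops above as long as restriction elements only ever appear inside restrictions.

```python
from xml.dom import minidom

# Sample data from the question, wrapped in a hypothetical <topics> root
# (an assumption, needed so the string is a single well-formed document).
# The third <topic> has no <restrictions> child, to exercise the 0 case.
SAMPLE = """
<topics>
    <topic>
        <restrictions>
            <restriction id="US"/>
            <restriction id="CA"/>
            <restriction id="EU"/>
        </restrictions>
    </topic>
    <topic>
        <restrictions>
            <restriction id="JP"/>
            <restriction id="AU"/>
            <restriction id="EU"/>
            <restriction id="US"/>
        </restrictions>
    </topic>
    <topic/>
</topics>
""".strip()


def collect_restrictions(xml_text):
    """Return one list of restriction ids per topic, or 0 if the topic has none."""
    doc = minidom.parseString(xml_text)
    result = []
    for topic in doc.getElementsByTagName('topic'):
        # Called on an element, the search is limited to that element's subtree.
        if not topic.getElementsByTagName('restrictions'):
            result.append(0)
            continue
        ids = [node.getAttribute('id')
               for node in topic.getElementsByTagName('restriction')]
        result.append(ids)
    return result


doc = minidom.parseString(SAMPLE)
# Called on the document, the search covers every topic at once.
print(len(doc.getElementsByTagName('restriction')))  # 7
print(collect_restrictions(SAMPLE))
```

The first print shows why the original code repeated all seven ids: the document-level search is not scoped to a topic. The second prints [['US', 'CA', 'EU'], ['JP', 'AU', 'EU', 'US'], 0].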

Iterating over attribute values with minidom
