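I want to parse an XML file and then process the resulting tree by removing selected elements. My problem is that removing an element breaks the loop that iterates over the elements.
Consider the following XML data: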
<results>
  <group>
    <a />
    <b />
    <c />
  </group>
</results>
And the code:
import xml.etree.ElementTree as ET

def showGroup(group, s):
    print(s + ' len=' + str(len(group)))
    print('<group>')
    for e in group:
        print(' <' + e.tag + '>')
    print('</group>\n')

def processGroup(group):
    for e in group:
        if e.tag != 'a':
            group.remove(e)
            showGroup(group, 'removed <' + e.tag + '>')

tree = ET.parse('x.xml')
root = tree.getroot()
for group in root:
    processGroup(group)
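I expect the for loop to process the elements <a>, <b> and <c> in order. In particular:
<a> should not remove any element
<b> should remove <b>
<c> should remove <c>
I expect the resulting tree to have one element in <group> (the <a> element), with len(group) returning 1.
Instead, after processing <b>, the for loop decides that its end test has been met and never processes <c>. Had it done so, <c> would have been removed. I am left with a tree containing <a> and <c>, and len(group) returns 2.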
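Update: an ugly hack "fixes" the problem, at some cost in efficiency, as long as no code follows the removal of the element. In my real program, however, there is a lot of code after the pruning loop.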
def processGroup(group):
    for e in group:
        if e.tag != 'a':
            group.remove(e)
            showGroup(group, 'removed <' + e.tag + '>')
            # start over on the group after every removal
            processGroup(group)
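My assumption is that once the for loop has been disrupted, starting over on the group from the beginning might fix things. Recursion is a tidy way to do that, at the cost of re-processing every element that was already checked and not removed.
I am not happy with this solution.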
Answer 0 (score: 1)
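The problem is that you are removing elements from the very thing you are iterating over. When you remove an element, the remaining elements shift down by one position, so the loop skips whichever element moves into the slot it has just visited (here, <c>).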
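To see the shift concretely, here is a minimal sketch that records which tags the loop actually visits. It assumes the question's XML is inlined with ET.fromstring rather than read from x.xml, and the `visited` list is a name introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: same structure as the question's x.xml, inlined as a string.
root = ET.fromstring("<results><group><a/><b/><c/></group></results>")
group = root[0]

visited = []  # hypothetical helper list: tags the loop actually sees
for e in group:
    visited.append(e.tag)
    if e.tag != 'a':
        group.remove(e)  # <c> slides into the slot <b> occupied

print(visited)                 # ['a', 'b']
print([e.tag for e in group])  # ['a', 'c']
```

The loop never reaches <c>, which matches the len(group) of 2 reported in the question.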
A simple solution is to iterate over a copy of the group, or to use reversed:
Copy:
def processGroup(group):
    # creates a shallow copy so we are removing from the original
    # but iterating over a copy.
    for e in group[:]:
        if e.tag != 'a':
            group.remove(e)
            showGroup(group, 'removed <' + e.tag + '>')
Reversed:
def processGroup(group):
    # starts at the end, so as the container shrinks
    # the elements we have not visited yet stay at the
    # same positions they had when the loop started.
    for e in reversed(group):
        if e.tag != 'a':
            group.remove(e)
            showGroup(group, 'removed <' + e.tag + '>')
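Either variant can be checked without a file. The sketch below is a small test harness and not part of the original answer: it assumes the question's XML inlined via ET.fromstring, and prune_copy and prune_reversed are hypothetical names standing in for the two processGroup versions with the showGroup calls left out.

```python
import xml.etree.ElementTree as ET

XML = "<results><group><a/><b/><c/></group></results>"  # assumed input

def prune_copy(group):
    # Iterate over a shallow copy; remove from the original.
    for e in group[:]:
        if e.tag != 'a':
            group.remove(e)

def prune_reversed(group):
    # Iterate from the end so removals never shift unvisited elements.
    for e in reversed(group):
        if e.tag != 'a':
            group.remove(e)

results = {}
for prune in (prune_copy, prune_reversed):
    root = ET.fromstring(XML)  # fresh tree for each variant
    for group in root:
        prune(group)
    results[prune.__name__] = [e.tag for e in root[0]]

print(results)  # {'prune_copy': ['a'], 'prune_reversed': ['a']}
```

Both leave only <a> in the group, so len(group) is 1 as the question expected.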
Using the copy logic:
In [7]: tree = ET.parse('test.xml')
In [8]: root = tree.getroot()
In [9]: for group in root:
   ...:     processGroup(group)
   ...:
removed <b> len=2
<group>
<a>
<c>
</group>
removed <c> len=1
<group>
<a>
</group>
You can also use ET.tostring in place of your for loop.
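For the ET.tostring suggestion, here is a minimal sketch. It is an assumption about what was intended (that the loop being replaced is the one in showGroup), not the answerer's code. ET.tostring serializes the element and its children in one call, and passing encoding='unicode' makes it return a str rather than bytes.

```python
import xml.etree.ElementTree as ET

def showGroup(group, s):
    # Same header line as the question's showGroup.
    print(s + ' len=' + str(len(group)))
    # Serialize the element and its children instead of looping over them.
    print(ET.tostring(group, encoding='unicode'))

# Assumed input: the question's XML, inlined for a self-contained run.
root = ET.fromstring("<results><group><a/><b/><c/></group></results>")
showGroup(root[0], 'initial')
```

Unlike the hand-written loop, this prints the real markup, including any attributes and nested children.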