I found this question, but its answer sidesteps the "offset" part.
My code (using from xml.etree.ElementTree import iterparse):
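def myparse(self):
    # Stream the document; "start-ns" events report namespace declarations.
    context = iterparse(self.source, events=("start", "end", "start-ns"))
    self.root = None
    for event, elem in context:
        # For "start-ns", elem is a (prefix, uri) tuple; "" is the default namespace.
        if event == "start-ns" and elem[0] == "":
            self.uri = elem[1]
            continue
        if event == "start" and self.root is None:
            self.root = elem
            continue
        if event == "end" and elem.tag == "{%s}page" % self.uri:
            # Assumption: <text> lives under a <revision> child of <page>.
            revision = elem.find("{%s}revision" % self.uri)
            thetext = revision.findtext("{%s}text" % self.uri)
            theposition = 0  # TODO: find the position if possible
            yield thetext, theposition
            # Free memory held by elements that were already processed.
            elem.clear()
            self.root.clear()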
I am asking whether there is something like expat's GetCurrentByteIndex.
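To show the kind of information I am after, here is a minimal sketch that uses xml.parsers.expat directly instead of iterparse. In Python's expat binding the value is exposed as the CurrentByteIndex attribute of the parser object. The function name element_offsets and the sample input are hypothetical and only there for illustration. The sketch does no namespace processing, so tag names are matched as written in the file.

```python
import xml.parsers.expat


def element_offsets(data, tag):
    """Return (start, end) byte offsets for every <tag> element in data (bytes).

    start is the offset of the opening tag; end is the offset of the
    closing tag, as reported by expat while each handler runs.
    """
    parser = xml.parsers.expat.ParserCreate()
    open_starts = []  # stack of start offsets for elements still open
    found = []

    def on_start(name, attrs):
        if name == tag:
            # Inside a handler, CurrentByteIndex is the position of the
            # first byte of the markup that triggered the event.
            open_starts.append(parser.CurrentByteIndex)

    def on_end(name):
        if name == tag:
            found.append((open_starts.pop(), parser.CurrentByteIndex))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)  # True marks this as the final chunk
    return found


if __name__ == "__main__":
    sample = b"<root><page>a</page><page>bc</page></root>"
    for start, end in element_offsets(sample, "page"):
        print(start, end, sample[start:end])
```

What I would like is the same pair of numbers for each elem that iterparse yields, without giving up the Element objects.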
So:
While iterating with xml.etree.ElementTree.iterparse, how can I find an element's offset in the input string or file (the start offset, the end offset, or both)?