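When building an XPath, how can I modify the following code so that it ignores the markup (the < and > characters that mark the start and end of a tag) and the attributes inside the tag?
Below is a Python script that reads a formatted XML document and then determines the XPath from the current cursor position: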
import re
import sublime

def buildPath(view, selection):
    path = ['']
    lines = []
    # Collect every line from the start of the buffer up to the cursor.
    region = sublime.Region(0, selection.end())
    for line in view.lines(region):
        contents = view.substr(line)
        lines.append(contents)
    level = -1
    spaces = re.compile(r'^\s+')
    for line in lines:
        # The indentation width is used as the nesting depth.
        space = spaces.findall(line)
        current = len(space[0]) if len(space) else 0
        node = re.sub(r'\s*<\??([\w.]:)?([\w\-.]+)(\s.)?>.*', r'\2', line)
        if current == level:
            path.pop()
            path.append(node)
        elif current > level:
            path.append(node)
            level = current
        elif current < level:
            path.pop()
            level = current
    return path
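
One possible way to make the script ignore the angle brackets and the attributes is sketched below. Rather than rewriting each whole line with re.sub, it matches only the tag name at the start of the line and keeps a stack keyed by indentation. The names build_path and OPEN_TAG are hypothetical, and the function takes a plain list of lines instead of a Sublime view so that it can run on its own. Like the original, it assumes one tag per line and consistent indentation, and it drops any namespace prefix.

```python
import re

# "<", an optional "prefix:", then the tag name. Attributes and ">" are never captured.
# Closing tags, comments and "<?xml ...?>" do not match because "<" must be
# followed directly by a name character.
OPEN_TAG = re.compile(r'<(?:[\w.\-]+:)?([\w.\-]+)')

def build_path(lines):
    """Return an XPath-like string for the last line, using indentation as depth."""
    stack = []  # (indent, tag name) pairs, outermost element first
    for line in lines:
        stripped = line.lstrip()
        if not stripped:
            continue  # blank lines carry no structure
        indent = len(line) - len(stripped)
        match = OPEN_TAG.match(stripped)
        # An opening tag replaces a sibling at the same indent; any other
        # line (closing tag, text) only closes elements indented deeper.
        limit = indent if match else indent + 1
        while stack and stack[-1][0] >= limit:
            stack.pop()
        if match:
            stack.append((indent, match.group(1)))
    return '/' + '/'.join(name for _, name in stack)

sample = [
    '<?xml version="1.0"?>',
    '<root>',
    '  <a id="1">',
    '    <b>text</b>',
    '  </a>',
    '  <ns:c x="y">',
    '    <d/>',
]
print(build_path(sample))
```

Inside the plugin, the list of lines could come from the same view.lines / view.substr loop that buildPath already uses.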
Answer 0 (score: 1)
Get a copy of lxml (pip install lxml):
import lxml.etree
tree = lxml.etree.fromstring(xmlasstring)
tree.xpath('//node')
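
Building on the lxml suggestion, here is a hedged sketch of how the parser could produce the path for a cursor position directly, so that tags and attributes never need to be stripped by hand. The function name xpath_at_line and its row parameter (a 1-based line number) are hypothetical. It relies on two lxml features: the sourceline attribute of an element and the getpath method of the element tree. It returns the path of the last element that starts on or before the given line, which is not necessarily the element enclosing the cursor.

```python
import lxml.etree

def xpath_at_line(xml_text, row):
    """Return the XPath of the last element starting on or before line `row`."""
    # Encode to bytes: lxml rejects str input that carries an encoding declaration.
    root = lxml.etree.fromstring(xml_text.encode('utf-8'))
    tree = root.getroottree()
    best = None
    # tag=lxml.etree.Element restricts iteration to elements (no comments or PIs).
    for element in root.iter(tag=lxml.etree.Element):
        if element.sourceline is not None and element.sourceline <= row:
            best = element  # document order, so later matches are nearer the cursor
    return tree.getpath(best) if best is not None else None

xml_text = "\n".join([
    '<root>',
    '  <a id="1">',
    '    <b>text</b>',
    '  </a>',
    '  <c x="y">',
    '    <d/>',
    '  </c>',
    '</root>',
])
print(xpath_at_line(xml_text, 6))
```

In a Sublime Text plugin, the row could be taken from view.rowcol(selection.end()), which is zero-based, so it would need 1 added before being passed in.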