I want to escape the unescaped data inside an XML string, e.g.
string = '<tag parameter = "something">I want to escape these >, < and &</tag>'
to
'<tag parameter = "something">I want to escape these &gt;, &lt; and &amp;</tag>'
Using a regex, I found a way to match the data and get its start and end positions:
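import re
from xml.sax import saxutils

def escape_data(label):  # function name is illustrative; the original shows only the body
    exp = re.search(">.+?</", label)
    # Get the position of the data between the tags
    start = exp.start() + 1
    end = exp.end() - 2
    return label[:start] + saxutils.escape(label[start:end]) + label[end:]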
But with re.search I can't match the exact XML format.
Answer 0 (score: 3)
Maybe you should consider re.sub:
>>> oldString = '<tag parameter = "something">I want to escape these >, < and &</tag>'
>>> newString = re.sub(r"(<tag.*?>)(.*?)</tag>", lambda m: m.group(1) + cgi.escape(m.group(2)) + "</tag>", oldString)
>>> print newString
<tag parameter = "something">I want to escape these &gt;, &lt; and &amp;</tag>
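The session above is Python 2. On Python 3, `cgi.escape` is no longer available (it was removed in Python 3.8), so the same idea could be sketched with `xml.sax.saxutils.escape`, the function the question already uses. The helper name `escape_tag_data` and the hard-coded element name `tag` are assumptions made for illustration, not part of the original answer.

```python
import re
from xml.sax.saxutils import escape

# Same pattern as in the answer: group 1 is the opening tag, group 2 is its text.
TAG_PATTERN = re.compile(r"(<tag.*?>)(.*?)</tag>")

def escape_tag_data(xml_string):
    """Escape &, < and > in the text of each non-nested <tag>...</tag> element.

    Hypothetical helper; assumes the element is literally named "tag".
    """
    return TAG_PATTERN.sub(
        lambda m: m.group(1) + escape(m.group(2)) + "</tag>",
        xml_string,
    )

old_string = '<tag parameter = "something">I want to escape these >, < and &</tag>'
new_string = escape_tag_data(old_string)
print(new_string)
# <tag parameter = "something">I want to escape these &gt;, &lt; and &amp;</tag>
```

Note that running the function twice on the same string would escape the `&` of the entities again, so it should only be applied to data known to be unescaped.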
My caveat is that if you have nested tags, the regex will definitely break. See "Why is it such a bad idea to parse XML with regex?"
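To make the nested-tag caveat concrete, here is a small sketch using the same pattern on input that contains a child element. The input string and the helper name `escape_once` are made up for illustration. The regex cannot tell markup from text, so the inner `<b>` element is escaped along with the data.

```python
import re
from xml.sax.saxutils import escape

TAG_PATTERN = re.compile(r"(<tag.*?>)(.*?)</tag>")

def escape_once(xml_string):
    # Hypothetical helper: escapes everything between <tag ...> and </tag>.
    return TAG_PATTERN.sub(
        lambda m: m.group(1) + escape(m.group(2)) + "</tag>",
        xml_string,
    )

nested = "<tag><b>bold & brash</b></tag>"
result = escape_once(nested)
print(result)
# <tag>&lt;b&gt;bold &amp; brash&lt;/b&gt;</tag>
```

The child element has been turned into literal text, which is probably not what was intended. Telling the two cases apart reliably needs something that understands the document structure rather than a single pattern.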