Question

I have this XML string:

<aof xmlns="http://tsng.jun.net/jppos/conig/hello"><num>3</num><desc>addy02</desc><tpcs>5</tpcs></aof>

I need to extract the 5 using a regular expression.

This is what I tried:

regex = re.compile(r'tag+</.+>\s*(.+)\s*<.+>')

Here tag is 'tpcs', but the match comes back empty.

Can anyone help?

Answer 1

Don't use regexps for XML/HTML! Read this, one of the most upvoted answers on this site.

Use XPath instead:

//tpcs/text()

Or, namespace-agnostic (useful here because the document declares a default namespace):

//*[local-name()='tpcs']/text()

This prints 5 as expected.
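For anyone who wants to try this from Python without extra dependencies, here is a minimal sketch using the standard library's xml.etree.ElementTree. Note that this swaps in ElementTree's limited path syntax, which does not support text() or local-name(), so it is an equivalent lookup rather than the exact XPath expressions above. The variable names are only for illustration, and the {*} wildcard assumes Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

xml_string = (
    '<aof xmlns="http://tsng.jun.net/jppos/conig/hello">'
    '<num>3</num><desc>addy02</desc><tpcs>5</tpcs></aof>'
)

root = ET.fromstring(xml_string)

# Option 1: name the namespace explicitly through a prefix map.
# The prefix "h" is arbitrary; only the URI has to match the document.
ns = {"h": "http://tsng.jun.net/jppos/conig/hello"}
value_ns = root.find("h:tpcs", ns).text

# Option 2: match the tag in any namespace (Python 3.8+).
value_any = root.find("{*}tpcs").text

print(value_ns, value_any)
```

Both lookups return the string "5"; call int() on the result if a number is needed.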

Answer 2

As mentioned in the comments, this regex does the job:

(?<=<tpcs>).*?(?=<\/tpcs>)

As shown in this demo.

Explanation:
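To tie this back to the re.compile call in the question, here is a rough sketch of the same lookaround pattern built for an arbitrary tag name. The helper name extract_tag_text is made up for this example, and it assumes the tag has no attributes and is not nested inside itself, which holds for the sample string but not for XML in general.

```python
import re

xml_string = (
    '<aof xmlns="http://tsng.jun.net/jppos/conig/hello">'
    '<num>3</num><desc>addy02</desc><tpcs>5</tpcs></aof>'
)


def extract_tag_text(tag, text):
    """Return the text between <tag> and </tag>, or None if not found.

    Hypothetical helper: only handles attribute-free, non-nested tags.
    """
    name = re.escape(tag)  # guard against regex metacharacters in the tag
    # Lookbehind for the opening tag, lazy match, lookahead for the closing tag.
    pattern = re.compile(rf"(?<=<{name}>).*?(?=</{name}>)", re.DOTALL)
    match = pattern.search(text)
    return match.group(0) if match else None


print(extract_tag_text("tpcs", xml_string))
```

The forward slash does not need escaping in Python's re module, so `</tpcs>` can be written as is; the `\/` in the pattern above is harmless but only required in languages that use `/` as the regex delimiter.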