Extracting element text using ElementTree

Time: 2019-01-19 19:11:35

Tags: python elementtree

I have the following XML with a <Description> tag whose text contains special characters.

<branch>
   <Description>
      Here are few steps to make these settings
      1)    Tools &lt;&lt; Internet options 2)  Click on General tab
   </Description>
</branch>

Now, when I try to retrieve the Description text, I get a result in which &lt; has been automatically converted to <. The code is below.

Code:

from xml.etree import ElementTree as ET

# Copy the XML above into a file and set inputFile to its path.
tree = ET.parse(inputFile)
root = tree.getroot()

for description in root.iter('Description'):
    print(description.text)

I need the string exactly as it is written inside the Description tag. How can I get it?

Expected:

Here are few steps to make these settings
          1)    Tools &lt;&lt; Internet options 2)    Click on General tab

1 Answer:

Answer 0 (score: 0)

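The parser decodes entity references when it reads the file, so description.text holds the unescaped characters. You can simply re-escape the content with html.escape():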

import html
from xml.etree import ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()

for description in root.iter('Description'):
    print(html.escape(description.text))
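
For a quick check without a file on disk, here is a minimal self-contained sketch of the same idea. It parses the question's XML from a string; the names XML_SAMPLE, parsed and escaped are introduced here for illustration only. It also passes quote=False, on the assumption that quotation marks in the text should be left untouched (by default html.escape() also escapes " and '). The standard library's xml.sax.saxutils.escape() is shown alongside as an alternative that escapes only &, < and >.

```python
import html
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape as xml_escape

# Hypothetical inline copy of the XML from the question.
XML_SAMPLE = """<branch>
   <Description>
      Here are few steps to make these settings
      1)    Tools &lt;&lt; Internet options 2)  Click on General tab
   </Description>
</branch>"""

root = ET.fromstring(XML_SAMPLE)

# The parser has already decoded &lt; into < at this point.
parsed = root.find('Description').text

# Re-escape &, < and > while leaving quotes alone (assumed preference).
escaped = html.escape(parsed, quote=False)

# Alternative from the standard library: escapes &, < and > only.
escaped_sax = xml_escape(parsed)

print(escaped)
```

Note that re-escaping produces a canonical form: it cannot tell whether a character was originally written as an entity, a numeric reference, or inside a CDATA section, so the output matches the source file only when the source used the standard entities.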