How do I extract the content of every element in an XML file using Python and ElementTree?

Time: 2019-04-17 11:00:06

Tags: python xml elementtree

I have the following XML file, named Artists.xml, which contains information about a few artists:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Artists>
<Singer name="Britney">
    <Albums>7</Albums>
    <Country>USA</Country>
    <LastSingle>  Piece of Me
      <Year>2011</Year>
   </LastSingle>
</Singer>
<Singer name="Justin">
    <Albums>8</Albums>
    <Country>USA</Country>
    <LastSingle> Rock Your Body
      <Year>2004</Year>
   </LastSingle>
</Singer>
</Artists>

I'm using Python's ElementTree library to extract the content of every tag. This is the Python code I've written so far:

from xml.etree import cElementTree as ET
tree = ET.parse('Artists.xml')
root = tree.getroot()
for child in root:
    for content in child:
       print(child[content].text)

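However, when I run the script I don't see any output in the console. I expected to see something like 7 USA Piece of Me 2011, 8 USA Rock Your Body 2004. Can someone help me understand what I'm doing wrong? Thanks in advance!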

2 answers:

Answer 0: (score: 0)

Use xml.etree.ElementTree.

test.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Artists>
    <Singer name="Britney">
        <Albums>7</Albums>
        <Country>USA</Country>
        <LastSingle>
               Piece of Me
              <Year>2011</Year>
       </LastSingle>
    </Singer>
    <Singer name="Justin">
        <Albums>8</Albums>
        <Country>USA</Country>
        <LastSingle> Rock Your Body
          <Year>2004</Year>
       </LastSingle>
    </Singer>
</Artists>

Then:

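from xml.etree import ElementTree

tree = ElementTree.parse('test.xml')
root = tree.getroot()
# Each <Singer> element is a direct child of the <Artists> root.
results = root.findall('Singer')

for elem in results:
    # Iterating an element yields its direct children only,
    # so the nested <Year> element is not visited here.
    for e in elem:
        # e.text is None for an empty element; fall back to '' before stripping.
        print((e.text or '').strip())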

Output:

7
USA
Piece of Me
8
USA
Rock Your Body

Process finished with exit code 0
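
The output above does not include the nested Year values that the question asked for, because the loop visits only the direct children of each Singer. One possible way to pick them up is Element.itertext(), which walks the whole subtree. The sketch below is only an illustration: the helper name singer_texts is made up, and the sample data is inlined so the snippet runs on its own. It also sidesteps the problem in the question's loop, where content is already an Element and child[content] tries to use it as an index.

```python
import xml.etree.ElementTree as ET

# Sample data inlined for the sketch (same structure as test.xml above).
XML = """<Artists>
    <Singer name="Britney">
        <Albums>7</Albums>
        <Country>USA</Country>
        <LastSingle>
            Piece of Me
            <Year>2011</Year>
        </LastSingle>
    </Singer>
    <Singer name="Justin">
        <Albums>8</Albums>
        <Country>USA</Country>
        <LastSingle> Rock Your Body
            <Year>2004</Year>
        </LastSingle>
    </Singer>
</Artists>"""


def singer_texts(singer):
    """Return every non-blank text fragment under one <Singer>, in document order.

    Hypothetical helper, not part of ElementTree.
    """
    # itertext() yields text from the whole subtree, so <Year> is included;
    # whitespace-only fragments between tags are filtered out.
    return [t.strip() for t in singer.itertext() if t.strip()]


root = ET.fromstring(XML)
lines = [" ".join(singer_texts(s)) for s in root.findall("Singer")]
for line in lines:
    print(line)
```

With the sample data this should print one line per singer, for example 7 USA Piece of Me 2011. To read from a file instead, ET.parse('test.xml').getroot() can replace ET.fromstring(XML).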

Answer 1: (score: 0)

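A generic approach: convert the XML to a dictionary and print the dictionary. (The file 55726013.xml contains your sample data.) As you can see, the code has zero knowledge of the XML structure.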

import xmltodict
import json

with open('55726013.xml') as fd:
    doc = xmltodict.parse(fd.read())

print(json.dumps(doc, indent=4))

Output:

{
    "Artists": {
        "Singer": [
            {
                "@name": "Britney", 
                "Albums": "7", 
                "Country": "USA", 
                "LastSingle": {
                    "Year": "2011", 
                    "#text": "Piece of Me"
                }
            }, 
            {
                "@name": "Justin", 
                "Albums": "8", 
                "Country": "USA", 
                "LastSingle": {
                    "Year": "2004", 
                    "#text": "Rock Your Body"
                }
            }
        ]
    }
}
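
If installing xmltodict is not an option, a similar structure-agnostic conversion can be sketched with the standard library alone. The function below, element_to_dict, is a hypothetical helper written for this illustration and is not xmltodict's implementation; it only borrows the "@attribute" and "#text" key conventions visible in the output above, and it ignores tail text and namespaces.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an Element into nested dicts, lists and strings.

    Hypothetical helper: attributes become "@name" keys, mixed text becomes
    "#text", and repeated child tags are collected into a list.
    """
    node = {"@" + key: value for key, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Second or later sibling with the same tag: promote to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # leaf element: just its text (may be an empty string)
    if text:
        node["#text"] = text
    return node


# Sample data inlined so the sketch runs on its own.
XML = (
    '<Artists>'
    '<Singer name="Britney"><Albums>7</Albums><Country>USA</Country>'
    '<LastSingle> Piece of Me <Year>2011</Year></LastSingle></Singer>'
    '<Singer name="Justin"><Albums>8</Albums><Country>USA</Country>'
    '<LastSingle> Rock Your Body <Year>2004</Year></LastSingle></Singer>'
    '</Artists>'
)

root = ET.fromstring(XML)
doc = {root.tag: element_to_dict(root)}
print(json.dumps(doc, indent=4))
```

Like the xmltodict version, this sketch never names a tag from the sample file, so it should carry over to other small XML documents, though edge cases such as empty elements are handled differently here.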