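Getting an attribute of a specific child element with Python xml

Time: 2018-10-08 09:06:47

Tags: python xml

How can I get the "sources" value of each "lab symbol" under the group whose "name" is 457?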

<?xml version="1.0" encoding="UTF-8"?>
<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />

  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
  </group>

  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
  </group>
</tabgroup1>

I tried the steps from "Python: access nested children in xml file parsed with ElementTree", but the output shows the attributes of every element rather than only those under group 457.

1 answer:

Answer 0 (score: 1)

You can use BeautifulSoup:

from bs4 import BeautifulSoup

data = '''<?xml version="1.0" encoding="UTF-8"?>
<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />
  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
  </group>
  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
  </group>
</tabgroup1>'''

# The 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(data, 'lxml')

# Find every <group name="457">, then collect 'sources' from each <lab> inside it
result = [j['sources'] for i in soup.find_all('group', {'name': '457'}) for j in i.find_all('lab')]
result
#['SBS,TCS,DDT', 'RRT,QQR,XXC']

With the standard-library xml module, you can do it like this:

import xml.etree.ElementTree as ET

data = '''<?xml version="1.0" encoding="UTF-8"?>
<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />
  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
      </group>
  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
      </group>
</tabgroup1>'''

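# fromstring returns the root element (<tabgroup1>)
tree = ET.fromstring(data)
# Keep only the <group> children named '457', then read 'sources' from each of their <lab> children
result = [j.attrib['sources'] for i in tree.findall('group') if i.attrib['name'] == '457' for j in i.findall('lab')]
result
#['SBS,TCS,DDT', 'RRT,QQR,XXC']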
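
The filter on the group name can also be pushed into the path expression, since ElementTree's limited XPath support includes attribute predicates of the form [@attrib='value']. The sketch below is one possible variant, not part of the original answer: the helper name sources_by_symbol is hypothetical, the XML declaration is left out for brevity, and it assumes the group name contains no quote characters. It also maps each lab symbol to its sources split into a list, which may be closer to what the question asks for.

```python
import xml.etree.ElementTree as ET

# Same sample document as above (XML declaration omitted for brevity)
data = '''<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />
  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
  </group>
  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
  </group>
</tabgroup1>'''

root = ET.fromstring(data)


def sources_by_symbol(root, group_name):
    """Hypothetical helper: map each lab symbol in one group to its list of sources.

    Assumes group_name contains no quote characters, because it is
    interpolated directly into the path expression.
    """
    # Attribute predicate: select <lab> children of the <group> with this name
    labs = root.findall(f"./group[@name='{group_name}']/lab")
    return {lab.get('symbol'): lab.get('sources', '').split(',') for lab in labs}


sources_457 = sources_by_symbol(root, '457')
print(sources_457)
# {'xyz/ght': ['SBS', 'TCS', 'DDT'], 'pqr/xyz': ['RRT', 'QQR', 'XXC']}
```

A group name that does not exist simply yields an empty dict, because findall returns an empty list when nothing matches.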