Parsing name/value pairs from XML

Time: 2017-07-31 21:07:36

Tags: python xml python-3.x

I am trying to extract account details from XML files supplied by vendors.

One vendor provides XML files like this:

<Accounts>
  <Account>
    <AccountNumber>1234567</AccountNumber>
    <Balance>$200.00</Balance>
  </Account>
  <Account>
     ...
  </Account>
</Accounts>

I can parse it easily with Python:

import xml.etree.ElementTree as et

mytree = et.parse(xml_path)
myroot = mytree.getroot()

for acc in myroot.findall('Account'):
    acctnum = acc.find('AccountNumber').text
    balance = acc.find('Balance').text
    print(acctnum, balance)

which outputs:

1234567 $200.00

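However, another vendor provides XML files that look more like name/value pairs, and I'm not sure how to access that data easily. I would like to work with it the same way as above: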

<Accounts>
  <Account>
    <field name='AccountNumber' value='1234567' />
    <field name='Balance' value='$200.00' />
  </Account>
  <Account>
     ...
  </Account>
</Accounts>

So far I have this, but I'd like to be able to access the values individually with ease:

mytree = et.parse(xml_path)
myroot = mytree.getroot()

for field in myroot.findall('Account'):
    for line in field:
        print(line.attrib)

which outputs something like:

{'name': 'AccountNumber', 'value': '1234567'}
{'name': 'Balance', 'value': '$200.00'}

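So my question is: how can I access these values and assign them to variables (based on name) so that I can use them elsewhere in my script, just as I have acctnum and balance in the first example?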

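4 Answers:

Answer 0: (score: 2)

Populate a new data structure (such as a dict) from each field as you iterate, rather than just printing it and discarding it: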

account_d = {}
for field in myroot.findall('Account'):
    for line in field:
        account_d[line.attrib['name']] = line.attrib['value']

    # account_d should now be:
    # { 'AccountNumber': '1234567', 'Balance': '$200.00' }

You could also use a list of lists/tuples:

account_a = []
for field in myroot.findall('Account'):
    for line in field:
        account_a.append((line.attrib['name'], line.attrib['value']))

    # account_a should now be:
    # [('AccountNumber', '1234567'), ('Balance', '$200.00')]
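
Note that when the file contains more than one Account, a single dict created outside the loop ends up holding only the last account's fields. A minimal sketch of one way around that, assuming you want one dict per account collected in a list (the sample data and the name accounts are illustrative assumptions, not part of the original answer):

```python
import xml.etree.ElementTree as et

# Illustrative sample data; in the question this comes from et.parse(xml_path).
data = """\
<Accounts>
  <Account>
    <field name='AccountNumber' value='1234567' />
    <field name='Balance' value='$200.00' />
  </Account>
  <Account>
    <field name='AccountNumber' value='9999999' />
    <field name='Balance' value='$300.00' />
  </Account>
</Accounts>"""

myroot = et.fromstring(data)

accounts = []  # hypothetical name: one dict per <Account> element
for account in myroot.findall('Account'):
    account_d = {}  # fresh dict for each account, so nothing is overwritten
    for line in account:
        account_d[line.attrib['name']] = line.attrib['value']
    accounts.append(account_d)

for account_d in accounts:
    print(account_d['AccountNumber'], account_d['Balance'])
```

Each entry of accounts can then be used elsewhere in the script, for example accounts[0]['Balance'].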

Answer 1: (score: 1)

ElementTree 1.3 can locate nodes that have a specific attribute value:

from xml.etree import ElementTree as et

data = '''\
<Accounts>
  <Account>
    <field name='AccountNumber' value='1234567' />
    <field name='Balance' value='$200.00' />
  </Account>
  <Account>
    <field name='AccountNumber' value='9999999' />
    <field name='Balance' value='$300.00' />
  </Account>
</Accounts>'''

tree = et.fromstring(data)

for acc in tree.iterfind('Account'):
    acctnum = acc.find("field[@name='AccountNumber']").attrib['value']
    balance = acc.find("field[@name='Balance']").attrib['value']
    print(acctnum, balance)

Output:

1234567 $200.00
9999999 $300.00
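
One caveat with this lookup: find() returns None when no matching field exists, so accessing .attrib on the result raises AttributeError. A small sketch of a helper that falls back to a default instead, assuming field names contain no single quotes (field_value is a hypothetical name, not part of ElementTree):

```python
from xml.etree import ElementTree as et


def field_value(account, name, default=None):
    """Return the value attribute of the <field> with the given name, or default."""
    # Assumes name contains no single quote, since it is embedded in the path.
    node = account.find("field[@name='{}']".format(name))
    if node is None:
        return default
    return node.get('value', default)


# Illustrative data in which the Balance field is missing.
data = """\
<Accounts>
  <Account>
    <field name='AccountNumber' value='1234567' />
  </Account>
</Accounts>"""

tree = et.fromstring(data)

for acc in tree.iterfind('Account'):
    acctnum = field_value(acc, 'AccountNumber')
    balance = field_value(acc, 'Balance', default='n/a')
    print(acctnum, balance)  # prints: 1234567 n/a
```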

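Answer 2: (score: 0)

You can do this by collecting the attributes of each Account element's field children into a dictionary, then using the information in it as needed:

Sample accounts.xml input file: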

<?xml version="1.0"?>
<Accounts>
  <Account>
    <field name='AccountNumber' value='1234567' />
    <field name='Balance' value='$200.00' />
  </Account>
  <Account>
    <field name='AccountNumber' value='8901234' />
    <field name='Balance' value='$100.00' />
  </Account>
</Accounts>

Code:

import xml.etree.ElementTree as et

xml_path = 'accounts.xml'
mytree = et.parse(xml_path)
myroot = mytree.getroot()

for acct in myroot.findall('Account'):
    info = {field.attrib['name']: field.attrib['value']
                for field in acct.findall('field')}
    acctnum, balance = info['AccountNumber'], info['Balance']
    print(acctnum, balance)

Result:

1234567 $200.00
8901234 $100.00

Answer 3: (score: 0)

  

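Question: How can I access these values and assign them to variables (based on name)?

Convert all accounts into a Dict[AccountNumber] of Dict[field], where each field's name attribute becomes the dict key: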

# root is assumed to be the parsed <Accounts> element, e.g. et.fromstring(data)
Accounts = {}
for account in root.findall('Account'):
    fields = {}
    for field in account.findall('field'):
        fields[field.attrib['name']] = field.attrib['value']

    print('{a[AccountNumber]} {a[Balance]}'.format(a=fields))
    Accounts[fields['AccountNumber']] = fields

print(Accounts)
  

Output:

1234567 $200.00
9999999 $300.00
{'9999999': {'AccountNumber': '9999999', 'Balance': '$300.00'}, '1234567': {'AccountNumber': '1234567', 'Balance': '$200.00'}}

Tested with Python: 3.4.2
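
Once the nested dict is built, a single account can be looked up directly by its number. A short usage sketch, assuming the same two-account sample data as in the output above (the lowercase names accounts, balance and missing are illustrative):

```python
import xml.etree.ElementTree as et

# Illustrative sample data matching the output shown above.
data = """\
<Accounts>
  <Account>
    <field name='AccountNumber' value='1234567' />
    <field name='Balance' value='$200.00' />
  </Account>
  <Account>
    <field name='AccountNumber' value='9999999' />
    <field name='Balance' value='$300.00' />
  </Account>
</Accounts>"""

root = et.fromstring(data)

# Map each account number to that account's {name: value} dict.
accounts = {}
for account in root.findall('Account'):
    fields = {f.attrib['name']: f.attrib['value']
              for f in account.findall('field')}
    accounts[fields['AccountNumber']] = fields

# Look up one account by number.
balance = accounts['1234567']['Balance']
print(balance)  # prints: $200.00

# dict.get avoids a KeyError for an unknown account number.
missing = accounts.get('0000000')
print(missing)  # prints: None
```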