Parsing XML with Python

Date: 2018-02-28 08:52:59

Tags: python xml

The XML file:

<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>

import xml.etree.ElementTree as et
import pandas as pd
import numpy as np

This imports all the libraries.

tree = et.parse("documents/pythonstore.xml")

I put the file in the documents folder.

root = tree.getroot()
for a in range(3):
    for b in range(4):
        new = root[a][b].text
        print(new)

This prints the text of every child element in the XML.
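The loop above hard-codes three products and four fields. As a minimal sketch, assuming the same structure as the XML above, the elements can also be iterated directly, so the counts do not need to be known in advance. The inlined xml_text is a shortened copy of the file, and the names xml_text and texts are only illustrative.

```python
import xml.etree.ElementTree as et

# Shortened copy of the XML above, inlined so the sketch runs on its own.
xml_text = """<?xml version="1.0"?>
<productListing title="Python Products">
  <product id="1">
    <name>Python Hoodie</name>
    <cost>$49.99</cost>
  </product>
  <product id="2">
    <name>Python shirt</name>
    <cost>$79.99</cost>
  </product>
</productListing>"""

root = et.fromstring(xml_text)

texts = []
for product in root:        # iterating an element yields its direct children
    for child in product:   # each field inside one <product>
        texts.append(child.text)
        print(child.tag, child.text)
```

Iterating an Element yields its direct children in document order, so this visits the same elements as root[a][b] without the index ranges.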

df=pd.DataFrame(columns=['name','description','cost','shipping'])

This creates a DataFrame to hold all the child elements from the XML.

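My questions:

  • How do I collect the values of the variable "new" into a list? I tried append and list(), but both failed.
  • How do I use a for loop to load the child elements into the DataFrame?

Can someone please help me? Thank you very much!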

1 answer:

Answer 0 (score: 0)

This might help.

# -*- coding: utf-8 -*-
s = """<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>"""

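import xml.etree.ElementTree as et
import pandas as pd

# fromstring() returns the root element directly, so no getroot() call is needed.
root = et.fromstring(s)

res = []
for a in range(3):        # three <product> elements
    r = []
    for b in range(4):    # four child fields per product
        r.append(root[a][b].text)
    res.append(r)         # one inner list per product

print(res)
df = pd.DataFrame(res, columns=['name', 'description', 'cost', 'shipping'])
print(df)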

Output:

[['Python Hoodie', 'This is a Hoodie', '$49.99', '$2.00'], ['Python shirt', 'This is a shirt', '$79.99', '$4.00'], ['Python cap', 'This is a cap', '$99.99', '$3.00']]

            name       description    cost shipping
0  Python Hoodie  This is a Hoodie  $49.99    $2.00
1   Python shirt   This is a shirt  $79.99    $4.00
2     Python cap     This is a cap  $99.99    $3.00
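A variation on the answer, offered as a sketch rather than a definitive approach: instead of indexing by position, each product can be collected into a dict keyed by tag name, which also keeps the id attribute. The helper names products_to_rows and price_to_float are hypothetical, the inlined XML is a shortened copy of the file above, and converting the prices to floats assumes every price has the "$" prefix shown there.

```python
import xml.etree.ElementTree as et

# Shortened copy of the XML above, inlined so the sketch runs on its own.
xml_text = """<?xml version="1.0"?>
<productListing title="Python Products">
  <product id="1">
    <name>Python Hoodie</name>
    <description>This is a Hoodie</description>
    <cost>$49.99</cost>
    <shipping>$2.00</shipping>
  </product>
  <product id="2">
    <name>Python shirt</name>
    <description>This is a shirt</description>
    <cost>$79.99</cost>
    <shipping>$4.00</shipping>
  </product>
</productListing>"""


def products_to_rows(root):
    """Hypothetical helper: return one dict per <product>, keyed by child tag."""
    rows = []
    for product in root.iter("product"):
        row = {"id": product.get("id")}   # attribute value, kept as a string
        for child in product:
            row[child.tag] = child.text
        rows.append(row)
    return rows


def price_to_float(text):
    """Hypothetical helper: turn a string such as '$49.99' into a float."""
    return float(text.lstrip("$"))


rows = products_to_rows(et.fromstring(xml_text))
for row in rows:
    row["cost"] = price_to_float(row["cost"])
    row["shipping"] = price_to_float(row["shipping"])

print(rows)
```

Because pd.DataFrame accepts a list of dicts, passing rows to pd.DataFrame(rows) yields one column per key without listing the column names by hand.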