Is there a way to create multiple DataFrames from an XML file using ElementTree?

Asked: 2019-11-11 07:32:39

Tags: python xml list elementtree

<root>  
 <person age="18">  
    <name>hzj</name>  
    <sex>man</sex>  
 </person>  
 <person age="19" des="hello">  
    <name>kiki</name>  
    <sex>female</sex>  
 </person>  
</root>
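frames = []  # avoid naming this "list", which would shadow the built-in
# xroot is assumed to be the already-parsed <root> element
for node in xroot.findall('./person'):
    age = node.attrib.get('age')    # age is an attribute of <person>
    name = node.find('name').text   # name and sex are child elements
    sex = node.find('sex').text
    df = pd.DataFrame([{'age': age, 'name': name, 'sex': sex}])
    frames.append(df)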

I want the data for the 18-year-old in one DataFrame and the data for the 19-year-old in another, and then I want to put both DataFrames in a list.

2 Answers:

Answer 0: (score: 0)

Try this. It assumes your XML file is named file.xml.

import pandas as pd
import xml.etree.ElementTree as et

xtree = et.parse("file.xml")
xroot = xtree.getroot()

# Collect one list of row dicts per age value.
rows_by_age = {}
for node in xroot:
    o_age = node.attrib.get("age")
    o_name = node.find("name").text
    o_sex = node.find("sex").text
    row = {"name": o_name, "sex": o_sex}
    rows_by_age.setdefault(f"{o_age}_df", []).append(row)

# Build one DataFrame per age in a single step
# (DataFrame.append was removed in pandas 2.0).
dfar = {key: pd.DataFrame(rows) for key, rows in rows_by_age.items()}

# Write each DataFrame to its own CSV file, e.g. 18_df.csv.
for key, df_of_age in dfar.items():
    df_of_age.to_csv(f"{key}.csv")
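
The answer above writes CSV files, while the question asks for a list of DataFrames. A minimal sketch of that variant follows. The helper name `frames_by_age` and the `SAMPLE_XML` constant are made up for illustration, and the XML is parsed from a string rather than from file.xml so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Sample data copied from the question.
SAMPLE_XML = """<root>
 <person age="18">
    <name>hzj</name>
    <sex>man</sex>
 </person>
 <person age="19" des="hello">
    <name>kiki</name>
    <sex>female</sex>
 </person>
</root>"""


def frames_by_age(xml_text):
    """Hypothetical helper: return a list with one DataFrame per distinct age."""
    root = ET.fromstring(xml_text)
    rows_by_age = {}
    for person in root.findall("person"):
        row = {
            "age": person.get("age"),
            "name": person.findtext("name"),
            "sex": person.findtext("sex"),
        }
        rows_by_age.setdefault(row["age"], []).append(row)
    # Dicts preserve insertion order, so the list follows the order in which
    # each age first appears in the XML.
    return [pd.DataFrame(rows) for rows in rows_by_age.values()]


frames = frames_by_age(SAMPLE_XML)
for df in frames:
    print(df)
```

Note that `age` stays a string here because XML attribute values are text; convert it with `int(...)` if numeric ages are needed.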

Answer 1: (score: 0)

Something like this:

import pandas as pd
import xml.etree.ElementTree as ET

xml = '''<root>  
 <person age="18">  
    <name>hzj</name>  
    <sex>man</sex>  
 </person>  
 <person age="19" des="hello">  
    <name>kiki</name>  
    <sex>female</sex>  
 </person>  
 <person age="19" des="hi">  
    <name>jane</name>  
    <sex>female</sex>  
 </person>   
</root>'''

# Collect the rows for each age first, then build the DataFrames in one step
# (DataFrame.append was removed in pandas 2.0).
rows_by_age = {}

root = ET.fromstring(xml)
for person in root.findall('.//person'):
  age = person.attrib['age']
  row = {'name': person.find('./name').text, 'sex': person.find('./sex').text}
  rows_by_age.setdefault(age, []).append(row)

data_frames = {age: pd.DataFrame(rows) for age, rows in rows_by_age.items()}

for age,df in data_frames.items():
  print('{} --> {}'.format(age,df.head())) 

Output

18 -->   name  sex
0  hzj  man
19 -->    name     sex
0  kiki  female
1  jane  female
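
A different technique from the one in this answer is to load every person into a single DataFrame and let pandas do the split with `groupby`. The following is only a sketch under that assumption. The variable names `people` and `frame_list` are illustrative, and the sample XML repeats the three-person data used in this answer.

```python
import xml.etree.ElementTree as ET

import pandas as pd

xml_text = """<root>
 <person age="18">
    <name>hzj</name>
    <sex>man</sex>
 </person>
 <person age="19" des="hello">
    <name>kiki</name>
    <sex>female</sex>
 </person>
 <person age="19" des="hi">
    <name>jane</name>
    <sex>female</sex>
 </person>
</root>"""

root = ET.fromstring(xml_text)

# One flat table with a row per <person> element.
people = pd.DataFrame(
    [
        {"age": p.get("age"), "name": p.findtext("name"), "sex": p.findtext("sex")}
        for p in root.iter("person")
    ]
)

# groupby yields (age, sub-frame) pairs; sort=False keeps the groups in the
# order in which each age first appears. reset_index renumbers rows from 0.
frame_list = [
    group.reset_index(drop=True)
    for _, group in people.groupby("age", sort=False)
]

for df in frame_list:
    print(df)
```

This produces the list of DataFrames that the question asks for, with both 19-year-olds ending up in the same DataFrame.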