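I am trying to save an XML response to a MySQL database using Python, but I am stuck even after a lot of searching online. My Python knowledge is still limited, so I am not sure why I am getting strange results.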
<site siteID="0404">
<date dateValue="20190322">
<traffic code="01" exits="0" enters="0" startTime="000000"/>
<traffic code="01" exits="0" enters="0" startTime="010000"/>
<traffic code="01" exits="0" enters="0" startTime="020000"/>
<traffic code="01" exits="0" enters="0" startTime="030000"/>
<traffic code="01" exits="0" enters="0" startTime="040000"/>
<traffic code="01" exits="0" enters="0" startTime="050000"/>
<traffic code="01" exits="0" enters="0" startTime="060000"/>
<traffic code="01" exits="0" enters="0" startTime="070000"/>
<traffic code="01" exits="0" enters="0" startTime="080000"/>
<traffic code="01" exits="1" enters="2" startTime="090000"/>
<traffic code="01" exits="17" enters="21" startTime="100000"/>
<traffic code="01" exits="18" enters="16" startTime="110000"/>
<traffic code="01" exits="20" enters="26" startTime="120000"/>
<traffic code="01" exits="23" enters="25" startTime="130000"/>
<traffic code="01" exits="13" enters="18" startTime="140000"/>
<traffic code="01" exits="22" enters="21" startTime="150000"/>
<traffic code="01" exits="26" enters="23" startTime="160000"/>
<traffic code="01" exits="23" enters="22" startTime="170000"/>
<traffic code="01" exits="21" enters="19" startTime="180000"/>
<traffic code="01" exits="30" enters="35" startTime="190000"/>
<traffic code="01" exits="9" enters="9" startTime="200000"/>
<traffic code="01" exits="0" enters="0" startTime="210000"/>
<traffic code="01" exits="0" enters="0" startTime="220000"/>
<traffic code="01" exits="0" enters="0" startTime="230000"/>
</date>
</site>
<site siteID="0406">
<date dateValue="20190322">
<traffic code="01" exits="0" enters="0" startTime="000000"/>
<traffic code="01" exits="0" enters="0" startTime="010000"/>
<traffic code="01" exits="0" enters="0" startTime="020000"/>
<traffic code="01" exits="0" enters="0" startTime="030000"/>
<traffic code="01" exits="0" enters="0" startTime="040000"/>
<traffic code="01" exits="0" enters="0" startTime="050000"/>
<traffic code="01" exits="0" enters="0" startTime="060000"/>
<traffic code="01" exits="0" enters="0" startTime="070000"/>
<traffic code="01" exits="0" enters="0" startTime="080000"/>
<traffic code="01" exits="5" enters="8" startTime="090000"/>
<traffic code="01" exits="24" enters="27" startTime="100000"/>
<traffic code="01" exits="34" enters="35" startTime="110000"/>
<traffic code="01" exits="22" enters="21" startTime="120000"/>
<traffic code="01" exits="13" enters="12" startTime="130000"/>
<traffic code="01" exits="40" enters="43" startTime="140000"/>
<traffic code="01" exits="21" enters="15" startTime="150000"/>
<traffic code="01" exits="18" enters="21" startTime="160000"/>
<traffic code="01" exits="12" enters="11" startTime="170000"/>
<traffic code="01" exits="12" enters="6" startTime="180000"/>
<traffic code="01" exits="5" enters="7" startTime="190000"/>
<traffic code="01" exits="6" enters="2" startTime="200000"/>
<traffic code="01" exits="0" enters="0" startTime="210000"/>
<traffic code="01" exits="0" enters="0" startTime="220000"/>
<traffic code="01" exits="0" enters="0" startTime="230000"/>
</date>
</site>
The desired result is:
siteID dateValue exits enters startTime
404 20190322 0 0 0000
404 20190322 0 0 10000
404 20190322 0 0 20000
404 20190322 0 0 30000
404 20190322 1 2 90000
This is what I started with:
import xml.etree.ElementTree as ET
parsedXML = ET.parse('File/demo.xml')
root = parsedXML.getroot()
for demoxml in root.findall('site'):
    store = demoxml.get('siteID')
    date = demoxml.find('date')
    trafiic = demoxml.find('traffic')
    print(store, date, trafiic)
This is the output I get:
0404 <Element 'date' at 0x00000194EBE12D18> None
0406 <Element 'date' at 0x00000194EBE15598> None
100 <Element 'date' at 0x00000194EBE15DB8> None
101 <Element 'date' at 0x00000194EBE1A638> None
102 <Element 'date' at 0x00000194EBE1AE58> None
105 <Element 'date' at 0x00000194EBE1E6D8> None
106 <Element 'date' at 0x00000194EBE1EEF8> None
200 <Element 'date' at 0x00000194EBE23778> None
201 <Element 'date' at 0x00000194EBE23F98> None
203 <Element 'date' at 0x00000194EBE26818> None
205 <Element 'date' at 0x00000194EBE26DB8> None
206 <Element 'date' at 0x00000194EBE06638> None
301 <Element 'date' at 0x00000194EBE06E58> None
302 <Element 'date' at 0x00000194EBE2E6D8> None
303 <Element 'date' at 0x00000194EBE2EEF8> None
305 <Element 'date' at 0x00000194EBE52778> None
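Can anyone tell me what I am doing wrong and how to get the result I want? I think the problem is with elements versus attributes, but I am not sure.
Thanks a lot.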
Answer 0 (score: 0)
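The XML parser returns nested elements, so the working code below uses three for loops to reach the site, date, and traffic levels. Your attempt printed None for traffic because find() only searches direct children, and traffic is a child of date, not of site. After wrapping the sample data in an html root tag, you can see the hierarchy by printing root, root[0], root[0][0], and list(root[0][0]), which give the html, site, date, and traffic elements respectively. The get method then returns the value of the named attribute. Alternatively, you can use indexing together with the attrib dictionary. For example, root[0].attrib['siteID'] returns the siteID value "0404", root[1].attrib['siteID'] returns "0406", and so on for the remaining data.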
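To make the hierarchy concrete, here is a minimal sketch that parses a shortened copy of the sample data from a string instead of a file. The wrapping `<html>` root element and the `sample` variable are assumptions for illustration, since the posted data has no single root element.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample data; the <html> wrapper is an assumed root element.
sample = """
<html>
  <site siteID="0404">
    <date dateValue="20190322">
      <traffic code="01" exits="0" enters="0" startTime="000000"/>
      <traffic code="01" exits="1" enters="2" startTime="090000"/>
    </date>
  </site>
  <site siteID="0406">
    <date dateValue="20190322">
      <traffic code="01" exits="5" enters="8" startTime="090000"/>
    </date>
  </site>
</html>
"""

root = ET.fromstring(sample)

site = root[0]          # first <site> element
date = site[0]          # its <date> child
traffic = list(date)    # the <traffic> children of that <date>

print(root.tag, site.tag, date.tag, [t.tag for t in traffic])
print(site.attrib['siteID'], root[1].get('siteID'))

# find() only looks at direct children, so <traffic> is not visible from <site>.
print(site.find('traffic'))
# A path such as 'date/traffic' reaches one level further down.
print(site.find('date/traffic').get('startTime'))
```

Indexing is convenient for exploring the tree interactively, while findall with a tag name is usually the safer choice in real code because it does not depend on element order.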
import xml.etree.ElementTree as ET
parsedXML = ET.parse('File/demo.xml')
root = parsedXML.getroot()
print('siteID \t dateValue \t exits \t enters \t startTime')
# Level 1: each <site> element under the root
for demoxml in root.findall('site'):
    siteID = demoxml.get('siteID')
    # Level 2: each <date> element inside the site
    for date in demoxml:
        dateValue = date.get('dateValue')
        # Level 3: each <traffic> element inside the date
        for item in date:
            exits = item.get('exits')
            enters = item.get('enters')
            startTime = item.get('startTime')
            print('{} \t {} \t {} \t {} \t\t {}'.format(siteID, dateValue, exits, enters, startTime))
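Since the original goal was to store the rows in MySQL, the loops above can collect tuples instead of printing them. The following is only a sketch under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in so that it runs without a database server, and the `traffic` table, its column names, and the `extract_rows` helper are hypothetical. With a MySQL driver such as mysql-connector-python or PyMySQL the same `executemany` pattern should apply, but the placeholders are `%s` rather than `?`.

```python
import sqlite3
import xml.etree.ElementTree as ET


def extract_rows(root):
    """Return one (siteID, dateValue, exits, enters, startTime) tuple per <traffic>."""
    rows = []
    for site in root.findall('site'):
        for date in site.findall('date'):
            for item in date.findall('traffic'):
                rows.append((
                    site.get('siteID'),
                    date.get('dateValue'),
                    int(item.get('exits')),
                    int(item.get('enters')),
                    item.get('startTime'),
                ))
    return rows


# Tiny inline sample; the <html> wrapper is an assumed root element.
root = ET.fromstring(
    '<html><site siteID="0404"><date dateValue="20190322">'
    '<traffic code="01" exits="1" enters="2" startTime="090000"/>'
    '<traffic code="01" exits="17" enters="21" startTime="100000"/>'
    '</date></site></html>'
)
rows = extract_rows(root)

# sqlite3 stands in for MySQL here; table and column names are hypothetical.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE traffic ('
    'siteID TEXT, dateValue TEXT, exits INTEGER, enters INTEGER, startTime TEXT)'
)
# Parameterized insert: the driver handles quoting, one tuple per row.
conn.executemany('INSERT INTO traffic VALUES (?, ?, ?, ?, ?)', rows)
conn.commit()

stored = conn.execute('SELECT * FROM traffic ORDER BY startTime').fetchall()
print(stored)
conn.close()
```

Keeping siteID and startTime as text preserves leading zeros such as "0404" and "090000"; convert them with int() if the numeric form shown in the desired result is preferred.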