Saving an XML response in MySQL using Python

Time: 2019-03-26 17:49:38

Tags: python xml python-3.x xml-parsing

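I'm trying to save an XML response into a MySQL database using Python, but I'm a bit stuck, even after a lot of research online. My Python knowledge is still limited, so I'm not sure why I'm getting strange results.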

<site siteID="0404">
        <date dateValue="20190322">
            <traffic code="01" exits="0" enters="0" startTime="000000"/>
            <traffic code="01" exits="0" enters="0" startTime="010000"/>
            <traffic code="01" exits="0" enters="0" startTime="020000"/>
            <traffic code="01" exits="0" enters="0" startTime="030000"/>
            <traffic code="01" exits="0" enters="0" startTime="040000"/>
            <traffic code="01" exits="0" enters="0" startTime="050000"/>
            <traffic code="01" exits="0" enters="0" startTime="060000"/>
            <traffic code="01" exits="0" enters="0" startTime="070000"/>
            <traffic code="01" exits="0" enters="0" startTime="080000"/>
            <traffic code="01" exits="1" enters="2" startTime="090000"/>
            <traffic code="01" exits="17" enters="21" startTime="100000"/>
            <traffic code="01" exits="18" enters="16" startTime="110000"/>
            <traffic code="01" exits="20" enters="26" startTime="120000"/>
            <traffic code="01" exits="23" enters="25" startTime="130000"/>
            <traffic code="01" exits="13" enters="18" startTime="140000"/>
            <traffic code="01" exits="22" enters="21" startTime="150000"/>
            <traffic code="01" exits="26" enters="23" startTime="160000"/>
            <traffic code="01" exits="23" enters="22" startTime="170000"/>
            <traffic code="01" exits="21" enters="19" startTime="180000"/>
            <traffic code="01" exits="30" enters="35" startTime="190000"/>
            <traffic code="01" exits="9" enters="9" startTime="200000"/>
            <traffic code="01" exits="0" enters="0" startTime="210000"/>
            <traffic code="01" exits="0" enters="0" startTime="220000"/>
            <traffic code="01" exits="0" enters="0" startTime="230000"/>
        </date>
    </site>
    <site siteID="0406">
        <date dateValue="20190322">
            <traffic code="01" exits="0" enters="0" startTime="000000"/>
            <traffic code="01" exits="0" enters="0" startTime="010000"/>
            <traffic code="01" exits="0" enters="0" startTime="020000"/>
            <traffic code="01" exits="0" enters="0" startTime="030000"/>
            <traffic code="01" exits="0" enters="0" startTime="040000"/>
            <traffic code="01" exits="0" enters="0" startTime="050000"/>
            <traffic code="01" exits="0" enters="0" startTime="060000"/>
            <traffic code="01" exits="0" enters="0" startTime="070000"/>
            <traffic code="01" exits="0" enters="0" startTime="080000"/>
            <traffic code="01" exits="5" enters="8" startTime="090000"/>
            <traffic code="01" exits="24" enters="27" startTime="100000"/>
            <traffic code="01" exits="34" enters="35" startTime="110000"/>
            <traffic code="01" exits="22" enters="21" startTime="120000"/>
            <traffic code="01" exits="13" enters="12" startTime="130000"/>
            <traffic code="01" exits="40" enters="43" startTime="140000"/>
            <traffic code="01" exits="21" enters="15" startTime="150000"/>
            <traffic code="01" exits="18" enters="21" startTime="160000"/>
            <traffic code="01" exits="12" enters="11" startTime="170000"/>
            <traffic code="01" exits="12" enters="6" startTime="180000"/>
            <traffic code="01" exits="5" enters="7" startTime="190000"/>
            <traffic code="01" exits="6" enters="2" startTime="200000"/>
            <traffic code="01" exits="0" enters="0" startTime="210000"/>
            <traffic code="01" exits="0" enters="0" startTime="220000"/>
            <traffic code="01" exits="0" enters="0" startTime="230000"/>
        </date>
    </site>

The desired result is:

siteID  dateValue   exits   enters  startTime
404     20190322     0        0      0000
404     20190322     0        0      10000
404     20190322     0        0      20000
404     20190322     0        0      30000
404     20190322     1        2      90000

What I started with:

import xml.etree.ElementTree as ET

parsedXML = ET.parse('File/demo.xml')
root = parsedXML.getroot()

for demoxml in root.findall('site'):
    store = demoxml.get('siteID')
    date = demoxml.find('date')
    trafiic = demoxml.find('traffic')
    print(store, date, trafiic)

What I got (the result):

0404 <Element 'date' at 0x00000194EBE12D18> None
0406 <Element 'date' at 0x00000194EBE15598> None
100 <Element 'date' at 0x00000194EBE15DB8> None
101 <Element 'date' at 0x00000194EBE1A638> None
102 <Element 'date' at 0x00000194EBE1AE58> None
105 <Element 'date' at 0x00000194EBE1E6D8> None
106 <Element 'date' at 0x00000194EBE1EEF8> None
200 <Element 'date' at 0x00000194EBE23778> None
201 <Element 'date' at 0x00000194EBE23F98> None
203 <Element 'date' at 0x00000194EBE26818> None
205 <Element 'date' at 0x00000194EBE26DB8> None
206 <Element 'date' at 0x00000194EBE06638> None
301 <Element 'date' at 0x00000194EBE06E58> None
302 <Element 'date' at 0x00000194EBE2E6D8> None
303 <Element 'date' at 0x00000194EBE2EEF8> None
305 <Element 'date' at 0x00000194EBE52778> None

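Can anyone tell me what I'm doing wrong and how to get the result I want? I think the problem is with elements versus attributes, but I'm not sure.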

Many thanks

1 answer:

Answer 0 (score: 0)

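The XML parser returns nested elements, so the final working code below uses three for loops (to reach the site, date, and traffic levels). After wrapping the sample data in html tags, you can see the hierarchy by printing root, root[0], root[0][0] and list(root[0][0]), which give the html, site, date, and traffic elements respectively. The get method then returns the value of the named attribute. Alternatively, you can use indexing together with the attrib dictionary. For example, root[0].attrib['siteID'] gives the siteID value "0404", root[1].attrib['siteID'] gives "0406", and the remaining data can be reached the same way.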
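As a small illustration of the hierarchy described above, the sketch below parses a shortened inline sample instead of File/demo.xml. The SAMPLE string and its wrapping html root element are assumptions made for illustration only. It also shows why demoxml.find('traffic') in the question returned None: find only looks at direct children, and traffic is a grandchild of site.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample; the <html> wrapper is an assumed root element.
SAMPLE = """
<html>
    <site siteID="0404">
        <date dateValue="20190322">
            <traffic code="01" exits="0" enters="0" startTime="000000"/>
            <traffic code="01" exits="1" enters="2" startTime="090000"/>
        </date>
    </site>
    <site siteID="0406">
        <date dateValue="20190322">
            <traffic code="01" exits="5" enters="8" startTime="090000"/>
        </date>
    </site>
</html>
"""

root = ET.fromstring(SAMPLE)

# Walk down the hierarchy one level at a time.
print(root.tag)                              # the root element
print(root[0].tag)                           # the first site element
print(root[0][0].tag)                        # its date element
print([child.tag for child in root[0][0]])   # the traffic elements under that date

site = root[0]
print(site.attrib['siteID'])   # equivalent to site.get('siteID')
print(site.find('traffic'))    # None: find() only searches direct children
print(site.find('date/traffic').get('startTime'))  # a path reaches the grandchild
```

Passing a path such as 'date/traffic' to find or findall is an alternative to nesting loops when only the innermost elements are needed.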

import xml.etree.ElementTree as ET

parsedXML = ET.parse('File/demo.xml')
root = parsedXML.getroot()

print('siteID \t dateValue \t exits \t enters \t startTime')

for demoxml in root.findall('site'):
    siteID = demoxml.get('siteID')
    for date in demoxml:
        dateValue = date.get('dateValue')
        for item in date:
            exits = item.get('exits')
            enters = item.get('enters')
            startTime = item.get('startTime')
            print('{} \t {} \t {} \t {} \t\t {}'.format(siteID, dateValue, exits, enters, startTime))
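The code above only prints the values, while the question asks about saving them to MySQL. A minimal sketch of that last step follows, under several assumptions: the table name traffic and its column names are hypothetical, the numeric attributes are converted with int so that "0404" becomes 404 as in most of the desired output, and conn is any DB-API connection whose driver uses %s placeholders (PyMySQL and mysql-connector-python both do). Connection setup is left out because it depends on the environment.

```python
import xml.etree.ElementTree as ET

# Hypothetical table and column names; adjust them to the real schema.
INSERT_SQL = (
    "INSERT INTO traffic (siteID, dateValue, exits, enters, startTime) "
    "VALUES (%s, %s, %s, %s, %s)"
)


def extract_rows(root):
    """Flatten site/date/traffic elements into one tuple per traffic element."""
    rows = []
    for site in root.findall('site'):
        for date in site.findall('date'):
            for item in date.findall('traffic'):
                rows.append((
                    int(site.get('siteID')),     # "0404" -> 404
                    date.get('dateValue'),       # kept as a string, e.g. "20190322"
                    int(item.get('exits')),
                    int(item.get('enters')),
                    int(item.get('startTime')),  # "090000" -> 90000
                ))
    return rows


def save_rows(conn, rows):
    """Insert all rows through a DB-API connection that uses %s placeholders."""
    cursor = conn.cursor()
    try:
        # Parameterized query: the driver handles quoting of the values.
        cursor.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        cursor.close()
```

With root obtained as in the answer, this would be called as save_rows(conn, extract_rows(root)). Collecting the rows first and inserting them with a single executemany call keeps the parsing separate from the database work, so the parsing can be checked on its own with print before anything is written.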