I'm having trouble extracting XML tag values

Asked: 2019-05-04 14:14:44

Tags: python xml

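Problem statement: I am working with an API. I have saved its response data to an XML file and now want to extract data from it. The file is large and contains many data tags, but I only need some of the data, which I want to write to a JSON file for the project I'm working on.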

A sample XML response:

<parkingProbabilities
    xmlns:meta="http://www.tomtom.com/service/tis/parkingprobabilities/metadata/1.1"
                      schemaVersion="1.1">
    <meta:metaData>
        <meta:creatorUUID>aaac93fc-ba74-102b-b5ef-00304891a58c</meta:creatorUUID>
        <meta:creationTimeUTC>2016-09-30T19:58:01</meta:creationTimeUTC>
        <meta:timeZone>Europe/Berlin</meta:timeZone>
        <meta:cityName>Berlin</meta:cityName>
        <meta:countryCode>DE</meta:countryCode>
        <meta:description>Example showing parking probability and search time profile</meta:description>
    </meta:metaData>
    <roadSegment>
        <uuid>00000000-069f-6d7a-017f-78b7f701185b</uuid>
        <parkingDataProfile>
            <dailyProfile>
                <weekdays>
                    <day>MON</day>
                    <day>TUE</day>
                    <day>WED</day>
                    <day>THU</day>
                    <day>FRI</day>
                </weekdays>
                <hourlyData>
                    <hourOfDay>0</hourOfDay>
                    <parkingProbability>0.10</parkingProbability>
                    <averageSearchTime>12</averageSearchTime>
                </hourlyData>
                <hourlyData>
                    <hourOfDay>1</hourOfDay>
                    <parkingProbability>0.10</parkingProbability>
                    <averageSearchTime>11</averageSearchTime>
                </hourlyData>
                <hourlyData>
                    <hourOfDay>2</hourOfDay>
                    <parkingProbability>0.10</parkingProbability>
                    <averageSearchTime>10</averageSearchTime>
                </hourlyData>
                <!-- usually contains more -->
                <!-- some time slots could be missing -->
                <hourlyData>
                    <hourOfDay>23</hourOfDay>
                    <parkingProbability>0.10</parkingProbability>
                    <averageSearchTime>9</averageSearchTime>
                </hourlyData>
            </dailyProfile>
            <!-- could contain more -->
        </parkingDataProfile>
     </roadSegment>
     <!-- many more -->
</parkingProbabilities>

Expected output:

The values of every hourlyData tag inside each dailyProfile node.

Code I have tried so far:

from xml.dom import minidom
mydoc = minidom.parse('data_file.xml')

hourly_data = mydoc.getElementsByTagName("hourlyData")
for data in hourly_data:
    print(data.nodeValue)

Sorry, I am getting an unusual result.

Output: I get None printed on the screen.

1 Answer:

Answer 0: (score: 2)

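Try this. In minidom, nodeValue is None for element nodes, and each hourlyData element holds only child elements. You need to get the actual child nodes and read their text nodes to get the data.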

from xml.dom import minidom
mydoc = minidom.parse('data_file.xml')

# Each <hourlyData> element contains only child elements, so read those.
hourly_data = mydoc.getElementsByTagName("hourlyData")
for data in hourly_data:
    # childNodes[0] is the text node inside the child element; .data is its string.
    print(data.getElementsByTagName("parkingProbability")[0].childNodes[0].data)
    print(data.getElementsByTagName("averageSearchTime")[0].childNodes[0].data)
    print(data.getElementsByTagName("hourOfDay")[0].childNodes[0].data)
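
Since the question also asks for a JSON file, here is a minimal sketch of how the same minidom lookups could be grouped per dailyProfile and serialized. It is only an illustration under some assumptions: SAMPLE_XML is a trimmed-down stand-in for the real response, the helper names child_text, child_number and profiles_to_dicts are made up for this example, and the output layout (a "days" list plus an "hourlyData" list per profile) is one possible choice, not something the API prescribes. Treating hourOfDay as an integer and the other two values as floats is also a guess based on the sample.

```python
import json
from xml.dom import minidom

# Hypothetical, trimmed-down sample in the shape of the question's XML.
SAMPLE_XML = """<parkingProbabilities schemaVersion="1.1">
  <roadSegment>
    <uuid>00000000-069f-6d7a-017f-78b7f701185b</uuid>
    <parkingDataProfile>
      <dailyProfile>
        <weekdays><day>MON</day><day>TUE</day></weekdays>
        <hourlyData>
          <hourOfDay>0</hourOfDay>
          <parkingProbability>0.10</parkingProbability>
          <averageSearchTime>12</averageSearchTime>
        </hourlyData>
        <hourlyData>
          <hourOfDay>1</hourOfDay>
          <parkingProbability>0.10</parkingProbability>
          <averageSearchTime>11</averageSearchTime>
        </hourlyData>
      </dailyProfile>
    </parkingDataProfile>
  </roadSegment>
</parkingProbabilities>"""


def child_text(parent, tag):
    """Return the text of the first <tag> under parent, or None if absent or empty."""
    matches = parent.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return None
    return matches[0].firstChild.data.strip()


def child_number(parent, tag, cast):
    """Like child_text, but converted with cast (int or float) when present."""
    text = child_text(parent, tag)
    return cast(text) if text is not None else None


def profiles_to_dicts(doc):
    """Build one dict per <dailyProfile> with its days and hourly rows."""
    profiles = []
    for profile in doc.getElementsByTagName("dailyProfile"):
        days = [
            day.firstChild.data.strip()
            for day in profile.getElementsByTagName("day")
            if day.firstChild is not None
        ]
        hours = [
            {
                "hourOfDay": child_number(hourly, "hourOfDay", int),
                "parkingProbability": child_number(hourly, "parkingProbability", float),
                "averageSearchTime": child_number(hourly, "averageSearchTime", float),
            }
            for hourly in profile.getElementsByTagName("hourlyData")
        ]
        profiles.append({"days": days, "hourlyData": hours})
    return profiles


# For the real file, minidom.parse('data_file.xml') would replace parseString.
doc = minidom.parseString(SAMPLE_XML)
result = profiles_to_dicts(doc)
print(json.dumps(result, indent=2))
```

To write a file instead of printing, json.dump(result, f, indent=2) on a file opened for writing should work the same way. Hours that are missing from the XML simply do not appear in the list, which matches the "some time slots could be missing" comment in the sample.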