I am trying to extract some data from a file:
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<d2LogicalModel xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datex2.eu/schema/2/2_0" modelBaseVersion="2">
<exchange>
<supplierIdentification>
<country>nl</country>
<nationalIdentifier>NDW-CNS</nationalIdentifier>
</supplierIdentification>
</exchange>
<payloadPublication xsi:type="MeasuredDataPublication" lang="nl">
<publicationTime>2014-12-04T06:59:55.000Z</publicationTime>
<publicationCreator>
<country>nl</country>
<nationalIdentifier>NDW-CNS</nationalIdentifier>
</publicationCreator>
<measurementSiteTableReference id="NDW01_MT" version="662" targetClass="MeasurementSiteTable"/>
<headerInformation>
<confidentiality>noRestriction</confidentiality>
<informationStatus>real</informationStatus>
</headerInformation>
<siteMeasurements>
<measurementSiteReference id="GEO03_D4T-RWS_T_0317_ID_324" version="3" targetClass="MeasurementSiteRecord"/>
<measurementTimeDefault>2014-12-04T06:58:00Z</measurementTimeDefault>
<measuredValue index="1">
<measuredValue>
<basicData xsi:type="TravelTimeData">
<travelTimeType>best</travelTimeType>
<travelTime numberOfInputValuesUsed="100" standardDeviation="7">
<duration>34</duration>
</travelTime>
</basicData>
</measuredValue>
</measuredValue>
</siteMeasurements>
<siteMeasurements>
<measurementSiteReference id="GEO01_Z_RWSTRN054" version="1" targetClass="MeasurementSiteRecord"/>
<measurementTimeDefault>2014-12-04T06:58:00Z</measurementTimeDefault>
<measuredValue index="1" xsi:type="_SiteMeasurementsIndexMeasuredValue">
<measuredValue xsi:type="MeasuredValue">
<basicData xsi:type="TravelTimeData">
<travelTimeType>best</travelTimeType>
<travelTime numberOfIncompleteInputs="0" numberOfInputValuesUsed="7" standardDeviation="0.71" supplierCalculatedDataQuality="100.0">
<duration>56</duration>
</travelTime>
</basicData>
</measuredValue>
</measuredValue>
</siteMeasurements>
.
.
.
.
.
<siteMeasurements>
<measurementSiteReference id="RWS01_MONIBAS_0091hrr0350ra0" version="1" targetClass="MeasurementSiteRecord"/>
<measurementTimeDefault>2014-12-04T06:58:00Z</measurementTimeDefault>
<measuredValue index="1" xsi:type="_SiteMeasurementsIndexMeasuredValue">
<measuredValue xsi:type="MeasuredValue">
<basicData xsi:type="TravelTimeData">
<travelTimeType>best</travelTimeType>
<travelTime numberOfIncompleteInputs="0">
<duration>23</duration>
</travelTime>
</basicData>
</measuredValue>
</measuredValue>
</siteMeasurements>
</payloadPublication>
</d2LogicalModel>
</soap:Body>
What I want to do, using Python, is to extract from every block like this one:
<siteMeasurements>
<measurementSiteReference id="RWS01_MONIBAS_0091hrr0350ra0" version="1" targetClass="MeasurementSiteRecord"/>
<measurementTimeDefault>2014-12-04T06:58:00Z</measurementTimeDefault>
<measuredValue index="1" xsi:type="_SiteMeasurementsIndexMeasuredValue">
<measuredValue xsi:type="MeasuredValue">
<basicData xsi:type="TravelTimeData">
<travelTimeType>best</travelTimeType>
<travelTime numberOfIncompleteInputs="0">
<duration>23</duration>
</travelTime>
</basicData>
</measuredValue>
</measuredValue>
</siteMeasurements>
the value of the 'id' attribute of 'measurementSiteReference' and the text content of 'duration'.
I am using Python. My code so far:
import xml.etree.ElementTree as ET

tree = ET.ElementTree(file='track.xml')
root = tree.getroot()
for elem in tree.iter():
    print(elem.tag, elem.attrib)
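But I am having trouble extracting these values. I have no experience with Python.
How do I iterate over 'siteMeasurements' and get the value of the 'id' attribute of measurementSiteReference and the text content of 'duration'?
Please give me some suggestions to get me started.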
Answer 0 (score: 1)
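You are probably missing the closing </soap:Envelope> tag at the bottom of the file, or perhaps you just did not paste it.
In any case, after adding that tag and putting the following XML declaration at the top (line 1), I was able to run it.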
<?xml version="1.0" encoding="UTF-8"?>
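To confirm that a file is well-formed before going any further, a small check along these lines may help. This is only a sketch: the helper name `is_well_formed` and the two inline strings are made up for illustration, with the first string mimicking an Envelope that is opened but never closed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative input: the Envelope element is opened but never closed.
truncated = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body></soap:Body>'
)
# Appending the missing closing tag makes the document parse.
repaired = truncated + '</soap:Envelope>'

print(is_well_formed(truncated))  # False
print(is_well_formed(repaired))   # True
```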
First, we need to figure out which elements we can work with.
>>> for i in root.iter():
...     print(i)
This lists the following (truncated):
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x29e4170>
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Body' at 0x29e4190>
|
|
<Element '{http://datex2.eu/schema/2/2_0}measurementSiteTableReference' at 0x29e4510>
|
|
<Element '{http://datex2.eu/schema/2/2_0}duration' at 0x29e4750>
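The `{...}` part in front of each tag is the namespace URI, which ElementTree folds into the tag name. In the sample it comes from the default namespace declared as `xmlns="http://datex2.eu/schema/2/2_0"`. As a rough sketch, the namespace can be split off to see the plain element names; `split_tag` is a hypothetical helper and the inline sample is a heavily trimmed stand-in for the real file.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split '{uri}local' into (uri, local); uri is '' when there is no namespace."""
    if tag.startswith('{'):
        uri, _, local = tag[1:].partition('}')
        return uri, local
    return '', tag


# Trimmed, illustrative sample using the same default namespace as the question.
sample = (
    '<d2LogicalModel xmlns="http://datex2.eu/schema/2/2_0">'
    '<siteMeasurements><duration>34</duration></siteMeasurements>'
    '</d2LogicalModel>'
)
root = ET.fromstring(sample)

# iter() visits the root first, then its descendants in document order.
local_names = [split_tag(elem.tag)[1] for elem in root.iter()]
print(local_names)  # ['d2LogicalModel', 'siteMeasurements', 'duration']
```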
Once we know the element names, we simply iterate over the ones we need and read their key/value pairs.
Code
import xml.etree.ElementTree as ET

data_file = 'soapData2.xml'
tree = ET.parse(data_file)
root = tree.getroot()

# Fully qualified tag names: '{namespace URI}local name'
t1 = "{http://datex2.eu/schema/2/2_0}measurementSiteReference"
t2 = "{http://datex2.eu/schema/2/2_0}duration"

print("measurementSiteReference", ":", "duration")
# zip pairs the n-th measurementSiteReference with the n-th duration
for e1, e2 in zip(root.iter(t1), root.iter(t2)):
    print(e1.attrib['id'], ":", e2.text)
Result
>>>
measurementSiteReference : duration
GEO03_D4T-RWS_T_0317_ID_324 : 34
GEO01_Z_RWSTRN054 : 56
RWS01_MONIBAS_0091hrr0350ra0 : 23
>>>
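One caveat: `zip` pairs the two iterators purely by position, so a site that had no `duration` would shift every later pair. A possible alternative, sketched here under some assumptions, is to walk each `siteMeasurements` element and look up both values inside it. The prefix `d` in the namespace map and the name `site_durations` are arbitrary choices for illustration, and the second site in the inline sample (one without a `duration`) is hypothetical, added only to show the behaviour.

```python
import xml.etree.ElementTree as ET

# The prefix 'd' is an arbitrary label; only the URI has to match the file.
NS = {'d': 'http://datex2.eu/schema/2/2_0'}


def site_durations(root):
    """Yield (site id, duration text) per siteMeasurements; None where missing."""
    for site in root.iterfind('.//d:siteMeasurements', NS):
        ref = site.find('d:measurementSiteReference', NS)
        duration = site.find('.//d:duration', NS)  # nested several levels down
        yield (
            ref.get('id') if ref is not None else None,
            duration.text if duration is not None else None,
        )


# Trimmed sample; the second site deliberately has no duration (hypothetical).
sample = """<d2LogicalModel xmlns="http://datex2.eu/schema/2/2_0">
  <siteMeasurements>
    <measurementSiteReference id="GEO03_D4T-RWS_T_0317_ID_324"/>
    <measuredValue><measuredValue><basicData><travelTime>
      <duration>34</duration>
    </travelTime></basicData></measuredValue></measuredValue>
  </siteMeasurements>
  <siteMeasurements>
    <measurementSiteReference id="GEO01_Z_RWSTRN054"/>
  </siteMeasurements>
</d2LogicalModel>"""

pairs = list(site_durations(ET.fromstring(sample)))
for site_id, duration in pairs:
    print(site_id, ":", duration)
```

Because each lookup is scoped to one `siteMeasurements` element, an id can never be matched with a duration from a different site. For a real file, `ET.parse(data_file).getroot()` would take the place of `ET.fromstring(sample)`.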