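OK, I have some XML files that look like this. They are actually LandXML files, but they work like any other XML. I need to parse them with Python (currently I'm using the ElementTree lib) to loop through the children of <CoordGeom></CoordGeom>
and build lists depending on the child type: lineList, spiralList and curveList. Each list should contain the attribute values (the names are not needed) as one nested list per object. Something like this: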
lineList=[[10.014571204947,209.285340662374,776.719431311241 -399.949629732524,813.113864060552 -193.853052659974],[287.308329990254,277.363320844698,558.639337133827 380.929458057393,293.835705515579 463.448840215686]]
Here is the sample XML:
<?xml version="1.0"?>
<LandXML xmlns="http://www.landxml.org/schema/LandXML-1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.landxml.org/schema/LandXML-1.2 http://www.landxml.org/schema/LandXML-1.2/LandXML-1.2.xsd" date="2015-05-15" time="09:47:43" version="1.2" language="English" readOnly="false">
<Units>
<Metric areaUnit="squareMeter" linearUnit="meter" volumeUnit="cubicMeter" temperatureUnit="celsius" pressureUnit="milliBars" diameterUnit="millimeter" angularUnit="decimal degrees" directionUnit="decimal degrees"></Metric>
</Units>
<Project name="C:\Users\Rade\Downloads\Situacija i profili.dwg"></Project>
<Application name="AutoCAD Civil 3D" desc="Civil 3D" manufacturer="Autodesk, Inc." version="2014" manufacturerURL="www.autodesk.com/civil" timeStamp="2015-05-15T09:47:43"></Application>
<Alignments name="">
<Alignment name="Proba" length="1201.057158008475" staStart="0." desc="">
<CoordGeom>
<Line dir="10.014571204947" length="209.285340662374">
<Start>776.719431311241 -399.949629732524</Start>
<End>813.113864060552 -193.853052659974</End>
</Line>
<Spiral length="435.309621307305" radiusEnd="300." radiusStart="INF" rot="cw" spiType="clothoid" theta="41.569006803911" totalY="101.382259815422" totalX="412.947724836996" tanLong="298.633648469722" tanShort="152.794210168398">
<Start>813.113864060552 -193.853052659974</Start>
<PI>865.04584458778 100.230482065888</PI>
<End>785.087350093002 230.433054310499</End>
</Spiral>
<Curve rot="cw" chord="150.078507004323" crvType="arc" delta="28.970510103309" dirEnd="299.475054297727" dirStart="328.445564401036" external="9.849481983234" length="151.689236185509" midOrd="9.536387074322" radius="300." tangent="77.502912753511">
<Start>785.087350093002 230.433054310499</Start>
<Center>529.44434090873 73.440532656728</Center>
<End>677.05771309169 334.61153478517</End>
<PI>744.529424397382 296.476647100012</PI>
</Curve>
<Spiral length="127.409639008589" radiusEnd="INF" radiusStart="300." rot="cw" spiType="clothoid" theta="12.166724307463" totalY="8.989447716697" totalX="126.8363181841" tanLong="85.141254974713" tanShort="42.653117896421">
<Start>677.05771309169 334.61153478517</Start>
<PI>639.925187941987 355.598770007863</PI>
<End>558.639337133827 380.929458057393</End>
</Spiral>
<Line dir="287.308329990254" length="277.363320844698">
<Start>558.639337133827 380.929458057393</Start>
<End>293.835705515579 463.448840215686</End>
</Line>
</CoordGeom>
<Profile name="Proba">
<ProfAlign name="Niveleta">
<PVI>0. 329.48636525895</PVI>
<CircCurve length="69.993187715052" radius="5000.">512.581836381869 330.511528931714</CircCurve>
<CircCurve length="39.994027682446" radius="5000.">948.834372016349 337.491569501865</CircCurve>
<PVI>1201.057158008475 339.509351789802</PVI>
</ProfAlign>
</Profile>
</Alignment>
</Alignments>
</LandXML>
Answer 0 (score: 1)
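This sample is far from production-ready, but it should contain everything you need to implement a full solution. It only looks for Line elements, it doesn't know which attributes to read from the other kinds of geometry, and it has no error handling of any kind.
It ain't pretty.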
from lxml import etree

# Default namespace declared on the <LandXML> root, in the {uri} form that lxml uses in tag names
NS = '{http://www.landxml.org/schema/LandXML-1.2}'

doc = etree.parse('/tmp/stuff.xml')
# find() returns only the first <CoordGeom> in the document
geometry = doc.find('.//' + NS + 'CoordGeom')
line_list = []
for child in geometry:
    if child.tag == NS + 'Line':
        # Attribute values first, then the Start and End coordinates
        child_list = [float(child.attrib['dir']), float(child.attrib['length'])]
        start, end = child.find(NS + 'Start'), child.find(NS + 'End')
        child_list.extend(float(coord) for coord in start.text.split())
        child_list.extend(float(coord) for coord in end.text.split())
        line_list.append(child_list)
print(line_list)
If you want to extend this example, give the lxml tutorial a thorough read-through. Everything I've used here is covered in the tutorial.
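As a rough idea of how the extension to Spiral and Curve might look, here is a minimal sketch that uses the standard-library ElementTree mentioned in the question instead of lxml. It does not hard-code attribute names: it takes every attribute value in document order, followed by the coordinates of every child point. The helpers to_number and collect_geometry and the shortened SAMPLE document are made up for illustration, and keeping non-numeric values such as rot="cw" as strings is an assumption about what the lists should hold.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Default namespace of LandXML 1.2, in the {uri} form ElementTree uses in tag names
NS = "{http://www.landxml.org/schema/LandXML-1.2}"


def to_number(text):
    """Return float(text) when possible, otherwise the original string."""
    try:
        return float(text)
    except ValueError:
        return text


def collect_geometry(root):
    """Group the children of every <CoordGeom> by tag name (hypothetical helper)."""
    result = defaultdict(list)
    for geometry in root.iter(NS + "CoordGeom"):
        for child in geometry:
            kind = child.tag.replace(NS, "")  # e.g. "Line", "Spiral", "Curve"
            # Attribute values in document order, names dropped
            values = [to_number(v) for v in child.attrib.values()]
            # Child points (Start, End, PI, Center, ...) hold "x y" text
            for point in child:
                values.extend(float(c) for c in (point.text or "").split())
            result[kind].append(values)
    return dict(result)


# Shortened, made-up document with the same structure as the question's file
SAMPLE = """<LandXML xmlns="http://www.landxml.org/schema/LandXML-1.2">
  <Alignments><Alignment name="Proba"><CoordGeom>
    <Line dir="10.5" length="209.25">
      <Start>776.5 -399.75</Start>
      <End>813.25 -193.5</End>
    </Line>
    <Curve rot="cw" radius="300.">
      <Start>785.0 230.5</Start>
      <Center>529.5 73.25</Center>
      <End>677.0 334.5</End>
    </Curve>
  </CoordGeom></Alignment></Alignments>
</LandXML>"""

geometry = collect_geometry(ET.fromstring(SAMPLE))
line_list = geometry.get("Line", [])
curve_list = geometry.get("Curve", [])
spiral_list = geometry.get("Spiral", [])  # empty here, SAMPLE has no Spiral
print(line_list)
print(curve_list)
```

For a real file you would pass ET.parse(path).getroot() instead of ET.fromstring(SAMPLE). Note that Python's float() accepts "INF", so radiusStart="INF" would come out as a float infinity rather than a string; if that is not what you want, special-case it in to_number.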