How do I access KML/XML attributes after filtering in Python (lxml)?

Time: 2016-07-26 23:40:55

Tags: python python-2.7 xml-parsing lxml kml

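I have looked around and cannot find a solution to my problem. What I ultimately need is the names of all KML elements whose child polygon contains a given latitude/longitude point.

From what I found, keytree, shapely and lxml let me filter all the KML elements down to the polygons in question and then access their parent nodes. However, whenever I try to access the parent's attributes I keep getting an empty result. This is what I have tried: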

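from lxml import etree
import keytree
from shapely.geometry import Point, shape

def __init__(self):
    # read bytes so lxml handles any XML encoding declaration itself
    root = etree.fromstring(open("Example.kml", "rb").read())
    kmlns = root.tag.split("}")[0][1:]  # namespace URI taken from the root tag
    polygons = root.findall(".//{%s}Polygon" % kmlns)
    p = Point(-128.1605, 52.474)  # this point exists in one of the polygons
    # list() keeps hits indexable on Python 3, where filter() is lazy
    hits = list(filter(
        lambda e: shape(keytree.geometry(e)).contains(p),
        polygons))

    print(hits)
    hit_parent = hits[0].getparent()
    print(hit_parent.attrib)  # this prints {}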

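Using the debugger in PyCharm I was able to find the line the polygon is on: hits[0] has a sourceline attribute, and when I go to that line number in my KML document, the polygon there does contain the point. Scrolling up to the polygon's parent, I can see that it has data on it (i.e. it is not empty). I am new to XML and KML parsing; am I looking in the wrong place? Here is the polygon and its parent from the KML: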

        <Placemark>
            <name>THIS IS THE NAME</name>
            <visibility>0</visibility>
            <styleUrl>#falseColor184010</styleUrl>
            <ExtendedData>
                <SchemaData schemaUrl="#S_AL_TA_BC_2_41_eng_SSSSISSSSSSSSSSSSSSSSSSSSS10">
                    <SimpleData name="ACQTECH">Computed</SimpleData>
                    <SimpleData name="METACOVER">Partial</SimpleData>
                    <SimpleData name="CREDATE">20030416</SimpleData>
                    <SimpleData name="REVDATE">20130504</SimpleData>
                    <SimpleData name="ACCURACY">-1</SimpleData>
                    <SimpleData name="PROVIDER">Federal</SimpleData>
                    <SimpleData name="DATASETNAM">BC</SimpleData>
                    <SimpleData name="SPECVERS">1.1</SimpleData>
                    <SimpleData name="NID">7103157bba3511d892e2080020a0f4c9</SimpleData>
                    <SimpleData name="ALCODE">07876</SimpleData>
                    <SimpleData name="LANGUAGE1">English</SimpleData>
                    <SimpleData name="NAME1">NEEKAS 4</SimpleData>
                    <SimpleData name="LANGUAGE2">French</SimpleData>
                    <SimpleData name="NAME2">NEEKAS NO 4</SimpleData>
                    <SimpleData name="LANGUAGE3">No Language</SimpleData>
                    <SimpleData name="NAME3">NULL</SimpleData>
                    <SimpleData name="LANGUAGE4">No Language</SimpleData>
                    <SimpleData name="NAME4">NULL</SimpleData>
                    <SimpleData name="LANGUAGE5">No Language</SimpleData>
                    <SimpleData name="NAME5">NULL</SimpleData>
                    <SimpleData name="JUR1">BC</SimpleData>
                    <SimpleData name="JUR2"></SimpleData>
                    <SimpleData name="JUR3"></SimpleData>
                    <SimpleData name="JUR4"></SimpleData>
                    <SimpleData name="ALTYPE">Indian Reserve</SimpleData>
                    <SimpleData name="WEBREF">http://clss.nrcan.gc.ca/map-carte/mapbrowser-navigateurcartographique-eng.php?cancode=07876</SimpleData>
                </SchemaData>
            </ExtendedData>
            <Polygon>
                <outerBoundaryIs>
                    <LinearRing>
                        <coordinates>
                            -128.1615722,52.47385589999999,0 -128.1618475,52.47338730000003,0 -128.1623126999999,52.47275560000004,0 -128.1622705,52.47253640000001,0 -128.162017,52.47243320000002,0 -128.1619326,52.4722527,0 -128.1618904,52.4721108,0 -128.161827,52.47202060000003,0 -128.1615523,52.47204629999998,0 -128.1613199,52.47211069999996,0 -128.1607705,52.47205899999999,0 -128.1604538,52.47172369999999,0 -128.1600750999999,52.47149440000001,0 -128.1600821,52.47510580000001,0 -128.1615621,52.47510469999996,0 -128.1615294999999,52.474926,0 -128.1615508,52.47452629999999,0 -128.1615298,52.47416529999997,0 -128.1615722,52.47385589999999,0 
                        </coordinates>
                    </LinearRing>
                </outerBoundaryIs>
            </Polygon>
        </Placemark>

I would like to get "THIS IS THE NAME" from the polygon's parent.

1 Answer:

Answer 0 (score: 1)

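The text you are after is not an attribute of any element. An element's .attrib holds only the XML attributes written inside its start tag, and <Placemark> has none, which is why you get {}. Given <Polygon> as the context element, you want to go up to the parent element <Placemark> and then get its child element <name>. This can be done in one line with an XPath-style path (lxml's find() accepts the ElementPath subset, which includes ".." for the parent):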

....
print(hits)
# "./.." steps up to <Placemark>; "{ns}name" then selects its <name> child
name_elem = hits[0].find("./../{%s}name" % kmlns)
print(name_elem.text)
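
To make the attribute-versus-child-text distinction concrete, here is a minimal, self-contained sketch that needs only lxml. The inline document `SAMPLE_KML` is a trimmed, hypothetical stand-in for Example.kml, the KML 2.2 namespace is assumed, and `placemark_name` is an illustrative helper, not part of lxml or keytree. The shapely/keytree filtering step is left out; the first polygon simply plays the role of hits[0].

```python
from lxml import etree

# Assumption: the file uses the standard KML 2.2 namespace.
KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical, trimmed stand-in for Example.kml.
SAMPLE_KML = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>THIS IS THE NAME</name>
      <Polygon>
        <outerBoundaryIs>
          <LinearRing>
            <coordinates>-128.17,52.47,0 -128.16,52.47,0 -128.16,52.48,0 -128.17,52.47,0</coordinates>
          </LinearRing>
        </outerBoundaryIs>
      </Polygon>
    </Placemark>
  </Document>
</kml>"""


def placemark_name(polygon, ns=KML_NS):
    """Return the <name> text of the nearest enclosing <Placemark>, or None."""
    # iterancestors() walks upward from the polygon, so this also works
    # when the polygon is nested one level deeper than its Placemark.
    for placemark in polygon.iterancestors("{%s}Placemark" % ns):
        return placemark.findtext("{%s}name" % ns)
    return None


root = etree.fromstring(SAMPLE_KML)
polygon = root.find(".//{%s}Polygon" % KML_NS)  # stands in for hits[0]
parent = polygon.getparent()

print(parent.tag)               # {http://www.opengis.net/kml/2.2}Placemark
print(dict(parent.attrib))      # {} because <Placemark> has no XML attributes
print(placemark_name(polygon))  # THIS IS THE NAME
```

Walking ancestors rather than calling getparent() once is a design choice for this sketch: it should keep working if a polygon sits inside a wrapper such as <MultiGeometry>, where the direct parent would not be the <Placemark>. To collect every matching name, the same helper could be applied to each element of hits.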