How to remove all attributes from a tag

Time: 2014-02-04 13:36:03

Tags: python xml lxml

How can I remove all the attributes of an XML tag, so that <xml blah blah blah> becomes <xml>?

With lxml I know I can remove a whole element, but I haven't found any way to operate on the tag itself. (I found a solution for C# on Stack Overflow, but I need one for Python.)
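For reference, a minimal sketch of dropping attributes in place with lxml: each element exposes a dict-like `attrib` mapping, and clearing it removes the attributes while leaving the element and its children alone. The helper name `clear_attributes` and the sample markup are made up for illustration and are not part of the question.

```python
from lxml import etree


def clear_attributes(elem, recursive=False):
    """Remove every attribute from elem (hypothetical helper).

    With recursive=True the attributes of all descendant elements
    are removed as well.
    """
    # iter(tag=etree.Element) yields elem itself plus descendant elements,
    # skipping comments and processing instructions.
    targets = elem.iter(tag=etree.Element) if recursive else [elem]
    for node in targets:
        node.attrib.clear()
    return elem


# Illustrative input only; any parsed element would do.
root = etree.fromstring('<node a="1" b="2"><child c="3"/></node>')
clear_attributes(root)
print(etree.tostring(root, encoding='unicode'))
# <node><child c="3"/></node>
```

Note that this only touches real attributes. Namespace declarations such as `xmlns="..."` are not stored in `attrib`, so they are unaffected by it.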

I'm opening a gpx (XML) file, and this is my code so far (based on How do I get the whole content between two xml tags in Python?):

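from lxml import etree

t = etree.parse("1.gpx")
# The GPX elements are in a default namespace, so a bare '//trk' matches nothing.
e = t.xpath('//g:trk', namespaces={'g': 'http://www.topografix.com/GPX/1/1'})[0]
# tostring() returns bytes unless encoding='unicode' is given;
# strip() has to be applied to the string, not to the result of print().
print(((e.text or '') + ''.join(etree.tostring(c, encoding='unicode') for c in e)).strip())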

Another approach I tried:

from lxml import etree

TOPOGRAFIX_NS = './/{http://www.topografix.com/GPX/1/1}'
TRACKPOINT_NS = TOPOGRAFIX_NS + 'extensions/{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}TrackPointExtension/{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}'

doc1 = etree.parse("1.gpx")

for node1 in doc1.findall(TOPOGRAFIX_NS + 'trk'):
    node_to_string1 = etree.tostring(node1)
    print(node_to_string1)

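But this gives me the trk tag with the TOPOGRAFIX_NS attributes (the xmlns declarations), which I don't want; this is where I'd like to strip the tag's attributes. I only want to get: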

<trk> all the inside content </trk>
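One possible way to get there, sketched under some assumptions: in the sample file the trk element has no attributes of its own, so what shows up on the serialized tag are most likely the namespace declarations inherited from the root gpx element. The sketch below rewrites every element and attribute name to its local name and then calls `etree.cleanup_namespaces` to drop the declarations that are no longer used. The helper name `strip_namespaces` and the shortened `GPX_SAMPLE` string are illustrative, not taken from the question.

```python
from lxml import etree

# Shortened stand-in for the gpx file, for illustration only.
GPX_SAMPLE = (
    '<gpx version="1.1" creator="Endomondo.com" '
    'xmlns="http://www.topografix.com/GPX/1/1" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://www.topografix.com/GPX/1/1 '
    'http://www.topografix.com/GPX/1/1/gpx.xsd">'
    '<trk><name>Galati</name><trkseg>'
    '<trkpt lat="45.431074" lon="28.021038"/>'
    '</trkseg></trk></gpx>'
)


def strip_namespaces(root):
    """Rewrite the tree in place so no element or attribute is namespaced
    (hypothetical helper)."""
    for node in root.iter(tag=etree.Element):
        # '{uri}trk' -> 'trk'
        node.tag = etree.QName(node).localname
        # Namespaced attributes (e.g. xsi:schemaLocation) keep their value
        # but lose the namespace; plain attributes are left untouched.
        for name in list(node.attrib):
            qname = etree.QName(name)
            if qname.namespace:
                value = node.attrib[name]
                del node.attrib[name]
                node.set(qname.localname, value)
    # Drop the xmlns declarations that nothing refers to any more.
    etree.cleanup_namespaces(root)
    return root


root = strip_namespaces(etree.fromstring(GPX_SAMPLE))
trk = root.find('.//trk')
print(etree.tostring(trk, encoding='unicode'))
```

This changes the document's meaning from an XML point of view (the elements are no longer GPX elements), so it is probably best applied only to a copy used for output. Ordinary attributes such as lat and lon on trkpt are kept.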

Thanks a lot!

P.S. The contents of the gpx file:

<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="Endomondo.com" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/GpxExtensions/v3 http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd" xmlns="http://www.topografix.com/GPX/1/1" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <metadata>
    <author>
      <name>Blah Blah</name>
      <email id="blah" domain="blah.com"/>
    </author>
    <link href="http://www.endomondo.com">
      <text>Endomondo</text>
    </link>
    <time>2014-01-20T10:50:28Z</time>
  </metadata>
  <trk>
    <name>Galati</name>
    <src>http://www.endomondo.com/</src>
    <link href="http://www.endomondo.com/workouts/260782567/13005122">
      <text>Galati</text>
    </link>
    <type>MOUNTAIN_BIKING</type>
    <trkseg>
      <trkpt lat="45.431074" lon="28.021038">
        <time>2013-10-20T05:49:04Z</time>
      </trkpt>

    </trkseg>
  </trk>
</gpx>

0 answers:

No answers yet