Python: parsing XML attribute values from a single tag

Time: 2014-03-24 00:47:51

Tags: python xml parsing python-2.7 xml-parsing

I have an XML file that looks like the following:

<spotter num="0187" report_at="2014-03-15 20:10:25" lat="49.8696518" lng="-80.0973129"callsign="wxman132" active="1" public="" gps="0" phone="" email="addu@nnu.nne" first="" last=""></spotter>

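I've tried using dom.minidom, but how can I easily parse the lat and lng attribute values out of the XML file?

Thanks in advance for your help!

2 answers:

Answer 0: (score: 4)

You need to use an XML parser, such as ElementTree, BeautifulSoup, or lxml.

Here is an example using ElementTree from the standard library: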

from xml.etree import ElementTree as ET

tree = ET.fromstring("""
<test>
    <spotter num="0187" report_at="2014-03-15 20:10:25" lat="49.8696518" lng="-80.0973129" callsign="wxman132" active="1" public="" gps="0" phone="" email="addu@nnu.nne" first="" last=""/>
</test>""")
spotter = tree.find('.//spotter')
print spotter.attrib['lat'], spotter.attrib['lng']
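If the file holds more than one spotter element, the same idea extends to a loop. The following is only a rough sketch: the second spotter entry, the shortened attribute lists, and the `coords` list are assumptions made for illustration, and `print` is written so that it behaves the same on Python 2.7 and Python 3.

```python
from xml.etree import ElementTree as ET

# Illustrative input: attributes trimmed down, second spotter added for the demo.
xml_text = """
<test>
    <spotter num="0187" lat="49.8696518" lng="-80.0973129"/>
    <spotter num="0188" lat="59.8696518" lng="-82.0973129"/>
</test>"""

root = ET.fromstring(xml_text)

coords = []
for spotter in root.findall('.//spotter'):
    # .get() returns None for a missing attribute instead of raising KeyError.
    lat = spotter.get('lat')
    lng = spotter.get('lng')
    if lat is None or lng is None:
        continue  # skip entries without a position
    coords.append((float(lat), float(lng)))

for lat, lng in coords:
    print("%s %s" % (lat, lng))
```

Using `.get()` rather than `.attrib[...]` is a small design choice here: it lets the loop skip incomplete entries instead of stopping on the first one.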

Here is an example using BeautifulSoup:

from bs4 import BeautifulSoup

data = '<spotter num="0187" report_at="2014-03-15 20:10:25" lat="49.8696518" lng="-80.0973129" callsign="wxman132" active="1" public="" gps="0" phone="" email="addu@nnu.nne" first="" last=""/>'    
soup = BeautifulSoup(data)    

spotter = soup.spotter
print spotter['lat'], spotter['lng']

Both print:

49.8696518 -80.0973129
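
BeautifulSoup is more forgiving about how well-formed the XML is (note that I had to edit the XML slightly to make it work with ElementTree), and it is actually much easier to work with.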
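To make the strictness concrete: one thing a conforming XML parser will not accept in the line as posted in the question is the missing space between `lng="..."` and `callsign="..."`. The sketch below assumes that this is the relevant difference and uses a shortened attribute list; `bad`, `good`, and `parsed_bad` are names chosen only for this illustration.

```python
from xml.etree import ElementTree as ET

# Same layout as the question: no whitespace between two attributes.
bad = '<spotter lat="49.8696518" lng="-80.0973129"callsign="wxman132"></spotter>'
# Repaired copy with the missing space restored.
good = bad.replace('"callsign', '" callsign')

try:
    ET.fromstring(bad)
    parsed_bad = True
except ET.ParseError as exc:
    # ElementTree refuses input that is not well-formed XML.
    parsed_bad = False
    print("ElementTree rejected it: %s" % exc)

spotter = ET.fromstring(good)
print("%s %s" % (spotter.get('lat'), spotter.get('lng')))
```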

Hope that helps.

Answer 1: (score: 2)

Pyparsing has a built-in method for extracting attributes from HTML tags without having to build a complete object model of the whole page.

html = """
<spotter num="0187" report_at="2014-03-15 20:10:25" lat="49.8696518" lng="-80.0973129" callsign="wxman132" active="1" public="" gps="0" phone="" email="addu@nnu.nne" first="" last="">

I've tried using dom.minidom, but how can I easily parse out the lat and lng variable values fro
<spotter num="0188" report_at="2014-03-15 20:11:25" lat="59.8696518" lng="-82.0973129" callsign="wxman132" active="1" public="" gps="0" phone="" email="addu@nnu.nne" first="" last="">

"""

from pyparsing import makeHTMLTags

spotterTag, spotterEndTag = makeHTMLTags("spotter")

for spotter in spotterTag.searchString(html):
    print spotter.report_at
    print spotter.num
    print spotter.lat
    print spotter.lng
    print spotter.email
    print

This prints:

2014-03-15 20:10:25
0187
49.8696518
-80.0973129
addu@nnu.nne

2014-03-15 20:11:25
0188
59.8696518
-82.0973129
addu@nnu.nne
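
If pyparsing is not available, a comparable result can probably be had from the standard library's event-based `HTMLParser`, which also reports start tags and their attributes without building a document tree. This is a different technique from `makeHTMLTags`, shown only as a sketch: the `SpotterCollector` class and the shortened input are made up for illustration.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class SpotterCollector(HTMLParser):
    """Hypothetical helper: records the attributes of every <spotter> start tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.spotters = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "spotter":
            self.spotters.append(dict(attrs))


html = """
<spotter num="0187" lat="49.8696518" lng="-80.0973129" callsign="wxman132">
some unrelated text between the tags
<spotter num="0188" lat="59.8696518" lng="-82.0973129" callsign="wxman132">
"""

collector = SpotterCollector()
collector.feed(html)
collector.close()

for s in collector.spotters:
    print("%s %s %s" % (s["num"], s["lat"], s["lng"]))
```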