Parse an XML file in Python

Time: 2017-06-15 09:42:36

Tags: python xml

<lib name="atl80.dll" bl="0">
  <fcts>
    <fct od="15" bl="0">AtlComModuleGetClassObject</fct>
    <fct od="18" bl="1">AtlComModuleRegisterServer</fct>
    <fct od="22" bl="1">AtlComModuleUnregisterServer</fct>
    <fct od="23" bl="1">AtlUpdateRegistryFromResourceD</fct>
    <fct od="30" bl="0">AtlComPtrAssign</fct>
    <fct od="31" bl="0">AtlComQIPtrAssign</fct>
    <fct od="32" bl="0">AtlInternalQueryInterface</fct>
    <fct od="34" bl="0">AtlGetVersion</fct>
    <fct od="58" bl="0">AtlModuleAddTermFunc</fct>
    <fct od="61" bl="1">AtlCreateRegistrar</fct>
    <fct od="64" bl="0">AtlCallTermFunc</fct>

Hey guys, I want to parse an XML file, iterate over its contents, and extract: [1] the lib name, and [2] the text of each fct tag where bl = 1.

How should I parse the XML and extract this info?

Thanks!

1 answer:

Answer 0: (score: 0)

Here is an example using BeautifulSoup:

html = """<lib name="atl80.dll" bl="0">
  <fcts>
    <fct od="15" bl="0">AtlComModuleGetClassObject</fct>
    <fct od="18" bl="1">AtlComModuleRegisterServer</fct>
    <fct od="22" bl="1">AtlComModuleUnregisterServer</fct>
    <fct od="23" bl="1">AtlUpdateRegistryFromResourceD</fct>
    <fct od="30" bl="0">AtlComPtrAssign</fct>
    <fct od="31" bl="0">AtlComQIPtrAssign</fct>
    <fct od="32" bl="0">AtlInternalQueryInterface</fct>
    <fct od="34" bl="0">AtlGetVersion</fct>
    <fct od="58" bl="0">AtlModuleAddTermFunc</fct>
    <fct od="61" bl="1">AtlCreateRegistrar</fct>
    <fct od="64" bl="0">AtlCallTermFunc</fct>

"""


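from bs4 import BeautifulSoup

# html.parser is lenient, so the unclosed tags in the snippet above are tolerated
soup = BeautifulSoup(html, 'html.parser')

# the lib name is the "name" attribute of the <lib> element
lib_name = soup.find('lib')['name']

# collect the text of every <fct> tag whose bl attribute is "1"
fct_names = [tag.text for tag in soup.find_all('fct', bl="1")]

print(lib_name)
print(fct_names)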
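
If the real file is well-formed (the excerpt in the question is cut off before the closing tags), the standard library's xml.etree.ElementTree may be enough and avoids the extra dependency. The following is only a sketch under that assumption: the closing </fcts> and </lib> tags and the shortened list of entries are added here for illustration and are not taken from the original file.

```python
import xml.etree.ElementTree as ET

# Assumption: the real document is well-formed. The closing tags below
# are added for illustration; the question's excerpt is truncated.
xml_text = """<lib name="atl80.dll" bl="0">
  <fcts>
    <fct od="15" bl="0">AtlComModuleGetClassObject</fct>
    <fct od="18" bl="1">AtlComModuleRegisterServer</fct>
    <fct od="61" bl="1">AtlCreateRegistrar</fct>
  </fcts>
</lib>"""

# fromstring returns the root element, here <lib>
root = ET.fromstring(xml_text)

# [1] the lib name is an attribute of the root element
lib_name = root.get("name")

# [2] text of every <fct> element whose bl attribute equals "1"
flagged = [fct.text for fct in root.iter("fct") if fct.get("bl") == "1"]

print(lib_name)
print(flagged)
```

To read from disk instead of a string, ET.parse(path).getroot() returns the same kind of root element. Unlike html.parser, ElementTree raises a ParseError on unclosed tags, so it would reject the truncated snippet exactly as posted.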