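Using Python and ElementTree, I have been unable to parse XML into text line items so that each line item represents exactly one level, no more and no less. Each line item will eventually become a record in a database so that users can search for multiple terms within that field. Sample XML: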
<?xml version="1.0" encoding="utf-8"?>
<root>
<mainTerm>
<title>Meat</title>
<see>protein</see>
</mainTerm>
<mainTerm>
<title>Vegetables</title>
<see>starch</see>
</mainTerm>
<mainTerm>
<title>Fruit</nemod></title>
<term level="1">
<title>Apple</title>
<code>apl</code>
</term>
<term level="1">
<title>Red Delicious</title>
<code>rd</code>
<term level="2">
<title>Large Red Delicious</title>
<code>lrd</code>
</term>
<term level="2">
<title>Medium Red Delicious</title>
<code>mrd</code>
</term>
<term level="2">
<title>Small Red Delicious</title>
<code>mrd</code>
</term>
<term level="1">
<title>Grapes</title>
<code>grp</code>
</term>
<term level="1">
<title>Peaches</title>
<code>pch</code>
</term>
</mainTerm>
</root>
Desired output:
Meat > protein
Vegetables > starch
Fruit > Apple > apl
Fruit > Apple > apl > Red Delicious > rd
Fruit > Apple > apl > Red Delicious > rd > Large Red Delicious > lrd
Fruit > Apple > apl > Red Delicious > rd > Medium Red Delicious > mrd
Fruit > Apple > apl > Red Delicious > rd > Small Red Delicious > srd
Fruit > Grapes > grp
Fruit > Peaches > pch
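Parsing the XML by the 'mainTerm' tag is easy. The tricky part is limiting each line to a single level while still including its parent terms in the text. I am basically trying to "flatten" the XML hierarchy by creating unique lines of text, each of which lists its parents (e.g. Fruit > Apple > apl) but not its siblings (e.g. Large Red Delicious, Medium Red Delicious, or Small Red Delicious).
I realize this could be done by first converting the data to a relational database format and then running queries, but I was hoping for a more direct solution straight from the XML.
Hope this makes sense... Thanks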
Answer 0 (score: 1)
There is a nice tool called xmltodict that turns XML into a hierarchical data structure:
import json
import xmltodict
data = """your xml goes here"""
result = xmltodict.parse(data)
print(json.dumps(result, indent=4))
For the XML you provided (with several changes to make it well-formed, see my comments), it prints:
{
    "root": {
        "mainTerm": [
            {
                "title": "Meat",
                "see": "protein"
            },
            {
                "title": "Vegetables",
                "see": "starch"
            },
            {
                "title": "Fruit",
                "term": [
                    {
                        "@level": "1",
                        "title": "Apple",
                        "code": "apl"
                    },
                    {
                        "@level": "1",
                        "title": "Red Delicious",
                        "code": "rd",
                        "term": [
                            {
                                "@level": "2",
                                "title": "Large Red Delicious",
                                "code": "lrd"
                            },
                            {
                                "@level": "2",
                                "title": "Medium Red Delicious",
                                "code": "mrd"
                            },
                            {
                                "@level": "2",
                                "title": "Small Red Delicious",
                                "code": "mrd"
                            }
                        ]
                    },
                    {
                        "@level": "1",
                        "title": "Grapes",
                        "code": "grp"
                    },
                    {
                        "@level": "1",
                        "title": "Peaches",
                        "code": "pch"
                    }
                ]
            }
        ]
    }
}
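From a structure like this, the flattened lines can be produced with a short recursive walk. The following is only a sketch under a few assumptions: the helper names `as_list` and `flatten` are made up for illustration, each node is assumed to have a plain-text "title", and a node with a title but no "see" or "code" (such as Fruit) is treated as a container that gets no line of its own. The sample dict is a trimmed copy of the output above so the block runs without xmltodict installed. Note that in the parsed data Red Delicious is a sibling of Apple, so its lines come out as "Fruit > Red Delicious > rd > ..." rather than nested under Apple as in the desired output.

```python
def as_list(value):
    """xmltodict gives a dict for a single child and a list for several; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def flatten(node, parents=()):
    """Yield one 'a > b > c' line per node, prefixed by its ancestors but not its siblings."""
    path = list(parents) + [node["title"]]
    # A mainTerm carries "see" and a term carries "code"; append whichever is present.
    for key in ("see", "code"):
        if key in node:
            path.append(node[key])
    children = as_list(node.get("term"))
    # Skip bare containers (title only, with children); emit a line for everything else.
    if len(path) > len(parents) + 1 or not children:
        yield " > ".join(path)
    for child in children:
        yield from flatten(child, path)


# Trimmed copy of the parsed output shown above (assumption: same key names).
result = {
    "root": {
        "mainTerm": [
            {"title": "Meat", "see": "protein"},
            {
                "title": "Fruit",
                "term": [
                    {"@level": "1", "title": "Apple", "code": "apl"},
                    {
                        "@level": "1",
                        "title": "Red Delicious",
                        "code": "rd",
                        "term": [
                            {"@level": "2", "title": "Large Red Delicious", "code": "lrd"},
                            {"@level": "2", "title": "Medium Red Delicious", "code": "mrd"},
                        ],
                    },
                    {"@level": "1", "title": "Grapes", "code": "grp"},
                ],
            },
        ]
    }
}

lines = [
    line
    for main_term in as_list(result["root"]["mainTerm"])
    for line in flatten(main_term)
]
for line in lines:
    print(line)
```

Passing the accumulated path down as `parents` is what keeps siblings out of each line: every child sees only the chain of ancestors above it. Each resulting string could then be stored as one database record.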