Question

Suppose I have this XML file:

<article-set xmlns:ns0="http://casfwcewf.xsd" format-version="5">
<article>
 <article id="11234">
     <source>
     <hostname>some hostname for 11234</hostname>
     </source>
     <feed>
         <type weight=0.32>RSS</type>
     </feed>
     <uri>some uri for 11234</uri>
 </article>
 <article id="63563">
     <source>
     <hostname>some hostname for 63563 </hostname>
     </source>
     <feed>
         <type weight=0.86>RSS</type>
     </feed>
     <uri>some uri  for 63563</uri>
  </article>
.
.
.
</article></article-set>

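What I want is to print, for each article id across the whole document, the value of the weight attribute on its RSS type element, like this: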

id=11234 
weight= 0.32


id=63563 
weight= 0.86
.
.
.

I am using this code to do it:

from lxml import etree
tree = etree.parse("C:\\Users\\Me\\Desktop\\public.xml")


for article in tree.iter('article'):
    article_id = article.attrib.get('id')

    for weight in tree.xpath("//article[@id={}]/feed/type/@weight".format(article_id)):
        print(article_id,weight)

It doesn't work. Can anyone help me?

Answer 1

One of these might work for you:

In this version, note the = added in the call to tree.xpath():

from lxml import etree
tree = etree.parse("news.xml")


for article in tree.iter('article'):
    article_id = article.attrib.get('id')

    for weight in tree.xpath("//article[@id={}]/feed/type/@weight".format(article_id)):
        print(article_id,weight)
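
As a variation on the version above, lxml's xpath() also accepts XPath variables as keyword arguments, so the id does not have to be spliced into the expression with format(). This is only a sketch: SAMPLE is a trimmed-down stand-in for the question's file with the attribute values quoted, and the names weights_by_id and aid are made up for illustration.

```python
from lxml import etree

# Assumed sample: a shortened, well-formed version of the question's XML.
SAMPLE = b"""<article-set format-version="5">
  <article id="11234">
    <feed><type weight="0.32">RSS</type></feed>
  </article>
  <article id="63563">
    <feed><type weight="0.86">RSS</type></feed>
  </article>
</article-set>"""


def weights_by_id(root):
    """Return (id, weight) pairs, looking each article up again by its id."""
    pairs = []
    for article in root.iter("article"):
        article_id = article.get("id")
        if article_id is None:
            # Skip article elements without an id (e.g. an outer wrapper).
            continue
        # $aid is an XPath variable; lxml binds it from the keyword argument.
        for weight in root.xpath(
            "//article[@id=$aid]/feed/type/@weight", aid=article_id
        ):
            pairs.append((article_id, str(weight)))
    return pairs


root = etree.fromstring(SAMPLE)
for article_id, weight in weights_by_id(root):
    print(article_id, weight)
```

Passing the id as a variable compares it as a string, so this form should also cope with ids that are not purely numeric.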

In this version, note that I have replaced tree.xpath() with article.xpath():

from lxml import etree
tree = etree.parse("news.xml")

for article in tree.iter('article'):
    article_id = article.attrib.get('id')

    for weight in article.xpath("./feed/type/@weight"):
        print(article_id,weight)
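
To try the relative-path version without a file on disk, something like the following may help. It is a sketch under assumptions: SAMPLE is a shortened copy of the question's XML with the weight values quoted, and collect_weights is a hypothetical helper name, not part of lxml.

```python
from lxml import etree

# Assumed sample: a shortened, well-formed version of the question's XML.
SAMPLE = b"""<article-set format-version="5">
  <article id="11234">
    <feed><type weight="0.32">RSS</type></feed>
  </article>
  <article id="63563">
    <feed><type weight="0.86">RSS</type></feed>
  </article>
</article-set>"""


def collect_weights(root):
    """Return (id, weight) pairs using a path relative to each article."""
    pairs = []
    for article in root.iter("article"):
        # "./feed/type/@weight" is evaluated from this article only,
        # so there is no need to search the whole tree by id again.
        for weight in article.xpath("./feed/type/@weight"):
            pairs.append((article.get("id"), str(weight)))
    return pairs


root = etree.fromstring(SAMPLE)
for article_id, weight in collect_weights(root):
    print("id={}".format(article_id))
    print("weight= {}".format(weight))
```

An article element with no feed/type child simply yields no weights, so a wrapper element without an id produces no output.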

Answer 2

If you really want to, you can do it in two lines.
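
The answerer's own two-line snippet is not preserved here. The following is one possible compact form, offered only as a sketch: SAMPLE is an assumed, shortened copy of the question's XML with quoted attribute values, and pairs is an illustrative name.

```python
from lxml import etree

# Assumed sample: a shortened, well-formed version of the question's XML.
SAMPLE = b"""<article-set format-version="5">
  <article id="11234">
    <feed><type weight="0.32">RSS</type></feed>
  </article>
  <article id="63563">
    <feed><type weight="0.86">RSS</type></feed>
  </article>
</article-set>"""

root = etree.fromstring(SAMPLE)

# Select only articles that carry an id, then read the weight relative to each.
# string(...) returns the attribute value as text ("" if it is missing).
pairs = [
    (article.get("id"), str(article.xpath("string(feed/type/@weight)")))
    for article in root.xpath("//article[@id]")
]

for article_id, weight in pairs:
    print(article_id, weight)
```

The predicate [@id] filters out any article element that has no id attribute, so no separate check is needed inside the loop.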

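An XML checker I used insisted on double quotes around the weight attribute values. etree choked on the XML until I deleted the first line of the file; I don't know why.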
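
The point about quoting can be checked directly. Below is a small sketch, where is_well_formed is a hypothetical helper, that asks lxml's default parser whether a fragment parses; it does not try to explain the first-line problem mentioned above.

```python
from lxml import etree


def is_well_formed(xml_bytes):
    """Return True if lxml's default (non-recovering) parser accepts the input."""
    try:
        etree.fromstring(xml_bytes)
        return True
    except etree.XMLSyntaxError:
        return False


# Unquoted attribute value, as written in the question's file.
print(is_well_formed(b"<type weight=0.32>RSS</type>"))

# Same element with the value in double quotes.
print(is_well_formed(b'<type weight="0.32">RSS</type>'))
```

The first call should print False and the second True, since XML requires attribute values to be quoted.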

Getting an attribute of an element and its corresponding id
