Python XML findall is not working

Asked: 2017-11-16 19:43:41

Tags: python xml

I am trying to use findall to select certain XML elements, but I can't get any results.

import xml.etree.ElementTree as ET
import sys

storefront = sys.argv[1]

xmlFileName = 'promotions{0}.xml'

xmlFile = xmlFileName.format(storefront)

csvFileName = 'hrz{0}.csv'
csvFile = csvFileName.format(storefront)
ET.register_namespace('', "http://www.demandware.com/xml/impex/promotion/2008-01-31")
tree = ET.parse(xmlFile)

root = tree.getroot()
print('------------------Generate test-------------\n')



csv = open(csvFile,'w')
n = 0
for child in root.findall('campaign'):
    print(child.attrib['campaign-id'])
    print(n)
    n+=1

The XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<promotions xmlns="http://www.demandware.com/xml/impex/promotion/2008-01-31">
    <campaign campaign-id="10off-310781">
        <enabled-flag>true</enabled-flag>
        <campaign-scope>
            <applicable-online/>
        </campaign-scope>
        <customer-groups match-mode="any">
            <customer-group group-id="Everyone"/>
        </customer-groups>
    </campaign>

    <campaign campaign-id="MNT-deals">
        <enabled-flag>true</enabled-flag>
        <campaign-scope>
            <applicable-online/>
        </campaign-scope>
        <start-date>2017-07-03T22:00:00.000Z</start-date>
        <end-date>2017-07-31T22:00:00.000Z</end-date>
        <customer-groups match-mode="any">
            <customer-group group-id="Everyone"/>
        </customer-groups>
    </campaign>

    <campaign campaign-id="black-friday">
        <enabled-flag>true</enabled-flag>
        <campaign-scope>
            <applicable-online/>
        </campaign-scope>
        <start-date>2017-11-23T23:00:00.000Z</start-date>
        <end-date>2017-11-24T23:00:00.000Z</end-date>
        <customer-groups match-mode="any">
            <customer-group group-id="Everyone"/>
        </customer-groups>
        <custom-attributes>
            <custom-attribute attribute-id="expires_date">2017-11-29</custom-attribute>
        </custom-attributes>
    </campaign>

    <promotion-campaign-assignment promotion-id="winter17-new-bubble" campaign-id="winter17-new-bubble">
        <qualifiers match-mode="any">
            <customer-groups/>
            <source-codes/>
            <coupons/>
        </qualifiers>
        <rank>100</rank>
    </promotion-campaign-assignment>

    <promotion-campaign-assignment promotion-id="xmas" campaign-id="xmas">
        <qualifiers match-mode="any">
            <customer-groups/>
            <source-codes/>
            <coupons/>
        </qualifiers>
    </promotion-campaign-assignment>

</promotions>

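Any idea what I'm doing wrong? I have tried different solutions I found on Stack Overflow, but none of the ones I tried worked for me: the list returned by findall is always empty. Sorry if this is very obvious; I'm new to Python.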

1 answer:

Answer 0 (score: 3)

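As @MartijnPieters has mentioned, etree's .findall takes a namespaces argument, whereas .register_namespace() only affects the XML output of the tree. Because the document declares a default namespace, every element's tag is namespace-qualified, so an unqualified 'campaign' matches nothing. Consider mapping the default namespace to an explicit prefix instead. The code below uses doc, but the prefix is arbitrary and could even be cosmin.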

Also, consider with, enumerate(), and even the csv module as better ways to handle the file, the counter, and the CSV output.

import csv
...

root = tree.getroot()
print('------------------Generate test-------------\n')

with open(csvFile, 'w') as f:
    c = csv.writer(f, lineterminator='\n')

    for n, child in enumerate(root.findall('doc:campaign', namespaces={'doc':'http://www.demandware.com/xml/impex/promotion/2008-01-31'})):
        print(child.attrib['campaign-id'])
        print(n)
        c.writerow([child.attrib['campaign-id']])

# ------------------Generate test-------------

# 10off-310781
# 0
# MNT-deals
# 1
# black-friday
# 2
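
To see the behavior in isolation, here is a minimal, self-contained sketch. It assumes a trimmed-down inline copy of the XML (the names XML_TEXT and NS_URI are made up for illustration) rather than the real promotions file. It compares the unqualified lookup from the question with the prefix mapping used above and with the equivalent `{uri}tag` form that ElementTree uses internally for tags.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the question's XML.
NS_URI = "http://www.demandware.com/xml/impex/promotion/2008-01-31"

# Hypothetical, trimmed-down stand-in for the promotions file.
XML_TEXT = f"""<promotions xmlns="{NS_URI}">
    <campaign campaign-id="10off-310781"/>
    <campaign campaign-id="MNT-deals"/>
    <campaign campaign-id="black-friday"/>
</promotions>"""

# register_namespace() influences serialization only, not searching.
ET.register_namespace("", NS_URI)
root = ET.fromstring(XML_TEXT)

# The parsed tag carries the namespace: "{uri}promotions".
print(root.tag)

# Unqualified name: matches nothing, as in the question.
unqualified = root.findall("campaign")
print(len(unqualified))  # 0

# Explicit prefix mapped to the namespace URI; the prefix name is arbitrary.
ns = {"doc": NS_URI}
prefixed = [c.get("campaign-id") for c in root.findall("doc:campaign", namespaces=ns)]
print(prefixed)  # ['10off-310781', 'MNT-deals', 'black-friday']

# Same lookup with the fully qualified "{uri}tag" form, no mapping needed.
clark = [c.get("campaign-id") for c in root.findall(f"{{{NS_URI}}}campaign")]
print(clark)  # ['10off-310781', 'MNT-deals', 'black-friday']
```

Both qualified forms return the same elements; the prefix mapping is usually easier to read when several lookups share one namespace.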