Adding multiple values of elem.attrib to a variable

Asked: 2015-08-05 14:55:11

Tags: python xml-parsing elementtree

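I recently started using Python and ElementTree for a very specific task. I feel like I'm almost there, but there is one thing I can't quite work out. I'm querying XML files and pulling back the relevant data, then writing that data to a csv file. It all works, but the problem is that elem.attrib["text"] actually returns multiple lines; when I put it into a variable and export it to csv, only the first line is exported. Below is the code I'm using: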

import os
import csv

import xml.etree.cElementTree as ET

path = "/share/new"

c = csv.writer(open("/share/redacted.csv", "wb"))

c.writerow(["S","R","T","R2","R3"])


for filename in os.listdir(path):
    if filename.endswith('.xml'):
        fullname = os.path.join(path, filename)
        tree = ET.ElementTree(file=(fullname))
        for elem in tree.iterfind('PropertyList/Property[@name="Sender"]'):
            c1 = elem.attrib["value"]
        for elem in tree.iterfind('PropertyList/Property[@name="Recipient"]'):
            c2 = elem.attrib["value"]
        for elem in tree.iterfind('PropertyList/Property[@name="Date"]'):
            c3 = elem.attrib["value"]
        for elem in tree.iterfind('ChildContext/ResponseList/Response/TextualAnalysis/ExpressionList/Expression/Match'):
            c4 = elem.attrib["textView"]
        for elem in tree.iterfind('ChildContext/ResponseList/Response/TextualAnalysis/ExpressionList/Expression/Match/Matched'):
            c5 = elem.attrib["text"]
            print elem.attrib["text"]
            print c5
        c.writerow([(c1),(c2),(c3),(c4),(c5)])

The important part is near the bottom. The output of printing elem.attrib["text"] is:

Apples
Bananas

The output of 'print c5' is the same (just to be clear, Apples and Bananas are on separate lines).

However, writing c5 to the csv only outputs the first line, so only Apples appears in the csv.

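I hope this makes sense. What I need is to output both Apples and Bananas to the csv (preferably in the same cell). The code above was developed in Python 2.7, but ideally I need it to work in 2.6 (I realise iterfind isn't in 2.6; I already have 2 versions of the code).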

I would post the XML, but it's a bit of a beast. Edit: as suggested in the comments, here is a cleaned-up XML sample.

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<Context>
    <PropertyList duplicates="true">
        <Property name="Sender" type="string" value="S:demo1@no-one.local"/>
        <Property name="Recipient" type="string" value="RPFD:no-one.local"/>
        <Property name="Date" type="string" value="Tue, 4 Aug 2015 13:24:16 +0100"/>
    </PropertyList>
    <ChildContext>
        <ResponseList>
            <Response>
                <Description>
                    <Arg />
                    <Arg />
                </Description>
                <TextualAnalysis version="2.0">
                    <ExpressionList>
                        <Expression specified=".CLEAN.(Apples)" total="1" >
                            <Match textView="Body" truncated="false">
                                <Surrounding text="..."/>
                                <Surrounding text="How do you like them "/>
                                <Matched cleaned="true" text="Apples " type="expression"/>
                                <Surrounding text="???????? "/>
                                <Surrounding text="..."/>
                            </Match>
                        </Expression>
                    </ExpressionList>
                </TextualAnalysis>
            </Response>
        </ResponseList>
    </ChildContext>
    <ChildContext>
        <ResponseList>
            <Response>
                <Description>
                    <Arg />
                    <Arg />
                </Description>
                <TextualAnalysis version="2.0">
                    <ExpressionList>
                        <Expression specified=".CLEAN.(Bananas)" total="1" >
                            <Match textView="Attach" truncated="false">
                                <Surrounding text="..."/>
                                <Surrounding text="Also I don't like... "/>
                                <Matched cleaned="true" text="Bananas " type="expression"/>
                                <Surrounding text="!!!!!!! "/>
                                <Surrounding text="..."/>
                            </Match>
                        </Expression>
                    </ExpressionList>
                </TextualAnalysis>
            </Response>
        </ResponseList>
    </ChildContext>
</Context>

1 Answer:

Answer 0 (score: 0)

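The following joins all of the text values together and puts them on separate lines within the same cell of the CSV. You can change the '\n' separator to ' ' or ',' to put them on the same line instead. However, you may still run into problems with some of the other fields. You don't have nested loops, and I don't really understand what you're trying to accomplish, so maybe you have more than one of some of the others as well. Anyway: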

        c5 = []
        for elem in tree.iterfind('ChildContext/ResponseList/Response/TextualAnalysis/ExpressionList/Expression/Match/Matched'):
            c5.append(elem.attrib["text"])
        c.writerow([c1, c2, c3, c4, '\n'.join(c5)])
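
For anyone who wants to try this without the original files, here is a minimal self-contained sketch of the same collect-then-join idea. It is not the asker's script. It assumes Python 3 (so it uses io.StringIO in place of a file opened with "wb"), a cut-down copy of the sample XML from the question, and a hypothetical helper named joined_matches. The strip() call is also an addition, there only to drop the trailing space in the sample values. Assigning to a plain variable inside the loop keeps only one value, because each iteration overwrites the previous one, which is why the values are collected in a list first.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Cut-down stand-in for the sample XML in the question (illustrative data only).
SAMPLE_XML = """<Context>
    <PropertyList duplicates="true">
        <Property name="Sender" type="string" value="S:demo1@no-one.local"/>
    </PropertyList>
    <ChildContext><ResponseList><Response><TextualAnalysis><ExpressionList>
        <Expression><Match textView="Body">
            <Matched cleaned="true" text="Apples " type="expression"/>
        </Match></Expression>
    </ExpressionList></TextualAnalysis></Response></ResponseList></ChildContext>
    <ChildContext><ResponseList><Response><TextualAnalysis><ExpressionList>
        <Expression><Match textView="Attach">
            <Matched cleaned="true" text="Bananas " type="expression"/>
        </Match></Expression>
    </ExpressionList></TextualAnalysis></Response></ResponseList></ChildContext>
</Context>"""

MATCHED_PATH = ("ChildContext/ResponseList/Response/TextualAnalysis/"
                "ExpressionList/Expression/Match/Matched")


def joined_matches(root, sep="\n"):
    """Collect the text attribute of every Matched element and join them.

    Hypothetical helper: strip() is an addition here; drop it if the
    trailing whitespace in the attribute values matters.
    """
    values = [elem.attrib["text"].strip() for elem in root.iterfind(MATCHED_PATH)]
    return sep.join(values)


root = ET.fromstring(SAMPLE_XML)
sender = root.find('PropertyList/Property[@name="Sender"]').attrib["value"]

# Write to an in-memory buffer instead of a real file so the sketch is standalone.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["S", "T"])
# The joined value contains a newline, so csv.writer quotes the cell.
writer.writerow([sender, joined_matches(root)])
csv_text = buffer.getvalue()
```

Calling joined_matches(root, sep=", ") instead keeps both values on one line in the cell. A cell with an embedded newline is still a single CSV field, but some viewers only show its first line until the row is expanded.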