Question

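I recently started using Python and ElementTree for a very specific task. I feel I'm almost there, but there is one thing I can't work out. I'm querying XML files and pulling out the relevant data, then writing that data to a CSV file. This all works, but the problem is that elem.attrib["text"] actually returns multiple lines. When I assign it to a variable and export it to the CSV, only the first line is exported. Below is the code I'm using: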

import os
import csv

import xml.etree.cElementTree as ET

path = "/share/new"

c = csv.writer(open("/share/redacted.csv", "wb"))

c.writerow(["S","R","T","R2","R3"])


for filename in os.listdir(path):
    if filename.endswith('.xml'):
            fullname = os.path.join(path, filename)
            tree = ET.ElementTree(file=(fullname))
            for elem in tree.iterfind('PropertyList/Property[@name="Sender"]'):
                    c1 = elem.attrib["value"]
            for elem in tree.iterfind('PropertyList/Property[@name="Recipient"]'):
                    c2 = elem.attrib["value"]
            for elem in tree.iterfind('PropertyList/Property[@name="Date"]'):
                    c3 = elem.attrib["value"]
            for elem in tree.iterfind('ChildContext/ResponseList/Response/TextualAnalysis/ExpressionList/Expression/Match'):
                    c4 = elem.attrib["textView"]
            for elem in tree.iterfind('ChildContext/ResponseList/Response/TextualAnalysis/ExpressionList/Expression/Match/Matched'):
                    c5 = elem.attrib["text"]
                    print elem.attrib["text"]
                    print c5
            c.writerow([(c1),(c2),(c3),(c4),(c5)])

The important part is near the bottom. The output of printing elem.attrib["text"] is:

Apples
Bananas

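The output of printing c5 is the same (just to be clear, Apples and Bananas are on separate lines).

However, writing c5 to the CSV outputs only the first line, so only Apples appears in the CSV.

I hope this makes sense. What I need is for both Apples and Bananas to be written to the CSV (preferably in the same cell). The code was developed under Python 2.7, but ideally I need it to work in 2.6 (I realise iterfind isn't in 2.6; I already have two versions of the code).

I would post the XML, but it's a bit of a beast. Edit: as suggested in the comments, here is a sanitised XML sample.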

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<Context>
    <PropertyList duplicates="true">
        <Property name="Sender" type="string" value="S:demo1@no-one.local"/>
        <Property name="Recipient" type="string" value="RPFD:no-one.local"/>
        <Property name="Date" type="string" value="Tue, 4 Aug 2015 13:24:16 +0100"/>
    </PropertyList>
    <ChildContext>
        <ResponseList>
            <Response>
                <Description>
                    <Arg />
                    <Arg />
                </Description>
                <TextualAnalysis version="2.0">
                    <ExpressionList>
                        <Expression specified=".CLEAN.(Apples)" total="1" >
                            <Match textView="Body" truncated="false">
                                <Surrounding text="..."/>
                                <Surrounding text="How do you like them "/>
                                <Matched cleaned="true" text="Apples " type="expression"/>
                                <Surrounding text="???????? "/>
                                <Surrounding text="..."/>
                            </Match>
                        </Expression>
                    </ExpressionList>
                </TextualAnalysis>
            </Response>
        </ResponseList>
    </ChildContext>
    <ChildContext>
        <ResponseList>
            <Response>
                <Description>
                    <Arg />
                    <Arg />
                </Description>
                <TextualAnalysis version="2.0">
                    <ExpressionList>
                        <Expression specified=".CLEAN.(Bananas)" total="1" >
                            <Match textView="Attach" truncated="false">
                                <Surrounding text="..."/>
                                <Surrounding text="Also I don't like... "/>
                                <Matched cleaned="true" text="Bananas " type="expression"/>
                                <Surrounding text="!!!!!!! "/>
                                <Surrounding text="..."/>
                            </Match>
                        </Expression>
                    </ExpressionList>
                </TextualAnalysis>
            </Response>
        </ResponseList>
    </ChildContext>
</Context>

Answer 1

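The following joins all the text values together and puts them on separate lines within the same cell of the CSV. You can change the '\n' separator to ' ' or ', ' to put them on a single line instead. You may still run into problems with the other fields, though: your loops are not nested, and I don't really understand what you're trying to accomplish, so some of the other queries may also match more than one element. Anyway: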

c5 = []
for elem in tree.iterfind('ChildContext/ResponseList/Response/TextualAnalysis/ExpressionList/Expression/Match/Matched'):
    c5.append(elem.attrib["text"])
c.writerow([c1, c2, c3, c4, '\n'.join(c5)])
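As a fuller illustration of the same idea, here is a minimal, self-contained sketch. It makes several assumptions that are not in the question: it targets Python 3 rather than 2.7, it parses a trimmed-down copy of the sanitised XML from a string, and it writes the CSV to an in-memory buffer instead of a file. The helpers `collect_attrib` and `build_row` are hypothetical names introduced only for this example, and stripping the trailing space from each value is an optional choice. Because the sample contains two Match elements, the textView column is joined the same way as the text column.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the sanitised XML in the question (assumption:
# only the elements needed for this illustration are kept).
SAMPLE_XML = """<Context>
    <PropertyList duplicates="true">
        <Property name="Sender" type="string" value="S:demo1@no-one.local"/>
    </PropertyList>
    <ChildContext><ResponseList><Response><TextualAnalysis version="2.0"><ExpressionList>
        <Expression specified=".CLEAN.(Apples)" total="1">
            <Match textView="Body" truncated="false">
                <Matched cleaned="true" text="Apples " type="expression"/>
            </Match>
        </Expression>
    </ExpressionList></TextualAnalysis></Response></ResponseList></ChildContext>
    <ChildContext><ResponseList><Response><TextualAnalysis version="2.0"><ExpressionList>
        <Expression specified=".CLEAN.(Bananas)" total="1">
            <Match textView="Attach" truncated="false">
                <Matched cleaned="true" text="Bananas " type="expression"/>
            </Match>
        </Expression>
    </ExpressionList></TextualAnalysis></Response></ResponseList></ChildContext>
</Context>"""

MATCH_PATH = ("ChildContext/ResponseList/Response/TextualAnalysis/"
              "ExpressionList/Expression/Match")


def collect_attrib(root, path, attrib_name):
    """Return the named attribute of every element matching path, in document order."""
    # strip() drops the trailing space seen in values such as "Apples ".
    return [elem.attrib[attrib_name].strip() for elem in root.findall(path)]


def build_row(root, sep="\n"):
    """Build one CSV row, joining multiple matches into a single cell with sep."""
    sender = collect_attrib(root, 'PropertyList/Property[@name="Sender"]', "value")
    views = collect_attrib(root, MATCH_PATH, "textView")
    texts = collect_attrib(root, MATCH_PATH + "/Matched", "text")
    return [sep.join(sender), sep.join(views), sep.join(texts)]


root = ET.fromstring(SAMPLE_XML)
row = build_row(root)

# Write to an in-memory buffer; the csv module quotes cells that contain newlines.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file under Python 3, the csv documentation recommends opening it with `newline=""` (in place of the `"wb"` mode used with Python 2). Passing `sep=", "` to `build_row` keeps all values on one line within the cell.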

Adding multiple values of elem.attrib to a variable
