Python: extracting XML element values to CSV

Time: 2015-10-26 14:24:34

Tags: python xml

I'm new to Python, so please bear with me while I try to explain what I want to do.

Here is my XML:

<?xml version="1.0"?>
<playlist>
    <list>
        <txdate>2015-10-30</txdate>
        <channel>cake</channel>
        <name>Play List</name>
    </list>
    <eventlist>
        <event type="MEDIA">
            <title>title1</title>
            <starttype>FIX</starttype>
            <mediaid>a</mediaid>
            <onairtime>2015-10-30T13:30:00:00</onairtime>
            <som>00:00:40:03</som>
            <duration>01:15:47:15</duration>
            <reconcilekey>123</reconcilekey>
            <category>PROGRAM</category>
            <subtitles>
                <cap>CLOSED</cap>
                <file>a</file>
                <lang>ENG</lang>
                <lang>GER</lang>
            </subtitles>
        </event>
        <event type="MEDIA">
            <title>THREE DAYS AND A CHILD</title>
            <mediaid>b</mediaid>
            <onairtime>2015-10-30T14:45:47:15</onairtime>
            <som>00:00:00:00</som>
            <duration>01:19:41:07</duration>
            <reconcilekey>321</reconcilekey>
            <category>PROGRAM</category>
            <subtitles>
                <cap>CLOSED</cap>
                <file>b</file>
                <lang>ENG</lang>
                <lang>GER</lang>
            </subtitles>
        </event>
    </eventlist>
</playlist>

I want to print all of the mediaid values to a file. This is my code so far:

import os
import xml.etree.ElementTree as ET
tree = ET.parse('data.xml')
root = tree.getroot()
wfile = 'new.csv'
for child in root: 
    child.find( "media type" )
    for x in child.iter("mediaid"):
        file = open(wfile, 'a')
        file.write(str(x))
    file.close

I have tried a few other non-standard libraries, but I haven't had much success.

1 answer:

Answer 0 (score: 0)

Based on your requirement (as stated in the comments) -

  the mediaid from each <event type="MEDIA">

You should use ElementTree's findall() method to get all event elements that have type="MEDIA", and then get the child mediaid element from each of them. Example -

import xml.etree.ElementTree as ET
tree = ET.parse('data.xml')
root = tree.getroot()
with open('new.csv','w') as outfile:
    # Select every <event> whose type attribute is "MEDIA", at any depth
    for elem in root.findall('.//event[@type="MEDIA"]'):
        mediaidelem = elem.find('./mediaid')
        # Skip events that have no <mediaid> child
        if mediaidelem is not None:
            outfile.write("{}\n".format(mediaidelem.text))
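
If you would rather let the standard csv module handle the output, the same findall() lookup could be wrapped roughly as below. This is only a sketch: the helper names extract_mediaids and write_rows, the trimmed SAMPLE_XML string, and the "mediaid" header row are my own assumptions for illustration, not part of your data or of ElementTree.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed-down sample with the same structure as the question's data.xml
# (assumption: only the elements needed for the demonstration are kept).
SAMPLE_XML = """<?xml version="1.0"?>
<playlist>
    <eventlist>
        <event type="MEDIA">
            <title>title1</title>
            <mediaid>a</mediaid>
        </event>
        <event type="MEDIA">
            <title>THREE DAYS AND A CHILD</title>
            <mediaid>b</mediaid>
        </event>
    </eventlist>
</playlist>"""


def extract_mediaids(root):
    """Return the mediaid text of every <event type="MEDIA"> under root."""
    ids = []
    for event in root.findall('.//event[@type="MEDIA"]'):
        mediaid = event.find('mediaid')
        # Skip events with a missing or empty <mediaid> element
        if mediaid is not None and mediaid.text:
            ids.append(mediaid.text.strip())
    return ids


def write_rows(ids, outfile):
    """Write one mediaid per CSV row to an already opened text file."""
    writer = csv.writer(outfile)
    writer.writerow(['mediaid'])  # header row (hypothetical, drop if unwanted)
    for mediaid in ids:
        writer.writerow([mediaid])


root = ET.fromstring(SAMPLE_XML)
mediaids = extract_mediaids(root)

# An in-memory buffer stands in for the real output file here
buffer = io.StringIO()
write_rows(mediaids, buffer)
csv_text = buffer.getvalue()
```

For a real file, you would pass ET.parse('data.xml').getroot() to extract_mediaids and open the output with open('new.csv', 'w', newline='') before calling write_rows. Note also that the code in the question writes str(x), which is the representation of the Element object rather than its content; the value you want is in x.text.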