Extract specific data from XML, then write it to CSV using Python

Asked: 2019-06-11 03:45:41

Tags: xml python-3.x xml.etree

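Hi, I would like to extract the title and the date from the XML excerpt below. The title is in "title.block/short-title", and the date is in "court.date.block/court.date" where the type attribute is "judgment". I am new to Python and have not done much coding in it.

Could you please advise?

Here is the XML: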

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE lrs-conv PUBLIC "-//TLRGAP//DTD LRS Conversion DTD//EN" "http://dtd-server/document-store/DTD/lrs-conv.dtd">
    <lrs-conv end-page="228" runhead="YOUNG V ALLAN" series="VR" start-page="226" version="1.12"
        volume="[1959]">
        <court.block id="7088155" version="1">
            <court.name>SUPREME COURT OF VICTORIA</court.name>
        </court.block>
        <title.block id="7088154" version="1">
            <short-title>YOUNG v ALLAN</short-title>
        </title.block>
        <judge.block id="7088165" version="1">
            <judge.group>
                <judge>LOWE</judge>
                <join>, </join>
                <judge.title>J</judge.title>
            </judge.group>
        </judge.block>
        <court.date.block id="7088156" version="1">
            <court.date.group>
                <court.date type="hearing" value="19590304">4</court.date>
                <join>, </join>
                <court.date type="judgment" value="19590306">6 March 1959</court.date>
            </court.date.group>
        </court.date.block>
    </lrs-conv>

I tried the Python code below, but it is only a start: it extracts the title and dates from a single file and just prints them.

import xml.etree.ElementTree as ET
tree = ET.parse(r"C:\Users\u0119342\Desktop\TEST PY\[1959] VR 226.xml")
root = tree.getroot()
for title in root.iter('short-title'):
    print(title.attrib)
    print(title.text)
for date in root.iter('court.date'):
    print(date.attrib)
    print(date.text)

This is the result I get:

{}
YOUNG v ALLAN
{'type': 'hearing', 'value': '19590304'}
4
{'type': 'judgment', 'value': '19590306'}
6 March 1959
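
The output above lists both the hearing date and the judgment date, because root.iter('court.date') visits every court.date element. One possible way to keep only the judgment date is an attribute predicate in the limited XPath syntax that xml.etree.ElementTree supports. The sketch below assumes a trimmed copy of the XML from the question (the SAMPLE string is for illustration only, with the declaration and DOCTYPE omitted):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question, used here only for illustration.
SAMPLE = """<lrs-conv>
  <title.block id="7088154" version="1">
    <short-title>YOUNG v ALLAN</short-title>
  </title.block>
  <court.date.block id="7088156" version="1">
    <court.date.group>
      <court.date type="hearing" value="19590304">4</court.date>
      <join>, </join>
      <court.date type="judgment" value="19590306">6 March 1959</court.date>
    </court.date.group>
  </court.date.block>
</lrs-conv>"""

root = ET.fromstring(SAMPLE)

# findtext returns the text of the first matching element (None if no match).
title = root.findtext("title.block/short-title")

# The [@type='judgment'] predicate skips the hearing date;
# find returns the first match, or None when nothing matches.
judgment = root.find(".//court.date[@type='judgment']")

print(title)
if judgment is not None:
    print(judgment.get("type"), judgment.text)
```

The None check matters if some documents have no judgment date; without it, judgment.get would raise an AttributeError for those files.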

But I want to extract the data into a CSV like this:

Title               date type       date
YOUNG v ALLAN       judgment       6 March 1959
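
A layout like the one above could be produced with the standard csv module. The following is a minimal sketch, not a definitive solution: the helper names extract_row and write_rows are hypothetical, the column names are taken from the desired layout, and the demo uses in-memory buffers and a trimmed copy of the XML instead of real files.

```python
import csv
import io
import xml.etree.ElementTree as ET

def extract_row(source):
    """Return [title, date type, date] for one XML document.

    `source` is a file path or a binary file-like object, as accepted by ET.parse.
    """
    root = ET.parse(source).getroot()
    title = root.findtext(".//short-title", default="")
    # Keep only the court.date element whose type attribute is "judgment".
    judgment = root.find(".//court.date[@type='judgment']")
    if judgment is None:
        return [title, "", ""]  # assumption: leave the date columns empty
    return [title, judgment.get("type"), judgment.text or ""]

def write_rows(sources, out):
    """Write a header row plus one row per XML source to the open text file `out`."""
    writer = csv.writer(out)
    writer.writerow(["Title", "date type", "date"])
    for source in sources:
        writer.writerow(extract_row(source))

# Demo with in-memory data (trimmed copy of the XML from the question).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<lrs-conv>
  <title.block><short-title>YOUNG v ALLAN</short-title></title.block>
  <court.date.block>
    <court.date.group>
      <court.date type="hearing" value="19590304">4</court.date>
      <court.date type="judgment" value="19590306">6 March 1959</court.date>
    </court.date.group>
  </court.date.block>
</lrs-conv>"""

out = io.StringIO()
write_rows([io.BytesIO(SAMPLE)], out)
print(out.getvalue())
```

For real files, the same functions could be given a list of paths (for example from glob.glob) and an output file opened with open("output.csv", "w", newline="", encoding="utf-8"); the file name here is a placeholder. The newline="" argument is what the csv documentation recommends to avoid blank lines between rows on Windows.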

0 answers:

No answers