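Hi, I want to extract the title and the date from the XML excerpt below: the title from the "title.block/short-title" tag, and the date from "court.date.block/court.date" where the "type" attribute is "judgment". I'm new to Python and haven't done much coding in it.
Could you please advise?
Here is the XML: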
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE lrs-conv PUBLIC "-//TLRGAP//DTD LRS Conversion DTD//EN" "http://dtd-server/document-store/DTD/lrs-conv.dtd">
<lrs-conv end-page="228" runhead="YOUNG V ALLAN" series="VR" start-page="226" version="1.12"
volume="[1959]">
<court.block id="7088155" version="1">
<court.name>SUPREME COURT OF VICTORIA</court.name>
</court.block>
<title.block id="7088154" version="1">
<short-title>YOUNG v ALLAN</short-title>
</title.block>
<judge.block id="7088165" version="1">
<judge.group>
<judge>LOWE</judge>
<join>, </join>
<judge.title>J</judge.title>
</judge.group>
</judge.block>
<court.date.block id="7088156" version="1">
<court.date.group>
<court.date type="hearing" value="19590304">4</court.date>
<join>, </join>
<court.date type="judgment" value="19590306">6 March 1959</court.date>
</court.date.group>
</court.date.block>
</lrs-conv>
I tried the Python code below, but it is only a start: it reads a single file and just prints the title and dates it finds.
import xml.etree.ElementTree as ET
tree = ET.parse(r"C:\Users\u0119342\Desktop\TEST PY\[1959] VR 226.xml")
root = tree.getroot()
for title in root.iter('short-title'):
    print(title.attrib)
    print(title.text)
for date in root.iter('court.date'):
    print(date.attrib)
    print(date.text)
This is the result I get:
{}
YOUNG v ALLAN
{'type': 'hearing', 'value': '19590304'}
4
{'type': 'judgment', 'value': '19590306'}
6 March 1959
But I want to extract the data into a CSV file like this:
Title          Date type  Date
YOUNG v ALLAN  judgment   6 March 1959
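One possible way to get from the printed values to the CSV layout above is sketched here, using only the standard library (`xml.etree.ElementTree` and `csv`). It is a minimal sketch rather than a definitive solution. The names `SAMPLE_XML`, `extract_row` and `write_csv` are made up for illustration, the sample is a trimmed copy of the XML above, and the code assumes each document has one short title and at most one `court.date` with `type="judgment"`.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (illustrative only).
SAMPLE_XML = """<lrs-conv series="VR" volume="[1959]">
<title.block id="7088154" version="1">
<short-title>YOUNG v ALLAN</short-title>
</title.block>
<court.date.block id="7088156" version="1">
<court.date.group>
<court.date type="hearing" value="19590304">4</court.date>
<join>, </join>
<court.date type="judgment" value="19590306">6 March 1959</court.date>
</court.date.group>
</court.date.block>
</lrs-conv>"""


def extract_row(root):
    """Return [title, date type, date] for the judgment date of one document."""
    # Path is relative to the root element: title.block, then short-title.
    title = root.findtext("title.block/short-title", default="").strip()
    # The [@type='judgment'] predicate keeps only the judgment date.
    date = root.find("court.date.block//court.date[@type='judgment']")
    if date is None:
        # Assumption: leave the date columns empty when no judgment date exists.
        return [title, "", ""]
    return [title, date.get("type"), (date.text or "").strip()]


def write_csv(rows, out):
    """Write a header line followed by one CSV line per extracted row."""
    writer = csv.writer(out)
    writer.writerow(["Title", "Date type", "Date"])
    writer.writerows(rows)


root = ET.fromstring(SAMPLE_XML)
row = extract_row(root)

# An in-memory buffer stands in for a real output file here.
buffer = io.StringIO()
write_csv([row], buffer)
print(buffer.getvalue())
```

For a real file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE_XML)`, and the buffer could be replaced by a file opened with `open("output.csv", "w", newline="", encoding="utf-8")`, where the output filename is only a placeholder. To cover several XML files, `extract_row` could be called once per file and the resulting rows collected into a list before calling `write_csv`.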