Problem converting XML to CSV

Time: 2018-04-10 20:41:33

Tags: python xml elementtree

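I have an XML file containing citations from a literature search. I am trying to parse it into a CSV that I can open in Excel, importing only some of the nodes. The XML file has several thousand entries. One entry looks like this: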

<records>
<rec resultID="1">
    <controlInfo>
      <bkinfo>
        <btl>Effect of an intervention based on basic Buddhist principles on the spiritual well-being of patients with terminal cancer.</btl>
      </bkinfo>
      <dissinfo />
      <jinfo>
        <jtl>European Journal of Oncology Nursing</jtl>
        <issn>14623889</issn>
      </jinfo>
      <pubinfo>
        <dt year="2017" month="12" day="01">Dec2017</dt>
        <vid>31</vid>
      </pubinfo>
      <artinfo>
        <ui type="doi">10.1016/j.ejon.2017.08.005</ui>
        <ppf>46</ppf>
        <ppct>6</ppct>
        <formats />
        <tig>
          <atl>Effect of an intervention based on basic Buddhist principles on the spiritual well-being of patients with terminal cancer.</atl>
        </tig>
        <aug>
          <au>Chimluang, Janya</au>
          <au>Thanasilp, Sureeporn</au>
          <affil>Faculty of Nursing, Chulalongkorn University, Bangkok, Thailand</affil>
        </aug>
        <ab>Purpose To evaluate the effect of an intervention based on basic Buddhist principles on the spiritual well-being of patients with terminal cancer. Methods This quasi-experimental research study had pre- and post-test control groups. The experimental group received conventional care and an intervention based on basic Buddhist principles for three consecutive days, including seven activities based on precept activities, concentration activities and wisdom activities. The control group received conventional care alone. Results Forty-eight patients participated in this study: 23 in the experimental group and 25 in the control group. Their mean age was 53 (standard deviation 10) years. The spiritual well-being of participants in the experimental group was significantly higher than that of participants in the control group at the second post-test ( P &lt; 0.05). Conclusions An intervention based on basic Buddhist principles improved the spiritual well-being of patients with terminal cancer. This result supports the beneficial effects of implementing this type of intervention for patients with terminal cancer.</ab>
        <pubtype>Academic Journal</pubtype>
        <doctype>research</doctype>
        <doctype>Article</doctype>
      </artinfo>
      <language>English</language>
    </controlInfo>
    <displayInfo>
      <pLink>
        <url>http://search.ebscohost.com/login.aspx?direct=true&amp;db=jlh&amp;AN=126392076&amp;site=ehost-live</url>
      </pLink>
    </displayInfo>
</rec>
</records>

What I want to import is:

  • the first au (author)
  • jtl (journal title)
  • dt (date)
  • vid (volume)
  • ppf (first page)
  • ppct (page count)
  • btl (article title)
  • ab (abstract)

I am trying the following code:

import xml.etree.ElementTree as ET
import csv

tree = ET.parse("citations.xml")
root = tree.getroot()

# open a file for writing

citation_data = open('test.csv', 'w')

# create the csv writer object

csvwriter = csv.writer(citation_data)

count = 0

head = ['Author','Title']
csvwriter.writerow(head)

for member in root.findall('records'):
    citation = []
    au = member.find('rec').find('controlInfo').find('artinfo').find('aug').find('au').text
    citation.append(au)
    btl = member.find('rec').find('controlInfo').find('bkinfo').find('btl').text
    citation.append(btl)
    csvwriter.writerow(citation)
citation_data.close()

I get the following error:

Traceback (most recent call last):
  File "test.py", line 22, in <module>
    au = member.find('controlInfo').find('artinfo').find('aug').find('au').text
AttributeError: 'NoneType' object has no attribute 'find'

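I have tried several variations of this code, including versions without the repeated ".find" calls, but I get the same result.

I could not find a solution in any of the other examples here. I would appreciate some gentle guidance and help, as I am new to Python and this is my first project.

Thanks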

A

1 answer:

Answer 0 (score: 0)

Some <rec> entries in your XML are missing this structure:

<artinfo>
  <aug>

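So member.find('controlInfo').find('artinfo').find('aug').find('au') fails to find one of the tags along the way and returns None, because that entry has no <artinfo> or no <aug>. The next .find is then called on None, which raises the AttributeError you see.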
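To make the failure concrete, here is a minimal sketch that reproduces it on a made-up entry. The XML string below is a hypothetical record that lacks <artinfo>; it is not taken from your file.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry without <artinfo>/<aug>, for illustration only
rec = ET.fromstring(
    "<rec><controlInfo><bkinfo><btl>Some title</btl></bkinfo></controlInfo></rec>"
)

# find() does not raise when the child is absent; it returns None
artinfo = rec.find("controlInfo").find("artinfo")
print(artinfo)

# Chaining another find() onto that None is what raises the error
try:
    artinfo.find("aug")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```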

After each find, check whether the return value is None before calling the next find, something like:

a = member.find('controlInfo')
if a is not None:
    # only descend when the parent element exists
    a = a.find('artinfo')

and so on...
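
An alternative to writing every None check by hand is ElementTree's findtext with a path and a default value, which returns the default when any step of the path is missing. The sketch below is one possible way to do it, not the only one. The column names, the FIELDS mapping, and the helpers rec_to_row and convert are names I made up; the paths come from the sample entry in the question. It also iterates over the <rec> elements with root.iter('rec'), on the assumption that <records> is the root of the file (in that case root.findall('records') would match nothing).

```python
import csv
import io
import xml.etree.ElementTree as ET

# Column name -> path relative to each <rec>, based on the sample entry
FIELDS = {
    "Author": "controlInfo/artinfo/aug/au",
    "Journal": "controlInfo/jinfo/jtl",
    "Date": "controlInfo/pubinfo/dt",
    "Volume": "controlInfo/pubinfo/vid",
    "FirstPage": "controlInfo/artinfo/ppf",
    "PageCount": "controlInfo/artinfo/ppct",
    "Title": "controlInfo/bkinfo/btl",
    "Abstract": "controlInfo/artinfo/ab",
}


def rec_to_row(rec):
    # findtext returns the first match's text, or the default if the path is missing
    return [rec.findtext(path, default="") for path in FIELDS.values()]


def convert(xml_source, csv_file):
    # xml_source may be a file name or a file object; csv_file is an open text file
    writer = csv.writer(csv_file)
    writer.writerow(list(FIELDS))
    root = ET.parse(xml_source).getroot()
    for rec in root.iter("rec"):
        writer.writerow(rec_to_row(rec))


# Small in-memory demo: the second entry has no <artinfo> at all
sample = (
    "<records>"
    "<rec><controlInfo>"
    "<jinfo><jtl>European Journal of Oncology Nursing</jtl></jinfo>"
    "<artinfo><aug><au>Chimluang, Janya</au><au>Thanasilp, Sureeporn</au></aug></artinfo>"
    "</controlInfo></rec>"
    "<rec><controlInfo><jinfo><jtl>No authors here</jtl></jinfo></controlInfo></rec>"
    "</records>"
)
out = io.StringIO()
convert(io.StringIO(sample), out)
rows = list(csv.reader(io.StringIO(out.getvalue())))
for row in rows:
    print(row)
```

For the real files you would pass "citations.xml" as the source and a file opened with open('test.csv', 'w', newline='', encoding='utf-8') as the output. Entries with a missing element simply get an empty cell instead of stopping the script.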