Defining an XML schema from a txt file with a Python script and storing it in a single XML file

Time: 2017-07-20 20:02:54

Tags: python xml

I am trying to write an XML schema in Python from a .txt file. I tried the following code, but it does not read the values from the lines of text.

     The data is like:
      #  File format is Team:Player:Salary:Position
      New York Yankees :"Acevedo Juan"  :   900000: Pitcher
      New York Yankees :"Anderson Jason":   300000: Pitcher 
      ............

And the code:

    import re
    import xml.etree.ElementTree as ET
    root = ET.Element('root')
    root.text = '\n'    # newline before the celldata element
    f = open("C:/baseball.txt")
    lines = f.readlines()
    for l in lines:
        elems = l.split(":")
        if len(elems) == 4:
            elems = map(lambda x: x.strip(), elems)
            playerdata = ET.SubElement(root, "playerdata")
            playerdata.text = '\n'
            playerdata.tail = '\n\n'
            team = ET.SubElement(playerdata, "team")
            player = ET.SubElement(playerdata, "player")
            salary = ET.SubElement(playerdata, "salary")
            position = ET.SubElement(playerdata, "position")
    ET.dump(root)
    tree = ET.ElementTree(root)
    tree.write("test1.xml", encoding='utf-8', xml_declaration=True)

The output I get looks like this:

     <root>
     <playerdata>
     <team /><player /><salary /><position /></playerdata>

     <playerdata>
     <team /><player /><salary /><position /></playerdata>

     <playerdata>
     <team /><player /><salary /><position /></playerdata>
     .
     .
     .
     <playerdata>
     <team /><player /><salary /><position /></playerdata>

     </root>

1 Answer:

Answer 0: (score: 0)

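A few changes, plus the fix. The changes are:

  1. Use a context manager to ensure the input file is closed properly.
  2. Use the fact that iterating over a file yields each line in turn (instead of calling f.readlines()).
  3. Combine splitting the line into parts and stripping them into a single statement.
  4. The fix itself is to set the text value of each child element immediately after it is created. The original code created the child elements but never assigned anything to them, which is why they came out empty.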

    import xml.etree.ElementTree as ET

    root = ET.Element('root')
    root.text = '\n'    # newline before the first playerdata element
    with open("C:/baseball.txt") as f:
        for line in f:
            # Split on ':' and strip whitespace from each part in one step.
            elems = [x.strip() for x in line.split(":")]
            if len(elems) == 4:
                playerdata = ET.SubElement(root, "playerdata")
                playerdata.text = '\n'
                playerdata.tail = '\n\n'
                # Set each child's text right after creating it.
                team = ET.SubElement(playerdata, "team")
                team.text = elems[0]
                player = ET.SubElement(playerdata, "player")
                player.text = elems[1]
                salary = ET.SubElement(playerdata, "salary")
                salary.text = elems[2]
                position = ET.SubElement(playerdata, "position")
                position.text = elems[3]

    ET.dump(root)
    tree = ET.ElementTree(root)
    tree.write("test1.xml", encoding='utf-8', xml_declaration=True)