XML to CSV using minidom

Time: 2013-07-19 17:55:47

Tags: python xml minidom

I have a problem with my program and need some help. I need saida.txt to contain the following:

+,adj,teçste

+,adj,oiaã

+,adv,123

+,adv,oshi

- ,adv,teste1

- ,adv,oi1

But this is all I get in my saida.txt:

+, adv, 123 
+, adv, oshi 
-, adv ,teste1
-, adv ,oi1

My XML ("pedaco.xml") is:

<data>
      <ver>
        <pontuacao>+</pontuacao>
        <nuver>palavra1</nuver>
        <tiver>palavra2</tiver>
        <cl>
         <nocl>adj</nocl>
             <an>teçste</an>
             <an>oiaã</an>
        </cl>
        <cl> 
            <nocl> adv</nocl>
            <an> 123 </an>
            <an> oshi </an>
        </cl>
    </ver>

      <ver>
        <pontuacao>-</pontuacao>
        <nuver>palavra3</nuver>
        <tiver>palavra4</tiver>
        <cl>
         <nocl>adv</nocl>
             <an>teste1</an>
             <an>oi1</an>
        </cl>

      </ver>
     </data>

My complete code is:

# -*- coding: utf-8 -*-
from xml.dom import minidom
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
xmldoc = minidom.parse("pedaco.xml")

arquivo = open('saida.txt','w')

ver = xmldoc.getElementsByTagName('ver')
for node in ver:
  nuver = node.getElementsByTagName('nuver')
  tiver = node.getElementsByTagName('tiver')

  pontuacao = node.getElementsByTagName('pontuacao')
  cl = node.getElementsByTagName('cl')

  for c in cl:
    an = c.getElementsByTagName('an')
    nocl = c.getElementsByTagName('nocl')


  for a in pontuacao:
    printando1 = a.childNodes[0].nodeValue
    for b in nocl:
      printando2 = b.childNodes[0].nodeValue
      for c in an:
        printando3 = c.childNodes[0].nodeValue
        arquivo.write(printando1+",")
        arquivo.write(printando2+",")
        arquivo.write(printando3+"\n")
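
A likely cause of the missing `adj` rows: the `for c in cl:` loop only reassigns `an` and `nocl` on each pass, so by the time the writing loops run, only the last `<cl>` of each `<ver>` is left. Below is a minimal sketch of one way to restructure it, writing the rows while still inside each `<cl>`. The function names `iter_rows` and `convert` are hypothetical, and the sketch assumes every `<pontuacao>`, `<nocl>` and `<an>` element contains text and that surrounding whitespace should be stripped. Opening the output with an explicit encoding via `io.open` should also make the `reload(sys)` / `setdefaultencoding` workaround unnecessary.

```python
# -*- coding: utf-8 -*-
import io
from xml.dom import minidom


def iter_rows(xmldoc):
    """Yield one (pontuacao, nocl, an) tuple per <an> element."""
    for ver in xmldoc.getElementsByTagName('ver'):
        # Assumption: each <ver> has exactly one <pontuacao> with text.
        pontuacao = ver.getElementsByTagName('pontuacao')[0].firstChild.nodeValue.strip()
        for cl in ver.getElementsByTagName('cl'):
            # Look up <nocl> and <an> inside the current <cl> only,
            # so earlier <cl> blocks are not overwritten by later ones.
            nocl = cl.getElementsByTagName('nocl')[0].firstChild.nodeValue.strip()
            for an in cl.getElementsByTagName('an'):
                yield pontuacao, nocl, an.firstChild.nodeValue.strip()


def convert(xml_path, out_path):
    """Parse xml_path and write one comma-separated line per row."""
    xmldoc = minidom.parse(xml_path)
    # Explicit UTF-8 so characters such as 'ç' and 'ã' are written correctly.
    with io.open(out_path, 'w', encoding='utf-8') as arquivo:
        for row in iter_rows(xmldoc):
            arquivo.write(u",".join(row) + u"\n")
```

Calling `convert("pedaco.xml", "saida.txt")` on the XML above should produce one line for every `<an>` in every `<cl>`, for example `+,adj,teçste`, rather than only the rows from the last `<cl>`.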

0 answers:

No answers