Question

I'm having trouble getting some values out of an XML file. The error is IndexError: list index out of range

XML

<?xml version="1.0" encoding="UTF-8"?>
<nfeProc xmlns="http://www.portalfiscal.inf.br/nfe" versao="3.10">
    <NFe xmlns="http://www.portalfiscal.inf.br/nfe">
        <infNFe Id="NFe35151150306471000109550010004791831003689145" versao="3.10">
            <ide>
                <nNF>479183</nNF>
            </ide>
            <emit>
                <CNPJ>3213213212323</CNPJ>
            </emit>
            <det nItem="1">
                <prod>
                    <cProd>7030-314</cProd>
                </prod>
                <imposto>
                    <ICMS>
                        <ICMS10>
                            <orig>1</orig>
                            <CST>10</CST>
                            <vICMS>10.35</vICMS>
                            <vICMSST>88.79</vICMSST>
                        </ICMS10>
                    </ICMS>
                </imposto>
            </det>
            <det nItem="2">
                <prod>
                    <cProd>7050-6</cProd>
                </prod>
                <imposto>
                    <ICMS>
                        <ICMS00>
                            <orig>1</orig>
                            <CST>00</CST>
                            <vICMS>7.49</vICMS>
                        </ICMS00>
                    </ICMS>
                </imposto>
            </det>
        </infNFe>
    </NFe>
</nfeProc>

I'm extracting values from the XML, and some of the XML may contain both the vICMS and vICMSST tags:

vicms = doc.getElementsByTagName('vICMS')[i].firstChild.nodeValue

vicmsst = doc.getElementsByTagName('vICMSST')[1].firstChild.nodeValue

Output:

The first imposto returns:

print vicms
>> 10.35
print vicmsst
>> 88.79

The second imposto CRASHES because the vICMSST tag is not found...

**IndexError: list index out of range**

What is the best way to test for this? I'm using xml.etree.ElementTree:

My code:

import os
import sys
import subprocess
import base64,xml.dom.minidom
from xml.dom.minidom import Node
import glob
import xml.etree.ElementTree as ET

origem = 0
# only loops over XML documents in folder
for file in glob.glob("*.xml"):    
    f = open("%s" % file,'r')
    data = f.read()
    i = 0
    doc = xml.dom.minidom.parseString(data)
    for topic in doc.getElementsByTagName('emit'):
       #Get Fiscal Number
       nnf= doc.getElementsByTagName('nNF')[i].firstChild.nodeValue
       print 'Fiscal Number  %s' % nnf
       print '\n'
       for prod in doc.getElementsByTagName('det'):
            vicms = 0
            vicmsst = 0
            #Get value of ICMS
            vicms = doc.getElementsByTagName('vICMS')[i].firstChild.nodeValue
            #Get value of VICMSST
            vicmsst = doc.getElementsByTagName('vICMSST')[i].firstChild.nodeValue   
            #PRINT INFO
            print 'ICMS %s' % vicms
            print 'Valor do ICMSST: %s' % vicmsst
            print '\n\n'
            i +=1
print '\n\n'

Answer 1

There is only one vICMSST tag in the XML document. So when i=1, the following line raises an IndexError.

vicmsst = doc.getElementsByTagName('vICMSST')[1].firstChild.nodeValue

You can restructure it as:

try:
    vicmsst = doc.getElementsByTagName('vICMSST')[i].firstChild.nodeValue
except IndexError:
    # set a default value or deal with this however you like
    vicmsst = None  # placeholder default so the block is valid Python

Without knowing more about what you are trying to do, it's hard to say what you should do with the exception.
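
One way to avoid the exception altogether is to search inside each det element instead of indexing into a document-wide list. The sketch below stays with minidom, as in the question. The trimmed-down SAMPLE string and the first_text helper are illustrative assumptions, not part of the original code.

```python
import xml.dom.minidom

# Assumption: a trimmed-down sample with the same shape as the question's XML.
SAMPLE = """<nfeProc xmlns="http://www.portalfiscal.inf.br/nfe" versao="3.10">
  <det nItem="1"><imposto><ICMS><ICMS10>
    <vICMS>10.35</vICMS><vICMSST>88.79</vICMSST>
  </ICMS10></ICMS></imposto></det>
  <det nItem="2"><imposto><ICMS><ICMS00>
    <vICMS>7.49</vICMS>
  </ICMS00></ICMS></imposto></det>
</nfeProc>"""


def first_text(parent, tag, default=None):
    """Hypothetical helper: text of the first <tag> below parent, or default."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.nodeValue


doc = xml.dom.minidom.parseString(SAMPLE)

rows = []
for det in doc.getElementsByTagName('det'):
    # The lookup is scoped to this det, so a missing vICMSST in one item
    # cannot shift or exhaust the indexes used for the other items.
    rows.append((first_text(det, 'vICMS'), first_text(det, 'vICMSST')))

for vicms, vicmsst in rows:
    print('ICMS %s, ICMSST %s' % (vicms, vicmsst))
```

Because each lookup starts from its own det, the second item simply yields the default for vICMSST instead of raising.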

Answer 2

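You are making a few common mistakes in your code.

Don't use a counter to index into a list whose length you don't know. In general, iterating with for .. in is much better than using indexes.
You have a lot of imports that you don't seem to use; get rid of them.
You can use minidom, but ElementTree is a better fit for your task because it supports searching for nodes with a subset of XPath and it supports XML namespaces.
Don't read XML files in as strings and then use parseString. Let the XML parser handle the file directly. That way all file-encoding issues are dealt with properly.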
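
The namespace point is easy to trip over, so here is a small sketch of how ElementTree behaves with and without a prefix map. The flattened SAMPLE string and the variable names are assumptions made for illustration; only the namespace URI comes from the question.

```python
import xml.etree.ElementTree as ET

# Assumption: a flattened sample that keeps only the default namespace
# and the two det elements from the question.
SAMPLE = """<nfeProc xmlns="http://www.portalfiscal.inf.br/nfe" versao="3.10">
  <det nItem="1"><vICMS>10.35</vICMS><vICMSST>88.79</vICMSST></det>
  <det nItem="2"><vICMS>7.49</vICMS></det>
</nfeProc>"""

root = ET.fromstring(SAMPLE)
ns = {"nfe": "http://www.portalfiscal.inf.br/nfe"}

# Without a namespace the search matches nothing, because the elements
# are really named {http://www.portalfiscal.inf.br/nfe}det.
unqualified = root.findall(".//det")

# With a prefix map the same search finds both det elements.
qualified = root.findall(".//nfe:det", ns)

# find() returns None for a missing element instead of raising,
# so absence can be tested directly, with no index involved.
missing = qualified[1].find("nfe:vICMSST", ns)

print(len(unqualified), len(qualified), missing)
```

Iterating over the qualified list with for .. in and testing each find() result for None covers both the first and the third point above.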

The following is much better than your original approach.

import glob
import xml.etree.ElementTree as ET

def get_text(context_elem, xpath, xmlns=None):
    """ helper function that gets the text value of a node """
    node = context_elem.find(xpath, xmlns)
    if node is not None:
        return node.text
    else:
        return ""

# set up XML namespace URIs
xmlns = {
    "nfe": "http://www.portalfiscal.inf.br/nfe"
}

for path in glob.glob("*.xml"):
    doc = ET.parse(path)

    for infNFe in doc.iterfind('.//nfe:infNFe', xmlns):
        print 'Fiscal Number\t%s' % get_text(infNFe, ".//nfe:nNF", xmlns)

        for det in infNFe.iterfind(".//nfe:det", xmlns):
            print ' ICMS\t%s' % get_text(det, ".//nfe:vICMS", xmlns)
            print ' Valor do ICMSST:\t%s' % get_text(det, ".//nfe:vICMSST", xmlns)

print '\n\n'
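
As a quick check of this approach, the sketch below writes a cut-down copy of the sample document to a temporary file and reads it back with ET.parse. It uses ElementTree's built-in findtext, which, like get_text above, returns a default when the element is missing. The read_items function, the NS name, and the temporary file are assumptions for illustration only.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

NS = {"nfe": "http://www.portalfiscal.inf.br/nfe"}

# Assumption: a cut-down copy of the question's document, stored as bytes
# so the parser decides how to decode it from the XML declaration.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<nfeProc xmlns="http://www.portalfiscal.inf.br/nfe" versao="3.10">
  <NFe><infNFe versao="3.10">
    <ide><nNF>479183</nNF></ide>
    <det nItem="1"><imposto><ICMS><ICMS10>
      <vICMS>10.35</vICMS><vICMSST>88.79</vICMSST>
    </ICMS10></ICMS></imposto></det>
    <det nItem="2"><imposto><ICMS><ICMS00>
      <vICMS>7.49</vICMS>
    </ICMS00></ICMS></imposto></det>
  </infNFe></NFe>
</nfeProc>"""


def read_items(path):
    """Hypothetical helper: (fiscal number, [(vICMS, vICMSST), ...])."""
    doc = ET.parse(path)  # the parser opens the file and handles its encoding
    number = doc.findtext(".//nfe:nNF", default="", namespaces=NS)
    items = [
        (det.findtext(".//nfe:vICMS", default="", namespaces=NS),
         det.findtext(".//nfe:vICMSST", default="", namespaces=NS))
        for det in doc.iterfind(".//nfe:det", NS)
    ]
    return number, items


# Write the sample to disk so that ET.parse reads a real file.
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(SAMPLE)
try:
    number, items = read_items(tmp.name)
finally:
    os.remove(tmp.name)

print(number, items)
```

The second det has no vICMSST, so it comes back as an empty string rather than raising an error.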

Python: XML list index out of range
