Python: object of type 'NoneType' has no len()

Date: 2015-12-15 18:27:06

Tags: python

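I've run into a problem, and I'm hoping the Stack Overflow folks can help.

Whenever I try to separate out one class of documents, I keep getting the error message: object of type 'NoneType' has no len()

The full traceback is: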

C:\>C:\Python27\python.exe C:\Testing\test.py "C:\Testing\IN" "C:\Testing\Outputs" "C:\Testing\Test.csv"

C:\Testing\IN\000001.000001.xml

Traceback (most recent call last):
  File "C:\Testing\test.py", line 46, in <module>
    if len(documentclass)==0:
TypeError: object of type 'NoneType' has no len()

C:\>

Here is the code:

import csv, sys, os
import shutil
import xml.etree.ElementTree as ET


if __name__ == '__main__':
    if not (len(sys.argv) == 4):
        print 'USAGE: %s inFolder OutFolder csvFile' % (sys.argv[0])
    else:        
        inFolder    = sys.argv[1]
        outFolder   = sys.argv[2]
        className   = sys.argv[3]

        count = 0        
        for fileName in os.listdir(inFolder):
            if fileName.endswith(".pdf"):                
                baseName = fileName.split('.pdf')[0]
                pdfFile = inFolder+"\\"+baseName+".pdf"
                xmlFile = inFolder+"\\"+baseName+".xml"
                validatedXmlFile = inFolder+"\\"+baseName+".xml.validated.xml"                                
                xmlSize = os.path.getsize(xmlFile)
                pdfSize = os.path.getsize(pdfFile)                
                if xmlSize>0 and pdfSize>0:
                    print
                    print xmlFile
                    count = count + 1
                    tree = ET.parse(xmlFile)
                    root_xml = tree.getroot()
                    form_xml = root_xml[0]
                    #form_xml = root_xml[1]
                    documentclass_xml = form_xml.find('DocumentClassGlobal')                    
                    documentclassLocal_xml = form_xml.find('DocumentClassLocal')                
                    #documentclass_xml = form_xml.find('SSMClassID')
                    if documentclass_xml is not None:
                        documentclass = documentclass_xml.find('data').text                       
                    elif documentclassLocal_xml is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
                    else:
                        documentclass = ""                        
                    if len(documentclass)==0:
                        documentclass = "UNKNOWN"                
                    print documentclass                 

                    if documentclass == className:                                    
                        if not os.path.exists(outFolder + "\\" + documentclass):
                            os.makedirs(outFolder + "\\" + documentclass)
                        inBaseFile = inFolder + "\\"+baseName
                        outBaseFile = outFolder + "\\" + documentclass+"\\"+baseName                                
                        inFile = inBaseFile+".pdf"
                        outFile = outBaseFile+".pdf"
                        print inFile
                        print outFile
                        shutil.copy(inBaseFile+".pdf", outBaseFile+".pdf")
                        shutil.copy(inBaseFile+".pdf.conf.xml", outBaseFile+".pdf.conf.xml")
                        shutil.copy(inBaseFile+".pdf.multi.txt", outBaseFile+".pdf.multi.txt")
                        shutil.copy(inBaseFile+".pdf.txt", outBaseFile+".pdf.txt")
                        #shutil.move(inBaseFile+".wdb", outBaseFile+".wdb")
                        shutil.copy(inBaseFile+".xml", outBaseFile+".xml")
                        if os.path.exists(inBaseFile+".xml.validated.xml"):
                            shutil.copy(inBaseFile+".xml.validated.xml", outBaseFile+".xml.validated.xml")
                        if os.path.exists(inBaseFile+".xml.validationinfo.xml"):
                            shutil.copy(inBaseFile+".xml.validationinfo.xml", outBaseFile+".xml.validationinfo.xml")


        print '%d files found and copied.' % (count)

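Apparently documentclass is None by the time `if len(documentclass)==0` runs. The idea is to assign "UNKNOWN" to documentclass both when it is None and when its length is 0.

So far I have come up with the following, but with no success. Any ideas?

Many thanks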

                    if documentclass_xml is not None:
                        documentclass = documentclass_xml.find('data').text
                    elif documentclassLocal_xml is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
                    if documentclass_xml is None:
                        documentclass = "UNKNOWN"
                    elif documentclassLocal_xml is None:
                        documentclass = "UNKNOWN"
                    else:
                        documentclass = ""
                    if len(documentclass)==0:
                        documentclass = "UNKNOWN"
                    print documentclass

2 Answers:

Answer 0: (score: 3)

The traceback tells you that the value of documentclass is None. You initialize it here:

                if documentclass_xml is not None:
                    documentclass = documentclass_xml.find('data').text                       
                elif documentclassLocal_xml is not None:
                    documentclass = documentclassLocal_xml.find('data').text
                    documentclass = documentclass + "_Local"
                else:
                    documentclass = ""   

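So at least one branch of the if must be assigning None to it. Accessing the text attribute of an ElementTree element returns None if the element has no text content. It has to be the first branch of the conditional, because otherwise the attempt to append "_Local" would already have raised an error.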

>>> data = ET.fromstring('<test/>')
>>> data.text is None
True

So you are accessing an empty <data/> node.
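
To make that concrete, here is a minimal sketch that reproduces the failure and shows one way to guard against it. The XML string is a made-up sample (only the element names come from the question), and it is written so that it also runs on Python 3:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the class element exists, but its <data/> child is empty.
root = ET.fromstring(
    "<form><DocumentClassGlobal><data/></DocumentClassGlobal></form>"
)
data = root.find("DocumentClassGlobal").find("data")

# An element without text content has text == None, so len() raises TypeError.
try:
    len(data.text)
    failed = False
except TypeError:
    failed = True

# Normalising None to "" keeps the later len() check working.
documentclass = data.text or ""
if len(documentclass) == 0:
    documentclass = "UNKNOWN"
print(documentclass)
```

With `data.text or ""` the existing `len(documentclass)==0` check can stay as it is, since it only ever sees a string.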

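Answer 1: (score: 0)

It turned out that some changes had been made to DocumentClassGlobal and DocumentClassLocal. Previously, these elements were only created when documentclass_xml.find('data').text had a value. Now that rule is ignored, and documentclass_xml.find('data').text can be None. I made the following adjustment, and it did the trick. Thanks to mgilson for pointing it out.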

                    if documentclass_xml is not None and documentclass_xml.find('data').text is not None:
                        documentclass = documentclass_xml.find('data').text
                    # fall back to the local class only if its <data> has text
                    elif documentclassLocal_xml is not None and documentclassLocal_xml.find('data').text is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
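
For what it's worth, the same fallback could probably be written more compactly with `findtext()`, which returns None when the path does not match and an empty string when the matched element has no text. This is only a sketch: `get_document_class` is a hypothetical helper name and the sample XML is invented for illustration.

```python
import xml.etree.ElementTree as ET


def get_document_class(form_xml):
    """Return the document class, or "UNKNOWN" if none is recorded."""
    # findtext() gives None for a missing element and "" for an empty one,
    # so a plain truthiness test covers both cases.
    global_class = form_xml.findtext("DocumentClassGlobal/data")
    if global_class:
        return global_class
    local_class = form_xml.findtext("DocumentClassLocal/data")
    if local_class:
        return local_class + "_Local"
    return "UNKNOWN"


# Hypothetical sample: empty global class, populated local class.
form = ET.fromstring(
    "<form>"
    "<DocumentClassGlobal><data/></DocumentClassGlobal>"
    "<DocumentClassLocal><data>Invoice</data></DocumentClassLocal>"
    "</form>"
)
print(get_document_class(form))
```

Because the helper always returns a string, the caller no longer needs a separate `len()` check before comparing against the class name.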