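I have run into a problem that I hope the Stack Overflow community can help with.
Whenever I try to separate out one class of documents, I keep getting the error message: object of type 'NoneType' has no len().
The full traceback is: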
C:\>C:\Python27\python.exe C:\Testing\test.py "C:\Testing\IN" "C:\Testing\Outputs" "C:\Testing\Test.csv"
C:\Testing\IN\000001.000001.xml
Traceback (most recent call last):
  File "C:\Testing\test.py", line 46, in <module>
    if len(documentclass)==0:
TypeError: object of type 'NoneType' has no len()
C:\>
Here is the code:
import csv, sys, os
import shutil
import xml.etree.ElementTree as ET

if __name__ == '__main__':
    if not (len(sys.argv) == 4):
        print 'USAGE: %s inFolder OutFolder csvFile' % (sys.argv[0])
    else:
        inFolder = sys.argv[1]
        outFolder = sys.argv[2]
        className = sys.argv[3]
        count = 0
        for fileName in os.listdir(inFolder):
            if fileName.endswith(".pdf"):
                baseName = fileName.split('.pdf')[0]
                pdfFile = inFolder+"\\"+baseName+".pdf"
                xmlFile = inFolder+"\\"+baseName+".xml"
                validatedXmlFile = inFolder+"\\"+baseName+".xml.validated.xml"
                xmlSize = os.path.getsize(xmlFile)
                pdfSize = os.path.getsize(pdfFile)
                if xmlSize>0 and pdfSize>0:
                    print
                    print xmlFile
                    count = count + 1
                    tree = ET.parse(xmlFile)
                    root_xml = tree.getroot()
                    form_xml = root_xml[0]
                    #form_xml = root_xml[1]
                    documentclass_xml = form_xml.find('DocumentClassGlobal')
                    documentclassLocal_xml = form_xml.find('DocumentClassLocal')
                    #documentclass_xml = form_xml.find('SSMClassID')
                    if documentclass_xml is not None:
                        documentclass = documentclass_xml.find('data').text
                    elif documentclassLocal_xml is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
                    else:
                        documentclass = ""
                    if len(documentclass)==0:
                        documentclass = "UNKNOWN"
                    print documentclass
                    if documentclass == className:
                        if not os.path.exists(outFolder + "\\" + documentclass):
                            os.makedirs(outFolder + "\\" + documentclass)
                        inBaseFile = inFolder + "\\"+baseName
                        outBaseFile = outFolder + "\\" + documentclass+"\\"+baseName
                        inFile = inBaseFile+".pdf"
                        outFile = outBaseFile+".pdf"
                        print inFile
                        print outFile
                        shutil.copy(inBaseFile+".pdf", outBaseFile+".pdf")
                        shutil.copy(inBaseFile+".pdf.conf.xml", outBaseFile+".pdf.conf.xml")
                        shutil.copy(inBaseFile+".pdf.multi.txt", outBaseFile+".pdf.multi.txt")
                        shutil.copy(inBaseFile+".pdf.txt", outBaseFile+".pdf.txt")
                        #shutil.move(inBaseFile+".wdb", outBaseFile+".wdb")
                        shutil.copy(inBaseFile+".xml", outBaseFile+".xml")
                        if os.path.exists(inBaseFile+".xml.validated.xml"):
                            shutil.copy(inBaseFile+".xml.validated.xml", outBaseFile+".xml.validated.xml")
                        if os.path.exists(inBaseFile+".xml.validationinfo.xml"):
                            shutil.copy(inBaseFile+".xml.validationinfo.xml", outBaseFile+".xml.validationinfo.xml")
        print '%d files found and copied.' % (count)
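Apparently the check if len(documentclass)==0 is hitting a None value. The idea is to assign "UNKNOWN" to documentclass both when it is None and when it has length 0.
So far I have come up with the following, but without success. Any ideas?
Thanks a lot.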
if documentclass_xml is not None:
    documentclass = documentclass_xml.find('data').text
elif documentclassLocal_xml is not None:
    documentclass = documentclassLocal_xml.find('data').text
    documentclass = documentclass + "_Local"
if documentclass_xml is None:
    documentclass = "UNKNOWN"
elif documentclassLocal_xml is None:
    documentclass = "UNKNOWN"
else:
    documentclass = ""
if len(documentclass)==0:
    documentclass = "UNKNOWN"
print documentclass
Answer 0 (score: 3)
The traceback tells you that documentclass has the value None. You initialize it with:
if documentclass_xml is not None:
    documentclass = documentclass_xml.find('data').text
elif documentclassLocal_xml is not None:
    documentclass = documentclassLocal_xml.find('data').text
    documentclass = documentclass + "_Local"
else:
    documentclass = ""
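So at least one branch of that if must be assigning None to it. Accessing the text attribute of an ElementTree node returns None when the node has no text content. It has to be the first branch of the conditional, because in the second branch the attempt to append "_Local" to None would have raised a TypeError on that line instead.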
>>> data = ET.fromstring('<test/>')
>>> data.text is None
True
So you are accessing an empty <data/> node.
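To make the reasoning above concrete, here is a minimal sketch that reproduces the situation. The XML string is made up for illustration and is not the asker's real file, and error_name is a hypothetical helper used only to capture the exception type. The code avoids print statements so it should run on both Python 2.7 and Python 3.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a DocumentClassGlobal element whose <data/> child is empty.
form_xml = ET.fromstring(
    '<form><DocumentClassGlobal><data/></DocumentClassGlobal></form>'
)

# An element without text content reports text as None, not as "".
documentclass = form_xml.find('DocumentClassGlobal').find('data').text


def error_name(func):
    """Return the name of the exception func raises, or None if it succeeds."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None


# First branch: None is stored silently and len() only fails later.
len_error = error_name(lambda: len(documentclass))

# Second branch: appending "_Local" to None would fail on that very line.
concat_error = error_name(lambda: documentclass + "_Local")

# One possible guard (an assumption, not the asker's code): map None to "".
safe_documentclass = documentclass or ""
```

With the `or ""` guard in place, the existing len(documentclass)==0 check would see an empty string and could assign "UNKNOWN" as intended.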
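Answer 1 (score: 0)
It turned out that some changes had been made to DocumentClassGlobal and DocumentClassLocal. Previously, these elements were only created when documentclass_xml.find('data').text had a value. Now that rule is ignored, so documentclass_xml.find('data').text can be None. I made the following adjustment and it did the trick. Thanks to mgilson for pointing this out.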
# Use the global class only if its <data> node actually contains text.
if documentclass_xml is not None and documentclass_xml.find('data').text is not None:
    documentclass = documentclass_xml.find('data').text
# Check the local node here (not documentclass_xml, which may be None in this branch).
elif documentclassLocal_xml is not None and documentclassLocal_xml.find('data').text is not None:
    documentclass = documentclassLocal_xml.find('data').text
    documentclass = documentclass + "_Local"
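As an alternative to repeating the find('data').text lookups, the same fallback logic could be collected into one small function. This is only a sketch under assumptions: get_document_class is a hypothetical name, and it relies on Element.findtext, which returns None when the child is missing and an empty string when the child exists but has no text.

```python
import xml.etree.ElementTree as ET


def get_document_class(form_xml):
    """Return the document class for a form element, or "UNKNOWN".

    Tries DocumentClassGlobal first, then DocumentClassLocal (with a
    "_Local" suffix), skipping nodes whose <data> child is missing or empty.
    """
    candidates = (
        ('DocumentClassGlobal', ''),
        ('DocumentClassLocal', '_Local'),
    )
    for tag, suffix in candidates:
        node = form_xml.find(tag)
        if node is None:
            continue
        # None if <data> is absent, "" if it is present but empty.
        text = node.findtext('data')
        if text:
            return text + suffix
    return "UNKNOWN"
```

In the original script this would replace the if/elif/else chain and the len() check, for example documentclass = get_document_class(form_xml).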