TypeError: argument of type 'PSLiteral' is not iterable

Time: 2015-07-22 14:16:01

Tags: python csv data-cleansing pdftotext

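I am trying to strip hidden Enter characters (line breaks) from form field values in my PDF form scraper script before writing them to a CSV file, but I keep getting the error mentioned in the title. The relevant code is: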

import glob
import os
import sys
import csv
from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdftypes import resolve1

path = 'C:\Users\Wonen\Downloads\Test'
for filename in glob.glob(os.path.join(path, '*.pdf')):
    fp = open(filename, 'rb')
    #read pdf's
    parser = PDFParser(fp)
    doc = PDFDocument(parser)
    #doc.initialize()    # <<if password is required
    fields = resolve1(doc.catalog['AcroForm'])['Fields']
    row = []
    for i in fields:
        field = resolve1(i)
        name, value = field.get('T'), field.get('V')
        #removing 'hidden enter'
        if value == None:
           print 'ok'
        elif value == NotImplementedError:
            print 'ok'
        elif '\n' in value:    
           value.replace('\n',' ')
        elif '\r' in value:    
           value.replace('\r',' ')
        row.append(value)
    writer.writerow(list(reversed(row)))

The full error (plus output) is:

ok
ok

Traceback (most recent call last):
  File "C:\Python27\Scripts\test3.py", line 37, in <module>
    elif '\n' in value:
TypeError: argument of type 'PSLiteral' is not iterable

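Does anyone know how to fix this?

1 Answer:

Answer 0: (score: 0)

It is hard to say without knowing the contents of the input file. I think the problem is that field.get('V') sometimes returns a value that is not a string (here a PSLiteral), and the `in` test only works on strings and other containers. To fix this, I suggest converting value to a string before testing it. Try something like this: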

if value is None:
    print 'ok'
elif value == NotImplementedError:
    print 'ok'
elif '\n' in str(value):
    # str.replace returns a new string, so the result must be assigned back
    value = str(value).replace('\n', ' ')
elif '\r' in str(value):
    value = str(value).replace('\r', ' ')
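
The same idea can be pulled into one helper that always returns single-line text, which may be easier to reason about than the elif chain. The following is only a minimal sketch, written for Python 3 and not tested against real PDF forms. `FakeLiteral` is a hypothetical stand-in for pdfminer's PSLiteral so the example runs without pdfminer installed, and `clean_value` is an illustrative name, not part of any library. It assumes the real object keeps its text in a `.name` attribute, as the stand-in does.

```python
class FakeLiteral(object):
    """Hypothetical stand-in for a PSLiteral-like object: has a name, is not a string."""

    def __init__(self, name):
        self.name = name


def clean_value(value):
    """Return a single-line text version of a form field value (illustrative helper)."""
    if value is None:
        return ''
    if isinstance(value, bytes):
        # Raw field strings may arrive as bytes; latin-1 can decode any byte.
        text = value.decode('latin-1')
    elif hasattr(value, 'name'):
        # Assumption: name-like objects carry their text in .name.
        name = value.name
        text = name.decode('latin-1') if isinstance(name, bytes) else str(name)
    else:
        text = str(value)
    # Replace '\r\n' first so a Windows line ending becomes one space, not two.
    return text.replace('\r\n', ' ').replace('\r', ' ').replace('\n', ' ')


# The original failure: `in` needs a string or container on the right-hand side,
# and this class defines no container or iteration protocol.
try:
    '\n' in FakeLiteral('Yes')
except TypeError as exc:
    print('TypeError:', exc)

# The helper accepts the same object without raising.
print(clean_value(FakeLiteral('Yes')))
print(clean_value('first line\r\nsecond line'))
```

In the loop from the question, this would replace the whole if/elif block with something like row.append(clean_value(value)). Unlike the elif chain, it handles values that contain both '\n' and '\r'.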