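Python: TypeError: expected str, bytes or os.PathLike object, not PdfFileReader

Time: 2018-05-29 18:13:30

Tags: python pdf

I have the following code. It is only a starting point. Later I want to replace the static "Hello world" text with items from a CSV file, which I will read and loop over. I want the watermark on every page.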

# importing the required modules
import PyPDF2
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

def add_watermark(wmFile, pageObj):
    # opening watermark pdf file
    wmFileObj = open(wmFile, 'rb')

    # creating pdf reader object of watermark pdf file
    pdfReader = PyPDF2.PdfFileReader(wmFileObj)

    # merging watermark pdf's first page with passed page object.
    pageObj.mergePage(pdfReader.getPage(0))

    # closing the watermark pdf file object
    wmFileObj.close()

    # returning watermarked page object
    return pageObj


def main():
    import PyPDF2
    import io
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter
    # watermark pdf file name
    packet = io.BytesIO()
    # Create a new PDF with Reportlab
    can = canvas.Canvas(packet, pagesize=letter)
    can.setFont('Helvetica-Bold',18)
    can.drawString(10, 100, "Hello world")
    can.showPage()
    can.save()

    # Move to the beginning of the StringIO buffer
    packet.seek(0)
    mywatermark = PyPDF2.PdfFileReader(packet)

    # original pdf file name
    origFileName = 'Module1.pdf'

    # new pdf file name
    newFileName = 'watermarked_example.pdf'

    # creating pdf File object of original pdf
    pdfFileObj = open(origFileName, 'rb')

    # creating a pdf Reader object
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

    # creating a pdf writer object for new pdf
    pdfWriter = PyPDF2.PdfFileWriter()

    # adding watermark to each page
    for page in range(pdfReader.numPages):
        # creating watermarked page object
        wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))

        # adding watermarked page object to pdf writer
        pdfWriter.addPage(wmpageObj)

    # new pdf file object
    newFile = open(newFileName, 'wb')

    # writing watermarked pages to new file
    pdfWriter.write(newFile)

    # closing the original pdf file object
    pdfFileObj.close()
    # closing the new pdf file object
    newFile.close()


if __name__ == "__main__":
    main()

The error I get is:

Traceback (most recent call last):
  File "watermark.py", line 101, in <module>
    main()
  File "watermark.py", line 83, in main
    wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))
  File "watermark.py", line 32, in add_watermark
    wmFileObj = open(wmFile, 'rb')
TypeError: expected str, bytes or os.PathLike object, not PdfFileReader

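I think I understand that it expects a string, bytes, or a file path, and that what I am passing is none of these; it is just an "object".

I have tried a few things, but whatever I try only makes things worse :-(

Can anyone help? I am pretty sure it is just a small thing, as I am good at overlooking the obvious.

Any help is appreciated.

Thanks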

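1 Answer:

Answer 0: (score: 1)

I will leave the general advice, and what is still incomplete, for the end. Here is how to fix this code:

1) Set the variable 'packet' to the filename of an existing PDF file in the directory where the script is located: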

packet = 'my_watermark.pdf'

2) Delete the lines that seek to the beginning of the 'StringIO' buffer and build the reader from it (we no longer need them):

packet.seek(0)     # delete this
mywatermark = PyPDF2.PdfFileReader(packet) #delete this too

3) In the for-loop block, pass 'packet' as the argument instead of 'mywatermark':

wmpageObj = add_watermark(packet, pdfReader.getPage(page))

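4) In the add_watermark function, delete the opening and closing of the file, leaving only the construction of the PdfFileReader instance, but with 'wmFile' as its argument: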

wmFileObj = open(wmFile, 'rb')                # delete this
pdfReader = PyPDF2.PdfFileReader(wmFile)      # let this be, but change wmFileObj to wmFile
pageObj.mergePage(pdfReader.getPage(0))       # let this be
wmFileObj.close()                             # delete this
return pageObj                                # let this be  
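
The steps above work because the built-in open() is no longer handed a PdfFileReader: open() only accepts a path (str, bytes, or os.PathLike), and anything else raises the TypeError shown in the traceback. The following standard-library sketch illustrates that distinction. The helper `open_binary` and the class `NotAPath` are hypothetical names invented for this illustration; they are not part of PyPDF2 or the original script.

```python
import io
import os


def open_binary(source):
    """Hypothetical helper: return (stream, opened_here) for a path or a stream."""
    if isinstance(source, (str, bytes, os.PathLike)):
        # A real path: open() is the right call, and the caller must close it later.
        return open(source, "rb"), True
    if hasattr(source, "read"):
        # Already a file-like object (for example io.BytesIO): use it as-is.
        return source, False
    raise TypeError(
        f"expected a path or a binary stream, not {type(source).__name__}"
    )


class NotAPath:
    """Stand-in for any object that is neither a path nor a stream."""


# open() rejects such an object with a TypeError, as in the question's traceback.
try:
    open(NotAPath(), "rb")
except TypeError as exc:
    message = str(exc)

# An in-memory buffer is passed through untouched instead of being re-opened.
buffer = io.BytesIO(b"%PDF-1.4 placeholder bytes")
stream, opened_here = open_binary(buffer)
```

The design point is that whoever creates a stream should be the one to close it, which is why the sketch reports whether it opened the file itself.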

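Also, your code has imports inside the main function; move them to the top of the file, and read some documentation. PyPDF2's documentation shows how to merge pages (which is what the module specializes in), although it is a bit terse; Reportlab's user guide, on the other hand, is very thorough yet easy to follow. Always try to see the meaning behind the code.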
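
If the goal is still to generate the watermark in memory with Reportlab, as in the question, an alternative is to keep `mywatermark = PyPDF2.PdfFileReader(packet)` and stop re-opening it inside add_watermark. The sketch below is a minimal illustration under that assumption, not the answer author's method. It relies only on the method names that appear in the question (getPage, numPages, mergePage, addPage); `watermark_all_pages` is a hypothetical helper name, and newer PyPDF2/pypdf releases use different names, so check the installed version.

```python
def add_watermark(wm_reader, page):
    """Merge the first page of an already-constructed reader onto `page`.

    There is no open()/close() here: the caller owns the underlying stream.
    """
    page.mergePage(wm_reader.getPage(0))
    return page


def watermark_all_pages(src_reader, wm_reader, writer):
    """Hypothetical helper: stamp every page of `src_reader` and add it to `writer`."""
    for index in range(src_reader.numPages):
        # Same loop as in the question, but the reader object is passed straight through.
        writer.addPage(add_watermark(wm_reader, src_reader.getPage(index)))
    return writer
```

In the question's main(), this would be called as `watermark_all_pages(pdfReader, mywatermark, pdfWriter)`, keeping both `packet` and `pdfFileObj` open until `pdfWriter.write(newFile)` has finished.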