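Why does a PDF stream object terminate prematurely when I read a PDF file into Python? I can open the PDF in Notepad and the whole file is displayed, so why not in Python? I know about pypdf, pdfminer, etc. This is purely to help me understand what is happening behind the scenes.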
fp = open(r'E:Books\glu.pdf')
foo = fp.read()
print(foo)
%PDF-1.2
%âãÏÓ
%header, xref, etc omitted for clarity.
34 0 obj
<<
/Type /Catalog
/Pages 31 0 R
>>
endobj
69 0 obj
<< /S 274 /Filter /FlateDecode /Length 70 0 R >>
stream
Hノb```f``éb`c`àd`àc@
%Stream is much longer, but always terminates prematurely.
Answer 0 (score: 0)
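Open the PDF file in binary mode. A PDF mixes plain-text markup with compressed binary streams, and text mode on Windows can alter or cut off that binary data (in Python 2, for example, a 0x1A byte is treated as end of file), which is the likely reason the stream appears to end early: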
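# 'rb' returns the raw bytes, with no newline translation or decoding
with open(r'E:Books\glu.pdf', 'rb') as fp:
    foo = fp.read()
print(foo)
# the with block closes the file automatically, so no fp.close() is needed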
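
Once the bytes are read intact, the stream body can be inspected too. The object in the question declares /Filter /FlateDecode, which is zlib/deflate compression, so Python's zlib module should be able to inflate it. The following is only a minimal sketch, not a real PDF parser: inflate_first_stream is a hypothetical helper name, and it assumes the first stream in the data uses a single FlateDecode filter with no predictor or encryption.

```python
import re
import zlib


def inflate_first_stream(pdf_bytes):
    """Return the decompressed body of the first stream found in pdf_bytes.

    Hypothetical helper for illustration. Assumes a single /FlateDecode
    filter and that the 'stream' keyword is followed by CRLF or LF.
    """
    # Find the 'stream' keyword, skipping the tail of 'endstream'.
    match = re.search(rb"(?<!end)stream\r?\n", pdf_bytes)
    if match is None:
        raise ValueError("no stream keyword found")
    # A decompress object stops at the end of the zlib data; whatever
    # follows (endstream, endobj, ...) is left in inflater.unused_data.
    inflater = zlib.decompressobj()
    return inflater.decompress(pdf_bytes[match.end():])


if __name__ == "__main__":
    # Build a tiny stand-in for a PDF stream object instead of reading a file.
    payload = b"BT /F1 12 Tf (Hello) Tj ET"
    compressed = zlib.compress(payload)
    sample = (
        b"69 0 obj\n<< /Filter /FlateDecode /Length "
        + str(len(compressed)).encode("ascii")
        + b" >>\nstream\n"
        + compressed
        + b"\nendstream\nendobj\n"
    )
    print(inflate_first_stream(sample))
```

With a real file, pass in the bytes returned by fp.read() in 'rb' mode. The sketch uses a decompress object rather than searching for endstream because the compressed bytes themselves may end in a newline byte, which makes the boundary ambiguous without following the /Length entry.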