Here is the code:
#!/usr/bin/python
import codecs
import urllib.request
resp = urllib.request.urlretrieve('http://normanpd.normanok.gov/filebrowser_download/657/2017-02-16%20Daily%20Incident%20Summary.pdf', 'test.pdf')
with codecs.open("test.pdf") as f:
    for line in f:
        line.decode('utf-8')
        print(line)
When I run the code above, I get the following error:
Traceback (most recent call last):
  File "normanpd.py", line 6, in <module>
    for line in f:
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 11: invalid start byte
Please help me resolve this issue.
Answer 0 (score: 0)
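What makes you think that file is an encoded string? It isn't text at all: a PDF is a binary format, not a human-readable one. You can't just iterate over it line by line and print it, because decoding its bytes as UTF-8 fails as soon as a byte such as 0xb5 turns up where no valid UTF-8 sequence can start.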
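To illustrate the point, here is a minimal sketch using only the standard library. It opens a file in binary mode ("rb"), checks for the `%PDF-` signature, and shows that decoding the raw bytes as UTF-8 fails. The helper names `looks_like_pdf` and `try_decode_utf8` and the `sample` bytes are made up for this example; `sample` is a small stand-in written to a temporary file, not the real downloaded test.pdf.

```python
import os
import tempfile

PDF_MAGIC = b"%PDF-"  # signature bytes found at the start of a PDF file


def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    # "rb" reads raw bytes, so Python never tries to decode them as text.
    with open(path, "rb") as f:
        return f.read(len(PDF_MAGIC)) == PDF_MAGIC


def try_decode_utf8(path):
    """Return (True, text) if the whole file is valid UTF-8, else (False, error)."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        return True, data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return False, exc


# Hypothetical stand-in for test.pdf (an assumption, not the real file):
# a PDF-style header followed by a comment line of non-UTF-8 bytes.
sample = b"%PDF-1.5\r\n%\xb5\xb5\xb5\xb5\r\n"
fd, sample_path = tempfile.mkstemp(suffix=".pdf")
with os.fdopen(fd, "wb") as f:
    f.write(sample)

is_pdf = looks_like_pdf(sample_path)
ok, result = try_decode_utf8(sample_path)
print("looks like a PDF:", is_pdf)
print("decodes as UTF-8:", ok)
print("detail:", result)

os.remove(sample_path)  # clean up the temporary file
```

Reading in binary mode only gets rid of the exception; it does not give you the text of the report. To extract the actual text you would need a library that parses the PDF format (for example pypdf), which is not shown here.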