Question

Code:

from urllib import request
response = request.urlopen('http://www.amazon.com/')
body = response.read()
with open('test.html', 'wb') as f:
    f.write(body)
with open('test2.html', 'w') as f:
    f.write(body.decode('utf-8'))

Is there any difference between the two, or anything I should watch out for?

Answer 1

The first way

with open('test.html', 'wb') as f:
    f.write(body)

simply saves the binary data you downloaded, unchanged.

The second way

with open('test2.html', 'w') as f:
    f.write(body.decode('utf-8'))

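assumes the data is UTF-8: it decodes those bytes to Unicode text, and the text-mode file then re-encodes that text in the default file encoding, as given by locale.getpreferredencoding(False). So if the data is already UTF-8 and the default encoding is also UTF-8, the data is needlessly decoded and re-encoded. If the data is not UTF-8, the wrong encoding is being used to decode it: that still works when the data contains only plain 7-bit ASCII, but otherwise it either produces wrong results or raises UnicodeDecodeError.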
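
To make the difference concrete, here is a small offline sketch. The sample bodies, the helper names save_raw and save_decoded, and the file name page.html are made up for illustration; they stand in for response.read() and are not part of the question's code. The file encoding is passed explicitly, rather than left to the locale default, so the outcome does not depend on the machine it runs on.

```python
import os
import tempfile

# Hypothetical sample bodies standing in for response.read(); no network needed.
utf8_body = "<p>café</p>\n".encode("utf-8")
latin1_body = "<p>café</p>\n".encode("latin-1")


def save_raw(body: bytes, path: str) -> bytes:
    """Write the bytes unchanged and return what ended up on disk."""
    with open(path, "wb") as f:
        f.write(body)
    with open(path, "rb") as f:
        return f.read()


def save_decoded(body: bytes, path: str, file_encoding: str) -> bytes:
    """Decode as UTF-8, then let the text-mode file re-encode the str."""
    # newline="" disables newline translation so only the encoding matters here.
    with open(path, "w", encoding=file_encoding, newline="") as f:
        f.write(body.decode("utf-8"))
    with open(path, "rb") as f:
        return f.read()


def demo() -> dict:
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "page.html")
        # Binary mode: the file always matches the downloaded bytes.
        results["raw_utf8"] = save_raw(utf8_body, path) == utf8_body
        results["raw_latin1"] = save_raw(latin1_body, path) == latin1_body
        # Text mode: the bytes survive only if the file encoding is also UTF-8.
        results["text_utf8"] = save_decoded(utf8_body, path, "utf-8") == utf8_body
        results["text_utf16"] = save_decoded(utf8_body, path, "utf-16") == utf8_body
        # Non-UTF-8, non-ASCII input cannot be decoded as UTF-8 at all.
        try:
            save_decoded(latin1_body, path, "utf-8")
            results["latin1_decode_error"] = False
        except UnicodeDecodeError:
            results["latin1_decode_error"] = True
    return results


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

In this sketch only the binary-mode path is guaranteed to reproduce the input. The text-mode path matches only when the input really is UTF-8 and the file is also written as UTF-8.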

What is the difference between decoding HTML to a str before writing it to a file and writing the HTML binary data directly to a file?
