Question

I am using Python to download an HTML file and store it in a file. Here is the code:

url = "http://www.nytimes.com/roomfordebate/2014/09/24/protecting-student-privacy-in-online-learning"
page = requests.get(url) 
# save html content
file_name = url.split('/')[-1]
text_file = open(file_name, 'w+')
text_file.write(page.text())
text_file.close()

I get the following error:

  File "scraper.py", line 15, in scrape_Page
    text_file.write(page.text())
TypeError: 'unicode' object is not callable

Can anyone tell me how to store the text successfully, or why this error occurs? Thanks.

Answer 1

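The response's .text is an attribute, not a method, so you should not call it. You also should not use it to download a file; use .content instead, because you want the undecoded bytes rather than the decoded Unicode value: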

text_file.write(page.content)
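
As a minimal sketch of the fix above, assuming Python 3: because .content is bytes, the file is opened in binary mode ('wb'). The helper names file_name_from_url, save_response and download are made up for illustration and are not part of requests; the timeout value is also an arbitrary assumption.

```python
def file_name_from_url(url):
    """Use the last path segment of the URL as the file name (as in the question)."""
    return url.rpartition("/")[-1]


def save_response(response, file_name):
    """Write the undecoded body (response.content, a bytes object) to file_name."""
    # 'wb' because .content is bytes; .text would be a decoded str instead.
    with open(file_name, "wb") as f:
        f.write(response.content)
    return file_name


def download(url):
    """Hypothetical convenience wrapper: fetch url and save it next to the script."""
    # Imported here so the helpers above can be used without requests installed.
    import requests

    response = requests.get(url, timeout=10)  # timeout is an assumed value
    response.raise_for_status()  # fail early on 4xx/5xx instead of saving an error page
    return save_response(response, file_name_from_url(url))
```

Calling download(url) with the URL from the question would write the page to a file named protecting-student-privacy-in-online-learning in the current directory.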

To download content, you may want to stream it to a file:

import requests
import shutil

r = requests.get(url, stream=True)
file_name = url.rpartition('/')[-1]
with open(file_name, 'wb') as f:
    r.raw.decode_content = True
    shutil.copyfileobj(r.raw, f)
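
An alternative sketch of the same streaming idea, assuming Python 3 and a reasonably recent requests: instead of copying from r.raw, it loops over Response.iter_content, which also decodes gzip and deflate for you. The names stream_to_file and download_streamed, the chunk size and the timeout are illustrative assumptions, not part of the original answer.

```python
def stream_to_file(response, file_name, chunk_size=8192):
    """Write a streamed response body to file_name chunk by chunk.

    Returns the number of bytes written.
    """
    written = 0
    with open(file_name, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                written += f.write(chunk)
    return written


def download_streamed(url):
    """Hypothetical wrapper: stream url into a file named after its last path segment."""
    # Imported here so stream_to_file can be used without requests installed.
    import requests

    # stream=True defers downloading the body until it is iterated.
    with requests.get(url, stream=True, timeout=10) as r:  # timeout is an assumed value
        r.raise_for_status()
        return stream_to_file(r, url.rpartition("/")[-1])
```

Either way, the whole body never has to sit in memory at once, which matters more for large downloads than for a single HTML page.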
