I ran into a problem when running this Python code:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
#url1='https://www.nytimes.com/store/west-side-highway-and-piers-manhattan-1937-nypl482645-nypl482645p.html'
url2='https://www.nytimes.com/1978/06/21/archives/jordan-wary-of-interim-role-in-west-bank-and-gaza-jordan-accepted.html'
response = requests.get(url2, headers=headers)
fileout="outputTest.html"
obj=open(fileout,"w")
obj.write(response.text)
obj.close()
The code downloads the HTML from a URL and saves it to a file. It works with url1, but with url2 it fails with this error:
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 34060: character maps to <undefined>
How can I fix the error for url2?
Answer 0 (score: 0)
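Encode the text as UTF-8 yourself and write the resulting bytes to a file opened in binary mode. Use
obj=open(fileout,"wb")
obj.write(response.text.encode('utf-8'))
instead of
obj=open(fileout,"w")
obj.write(response.text)
The error occurs because open() in text mode uses the platform's default encoding, which here cannot represent '\u2010' (a hyphen character). Do not wrap the encoded bytes in str(): on Python 3 that writes the bytes literal (b'...') with escape sequences rather than the HTML itself.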
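Another option, sketched here under some assumptions, is to keep the file in text mode and pass encoding="utf-8" to open() so that Python does the encoding. The helper name save_html and the sample string are hypothetical stand-ins for the real code and for response.text, so the example runs without a network request. It also assumes the 'charmap' codec in the traceback is a Windows code page such as cp1252, which the question does not state.

```python
import os
import tempfile


def save_html(text, path):
    """Hypothetical helper: write text as UTF-8, ignoring the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Stand-in for response.text; it contains U+2010 (HYPHEN), the character
# named in the traceback.
sample = "<p>West Bank\u2010Gaza</p>"

# Assumption: the failing default encoding was a Windows code page like cp1252.
# Encoding the sample with it reproduces the same kind of failure.
try:
    sample.encode("cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

# Writing with an explicit UTF-8 encoding succeeds on any platform.
out_path = os.path.join(tempfile.mkdtemp(), "outputTest.html")
save_html(sample, out_path)

# Read the file back with the same encoding to confirm nothing was lost.
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
```

In the original script this would amount to changing only the open() call to open(fileout, "w", encoding="utf-8") and leaving obj.write(response.text) as it is.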