Python3 UnicodeEncodeError

Time: 2018-05-14 13:16:38

Tags: python python-3.x character-encoding

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome(executable_path = 
r'C:\chromedriver_win32\chromedriver.exe')

driver.get('https://www.imdb.com/')

html_doc = driver.page_source

soup = BeautifulSoup(html_doc, 'lxml')
print(soup.prettify())

driver.quit()

I tried this code and got the following error.

Traceback (most recent call last):
  File "E:\Practice\WebScrapping\webscrap.py", line 11, in <module>
    print(soup.prettify())
  File "C:\Users\vmbck\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u25ec' in position 241524: character maps to <undefined>

Then I tried using encode("utf-8"):

html_doc = driver.page_source.encode("utf-8")

The same error occurred again.

How can I get page_source without getting a UnicodeEncodeError?
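
For context, the failure can be reproduced without Selenium. The sketch below assumes, as the traceback suggests, that the output stream uses cp1252; the sample string and the helper name can_encode are made up for illustration. It also hints at why encoding page_source first does not help: BeautifulSoup decodes the bytes back to str, and print() still has to encode that text with the console's codec.

```python
# Minimal reproduction: cp1252 has no mapping for U+25EC, UTF-8 does.
# `text` and `can_encode` are illustrative names, not part of the question.
text = "triangle: \u25ec"


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


cp1252_ok = can_encode(text, "cp1252")
utf8_ok = can_encode(text, "utf-8")
print(cp1252_ok, utf8_ok)
```

So fetching the page is not the problem; the exception is raised only when the text is printed.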

2 Answers:

Answer 0: (score: 0)

import requests
from bs4 import BeautifulSoup
a = requests.get('https://www.imdb.com/')
soup = BeautifulSoup(a.content, 'lxml')
print(soup.prettify())

The code above is similar to what you wrote. However, to resolve the Unicode error, you can try what is suggested in the following post: Python Unicode Encode Error
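
One common workaround, sketched here under the assumption that the goal is only to get the text out without an exception, is to encode it as UTF-8 yourself and write the bytes to the binary layer of stdout. The helper name write_utf8 and the sample string are hypothetical; in the question's script the argument would be soup.prettify().

```python
import io
import sys


def write_utf8(text, stream=None):
    """Write text as UTF-8 bytes to a binary stream, bypassing the console codec."""
    if stream is None:
        # sys.stdout.buffer is the underlying binary stream of stdout.
        stream = sys.stdout.buffer
    stream.write(text.encode("utf-8"))
    stream.write(b"\n")
    stream.flush()


# Demonstration with an in-memory stream instead of the real console.
sample = "triangle: \u25ec"
buf = io.BytesIO()
write_utf8(sample, buf)
captured = buf.getvalue()
```

On a cp1252 console the non-ASCII characters may still be displayed incorrectly, but no exception is raised. Setting the environment variable PYTHONIOENCODING=utf-8 before running the script, or writing the output to a file opened with encoding='utf-8', are alternatives that may suit better.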

Answer 1: (score: -1)

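If encoding as utf-8 fails, try encoding as ascii.

Try both approaches. Call encode() on the prettified string, because encode() returns bytes, which have no prettify() method:

print(soup.prettify().encode('utf-8'))

print(soup.prettify().encode('ascii', errors='replace'))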
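
Note that a plain encode('ascii') raises UnicodeEncodeError on any non-ASCII character unless an error handler is passed. A minimal sketch of the standard handlers, using a made-up sample string in place of the page text:

```python
# `sample` stands in for soup.prettify(); it is illustrative only.
sample = "triangle: \u25ec"

replaced = sample.encode("ascii", errors="replace")           # unencodable chars become '?'
escaped = sample.encode("ascii", errors="backslashreplace")   # kept as \uXXXX escapes
ignored = sample.encode("ascii", errors="ignore")             # silently dropped

print(replaced)
print(escaped)
print(ignored)
```

Each call returns bytes, so print() shows the b'...' representation rather than formatted HTML. All three handlers other than backslashreplace lose information, which is probably not what you want when saving a page.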