from urllib.request import urlopen
from bs4 import BeautifulSoup
content = urlopen("http://en.wikipedia.org/wiki/List_of_human_stampedes")
soup = BeautifulSoup(content)
print(soup.get_text())
print(soup.prettify())
Error:
Traceback (most recent call last):
  File "C:\Users\sony\Desktop\Trash\Crawler Try\try3.py", line 5, in <module>
    print(soup.get_text())
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u014d' in position 10487: character maps to <undefined>
[Finished in 2.1s with exit code 1]
This seems to be page-specific. For example, I get this when I replace the URL with http://www.quora.com.
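
For reference, here is a minimal sketch of one possible workaround. It assumes the error comes from the console encoding (cp1252, going by the traceback) having no mapping for '\u014d', rather than from BeautifulSoup itself. The helper name `safe_for_console` is hypothetical and not part of any library.

```python
import sys


def safe_for_console(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Fall back to the console's reported encoding, or UTF-8 if none is reported.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    # Round-trip through the target encoding; unencodable characters become '?'.
    return text.encode(encoding, errors="replace").decode(encoding)


# '\u014d' is the character named in the traceback; cp1252 has no mapping for it.
print(safe_for_console("T\u014dky\u014d", "cp1252"))  # prints T?ky?
```

In the script above this would be used as `print(safe_for_console(soup.get_text()))`. It loses the unencodable characters, so if the full text matters, setting the `PYTHONIOENCODING` environment variable to `utf-8` before running the script may be a better fit, provided the console can display UTF-8.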