Python web scraping (requests, BeautifulSoup)

Date: 2015-11-19 16:44:11

Tags: python beautifulsoup python-requests

I am trying to write a simple web scraping script, so I wrote this code and got an error.

import requests
from bs4 import BeautifulSoup

r = requests.get('http://the website that I need.com')

soup = BeautifulSoup(r.content)

print(soup.prettify())

I get an error saying:

Traceback (most recent call last):
  File "course.py", line 18, in <module>
    print(soup.prettify())
  File "C:\Python34\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u203a' in position 32558: character maps to <undefined>

I am using Python 3.4.0.

Can anyone tell me what is going on?

1 Answer:

Answer 0 (score: -1)

I believe this is an encoding problem. Try specifying an encoding on the returned string:

Example of encoding as UTF-8:

soup = BeautifulSoup(r.content.encode('utf-8'))
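
For what it's worth, the traceback points at `cp437.py`, which suggests the failure happens when `print` encodes the text for the Windows console rather than inside BeautifulSoup. Also, in Python 3 `r.content` is already `bytes`, and `bytes` objects have no `.encode()` method. Below is a minimal sketch of one possible workaround, assuming the goal is only to print the page without crashing: characters the console codec cannot represent are replaced with `?`. The helper name `to_console_safe` is hypothetical, and the literal string stands in for `soup.prettify()`.

```python
import sys


def to_console_safe(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by '?'."""
    # Hypothetical helper. Falls back to UTF-8 when stdout reports no encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    # Encode with errors="replace", then decode back to str for print().
    return text.encode(encoding, errors="replace").decode(encoding)


# Stand-in for soup.prettify(); U+203A is the character named in the traceback.
page_text = "next \u203a"

# Encoding for cp437 without an error handler fails the same way print() did.
try:
    page_text.encode("cp437")
except UnicodeEncodeError as exc:
    print("cp437 cannot encode:", ascii(exc.object[exc.start:exc.end]))

# With replacement, the text can be encoded for a cp437 console.
print(to_console_safe(page_text, "cp437"))
```

In the script from the question this would look like `print(to_console_safe(soup.prettify()))`. Setting the `PYTHONIOENCODING` environment variable (for example to `utf-8`) before running the script is another commonly suggested option, though how the console then displays those characters depends on its code page.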