Question

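I am trying to scrape a website into a string, but when I call encode("utf-8") on my bytes object it does not return a string; instead I get a UnicodeEncodeError. The site I am trying to scrape is https://www.futbin.com/20/player/24248/leon-goretzka, and I know it uses charset="utf-8".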


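import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.futbin.com/20/player/24248/leon-goretzka")

# r.text is already a str, so encode() turns it into bytes
text = r.text.encode("utf-8")
# decode() turns those bytes back into a str
html = text.decode("utf-8")

print(html)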

Answer 1

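The get function of requests needs an actual link. In your example, you are passing it the string "link".

r = requests.get("https://www.futbin.com/20/player/24248/leon-goretzka")
data = r.text
print(data)

This gives r a Response object. Using r.text gives you a string; r.content gives you bytes (which need to be decoded).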
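To make the str/bytes distinction concrete, here is a minimal offline sketch. It assumes nothing about the actual page: sample_text is a made-up stand-in for r.text, and to_text is a hypothetical helper, not part of requests.

```python
# Offline sketch: sample_text is a made-up stand-in for r.text (a str).
sample_text = "Leon Goretzka, Bayern München"

# str -> bytes: encode() always returns bytes, never a str.
raw = sample_text.encode("utf-8")
assert isinstance(raw, bytes)

# bytes -> str: decode() is the inverse step, comparable to
# turning r.content (bytes) into text by hand.
decoded = raw.decode("utf-8")
assert isinstance(decoded, str)
assert decoded == sample_text


def to_text(body, encoding="utf-8"):
    """Hypothetical helper: return a str whether given bytes or str."""
    if isinstance(body, bytes):
        return body.decode(encoding)
    return body


print(to_text(raw) == to_text(sample_text))
```

In other words, if the goal is a string, r.text can be used directly; calling encode() on it moves away from a string rather than toward one.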

Here is a link for reference: Response example

Encoding/decoding while web scraping
