Question

I'm trying to scrape a website with code like this:

import requests
requests.get("https://myurl.com").content

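However, some important elements are missing from the result. How can I get the full page content with Python 3, the way the inspector in Firefox or another browser shows it?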

Answer 1

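Why not try Scrapy, Selenium, or even Splash? They are powerful scraping tools, and Selenium and Splash can run a page's JavaScript before you read its content.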
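As a rough sketch of the Selenium route: the snippet below opens the page in a headless Firefox and returns the HTML after the browser has loaded it. It assumes Selenium 4 is installed (`pip install selenium`) and that Firefox is available locally. `fetch_rendered_html` is a made-up helper name for illustration, not part of Selenium.

```python
def fetch_rendered_html(url: str) -> str:
    """Load `url` in headless Firefox and return the page source after loading."""
    # Imported inside the function so this file still loads on machines
    # where Selenium is not installed (assumption: Selenium 4).
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without opening a window

    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)            # the browser loads the page and runs its scripts
        return driver.page_source  # HTML of the page as the browser currently sees it
    finally:
        driver.quit()              # always close the browser, even on errors
```

You would call it as `html = fetch_rendered_html("https://myurl.com")`. If the missing elements are loaded some time after the page opens, you may also need one of Selenium's explicit waits before reading `page_source`.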

Answer 2

You can use Beautiful Soup, a Python library for scraping, for this. Just import it at the top:

from bs4 import BeautifulSoup

Then add these lines to your code:

data = requests.get("https://myurl.com").text
soup = BeautifulSoup(data, 'html.parser')
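To see what the resulting `soup` object gives you, here is a small self-contained sketch. The HTML string is a made-up stand-in for the text that `requests` would return, and the element ids `static` and `dynamic` are hypothetical. Note that Beautiful Soup parses the HTML it is given and does not execute scripts, so an element that JavaScript fills in later stays empty here.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for requests.get(...).text
html = """
<html>
  <head><title>Example page</title></head>
  <body>
    <div id="static">Rendered by the server</div>
    <div id="dynamic"></div>
    <script>
      document.getElementById("dynamic").textContent = "Added by JavaScript";
    </script>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                         # text of the <title> tag
static_text = soup.find(id="static").get_text()   # content present in the raw HTML
dynamic_text = soup.find(id="dynamic").get_text() # empty: the script was never run

print(title)
print(repr(static_text))
print(repr(dynamic_text))
```

If the elements you are after turn up empty or missing like `dynamic` does here, the page is most likely building them with JavaScript, and parsing alone will not recover them.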