Question

I have a website that I scrape regularly. I get all the content I want, but soup.find_all returns too many items (even after narrowing the search with 'span' and class_=).

a = soup.find_all('span', class_=re.compile("headline"))

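Here len(a) = 500. How can I write the logic so that I grab only the first 10 headlines instead of all 500? Grabbing all 500 seems to make my program lag, which is not ideal.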

Answer 1

According to the Beautiful Soup docs:

Try using the limit argument.

soup.find_all('title', limit=1)
# [<title>The Dormouse's story</title>]
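
Applied to the question's case, this might look like the sketch below. The HTML is made up for illustration (500 generated spans standing in for the real page), and the names html, first_ten and titles are hypothetical; only the find_all call with 'span', class_=re.compile("headline") and limit comes from the thread.

```python
import re
from bs4 import BeautifulSoup

# Made-up page: 500 headline spans standing in for the real site (assumption).
html = "".join(
    f'<span class="headline-{i % 3}">Headline {i}</span>' for i in range(500)
)
soup = BeautifulSoup(html, "html.parser")

# limit=10 makes find_all stop searching once it has found 10 matches.
first_ten = soup.find_all("span", class_=re.compile("headline"), limit=10)
titles = [tag.get_text() for tag in first_ten]
print(len(titles))
```

Note that limit only shortens the search. The whole document is still downloaded and parsed when the soup object is built, so if the lag comes from that step, limit alone may not remove it.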