Question

I'm working on a CA and I have to use Beautiful Soup to parse pages. I used this code:

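from urllib.request import urlopen
from bs4 import BeautifulSoup

r = urlopen(url)    # download the page (url is defined elsewhere)
res1 = r.read()     # raw bytes; str() here would give a "b'...'" repr, not the HTML
soup = BeautifulSoup(res1, 'html.parser')
for link in soup.find_all('a'):
    print(link.get('href'))    # href of each <a> tag, or None if it has none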

But I also have to print how many different pages have been crawled.

Does anyone have a hint for me?

Thanks a lot

Answer 1

As @cricket_007 mentioned in the comments, your current code "crawls" (i.e. retrieves) only one page.

If what you need is to print the number of links found in the document, you can do

len(soup.find_all('a'))

Note that soup.find_all('a') is a list of the matching tags, so len(soup.find_all('a')) gives you the number of links.
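If "different" matters, the same URL can appear in several tags, so counting distinct links is a slightly different question from counting tags. Here is a minimal sketch of both counts, assuming a small made-up HTML snippet (the `html` string and its URLs are hypothetical, only there so the example runs without a network connection):

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for illustration.
html = """
<a href="/a">A</a>
<a href="/b">B</a>
<a href="/a">A again</a>
<a name="no-href">anchor without href</a>
"""

soup = BeautifulSoup(html, "html.parser")

links = soup.find_all("a")  # every <a> tag, duplicates included
# Keep only the tags that actually carry an href attribute.
hrefs = [a.get("href") for a in links if a.get("href")]
distinct = set(hrefs)       # a set keeps each URL only once

print(len(links))     # number of <a> tags
print(len(hrefs))     # number of tags with an href
print(len(distinct))  # number of different URLs
```

Note that this still only counts links on a single page; none of the linked pages has been downloaded.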

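If you really need to crawl a site (i.e. retrieve a page, get all the links from that page, follow each link, retrieve the page it refers to, and so on), I suggest using {{3 rather than "pure" BeautifulSoup.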
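If you are required to stick with BeautifulSoup anyway, the loop described above (retrieve a page, collect its links, follow each one) could be sketched roughly like this. It is only an illustration under some assumptions of mine: the names `fetch_page` and `crawl`, the `max_pages` limit, and the choice to stay on the starting host are all hypothetical, not something from your assignment or from any library.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_page(url):
    """Download a page and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def crawl(start_url, fetch=fetch_page, max_pages=50):
    """Breadth-first crawl limited to start_url's host.

    Returns the set of URLs that were actually downloaded.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}        # URLs already queued, so nothing is queued twice
    visited = set()           # URLs successfully downloaded
    queue = deque([start_url])

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue          # skip pages that fail to download
        visited.add(url)

        soup = BeautifulSoup(html, "html.parser")
        for a in soup.find_all("a", href=True):
            # Resolve relative links and drop any "#fragment" part.
            target, _fragment = urldefrag(urljoin(url, a["href"]))
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)

    return visited
```

With this, `len(crawl(url))` would be the number of different pages crawled. Passing `fetch` as a parameter is just a convenience so the function can be tried against canned pages instead of a live site.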

Using Python and Beautiful Soup
