Question

I am a beginner at web scraping and I am using BeautifulSoup. I am trying to scrape the page "https://www.codechef.com/rankings/ACMAMR15?filterBy=&order=asc&page=2&sortBy=rank"

(Visit the link to understand the question.)

I want all the links like "acm15am1235". I tried soup.findAll('a'), but I could not work out which tag and attributes to pass to soup.findAll to get only the links I need.

Please point out a way to capture the required links.

Answer 1

First, get all the div elements that contain the links:

usernames = soup.find_all("div", class_="user-name")

Then loop over them and print the href of the link inside each div:

for username in usernames:
    # username.a is the first <a> tag inside the div
    print("link is: " + str(username.a['href']))
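To try the two steps above without hitting the network, here is a minimal sketch that runs them against an inline HTML fragment. The fragment is an assumption made for illustration: it only mirrors the div class "user-name" used in this answer and is not copied from the real page, and the /users/... paths, the second username, and the helper name extract_user_links are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment (an assumption, not the real CodeChef markup).
# It only reproduces the <div class="user-name"><a href=...> shape that the
# answer relies on; the hrefs and the second username are made up.
SAMPLE_HTML = """
<table>
  <tr><td><div class="user-name"><a href="/users/acm15am1235">acm15am1235</a></div></td></tr>
  <tr><td><div class="user-name"><a href="/users/acm15am0001">acm15am0001</a></div></td></tr>
  <tr><td><div class="other"><a href="/help">Help</a></div></td></tr>
</table>
"""


def extract_user_links(html):
    """Return the href of the first <a> inside each div with class 'user-name'."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for div in soup.find_all("div", class_="user-name"):
        # href=True skips anchors without an href; find() returns None
        # when the div contains no matching <a> at all.
        anchor = div.find("a", href=True)
        if anchor is not None:
            links.append(anchor["href"])
    return links


if __name__ == "__main__":
    for link in extract_user_links(SAMPLE_HTML):
        print("link is: " + link)
```

Filtering on the enclosing div rather than calling find_all('a') directly is what keeps unrelated links, such as the "Help" anchor in the sample, out of the result. If find_all returns an empty list on the real page, the downloaded HTML probably does not contain those divs, so it is worth printing the response text to check.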