Extracting href with BeautifulSoup attrs

Asked: 2018-08-24 16:32:53

Tags: python beautifulsoup

I'm trying a new approach to extract the href from every <a> tag on a page. It isn't pulling out the hrefs, and I can't figure out why.

import requests
from bs4 import BeautifulSoup

url = "https://www.brightscope.com/ratings/"
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')

for href in soup.findAll('a'):
    h = href.attrs['href']
    print(h)

1 answer:

Answer 0 (score: 2)

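You should check whether the key exists, because some <a> tags may have no href attribute. For those tags, attrs['href'] raises a KeyError and the loop stops.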

import requests
from bs4 import BeautifulSoup

url = "https://www.brightscope.com/ratings/"
page = requests.get(url)
print(page.text)
soup = BeautifulSoup(page.text, 'html.parser')

# Skip <a> tags that have no href attribute to avoid a KeyError.
for a in soup.find_all('a'):
    if 'href' in a.attrs:
        print(a.attrs['href'])
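
As a possible alternative to the explicit key check, Beautiful Soup can filter on the attribute directly, or return None for a missing attribute. The sketch below runs against a small inline HTML string, which is a made-up stand-in for the downloaded page rather than the real content of the URL above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
html = """
<a href="/ratings/">Ratings</a>
<a name="top">Anchor without href</a>
<a href="https://example.com/">Example</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True matches only <a> tags that actually carry an href attribute.
links = [a["href"] for a in soup.find_all("a", href=True)]

# Tag.get returns None instead of raising KeyError when the attribute is missing.
maybe_links = [a.get("href") for a in soup.find_all("a")]

print(links)
print(maybe_links)
```

The href=True filter is probably the shortest option when only real links matter. a.get('href') is useful when the tags without an href still need to be seen.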