Question

I am trying to extract all the URLs on a webpage and put them into a list. However, when I run my code, I get an error message: "tag[key] returns the value of the 'key' attribute for the tag, and throws an exception if it's not there." How can I fix this? My code is below:

import urllib.request
from bs4 import BeautifulSoup

r = 'https://stackoverflow.com/'
openedUrl = urllib.request.urlopen(r)

soup = BeautifulSoup(openedUrl, 'lxml')

aa = soup.find_all('a')
href = []
for a in aa:
    href.append(a['href'])

print(href)

Answer 1

The problem is that some 'a' tags have no 'href' attribute, so Python raises a KeyError when you try to access a['href'].

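You can avoid this by passing the keyword argument href=True to find_all, which returns only the 'a' tags that actually have an 'href' attribute: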

aa = soup.find_all('a', href=True)

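When reading a value from a tag's attributes, it is better to use the get method: it returns None if the attribute does not exist, so no exception is raised.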
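As a minimal sketch of the get approach, the snippet below parses a small inline HTML string instead of fetching a live page. The sample markup and the variable names are made up for illustration, and it uses the built-in 'html.parser' so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the second <a> tag has no href attribute.
html = """
<a href="https://stackoverflow.com/questions">Questions</a>
<a name="top">Anchor without href</a>
<a href="/tags">Tags</a>
"""

soup = BeautifulSoup(html, "html.parser")

hrefs = []
for a in soup.find_all("a"):
    url = a.get("href")  # None when the attribute is missing, no KeyError
    if url is not None:
        hrefs.append(url)

print(hrefs)  # ['https://stackoverflow.com/questions', '/tags']
```

Compared with find_all('a', href=True), this keeps every 'a' tag in the loop, which may be handy if you also want to count or log the tags that have no link.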

Unable to put URLs into a list (BeautifulSoup)
