Python - How to get all instances, not just the first one on the page

Time: 2015-09-28 01:25:29

Tags: python web-scraping beautifulsoup

Using findAll gives the error "TypeError: list indices must be integers, not str", whereas using .find does not. Using findall gives the error "TypeError: 'NoneType' object is not callable".

What is the correct way to locate all links with the class "frame" on the page, not just the first instance?

import requests
from bs4 import BeautifulSoup

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/2/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/3/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/4/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/5/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/6/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/7/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/8/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

url = ("http://www.gym-directory.com/listing-category/gyms-fitness-centres/page/9/")
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
print soup.findAll("a",{"class":"frame"})["href"]

1 Answer:

Answer 0 (score: 5)

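The problem is that soup.findAll() returns a list, and you are trying to index that list with ["href"]. A list can only be indexed with integers, so the string key fails; .find works because it returns a single tag, which does accept ["href"].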

What you need to do is iterate over the results and read "href" from each element:

for elem in soup.findAll("a", {"class": "frame"}):
    print elem["href"]
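
As a rough sketch of how the same idea could be applied across all the listing pages without repeating the block nine times, the Python 3 snippet below wraps the loop in a helper and builds the page URLs programmatically. The names frame_links and page_urls and the sample markup are made up for illustration and are not part of the original question; find_all is the current spelling of findAll. It also shows, as far as I can tell, why the lowercase findall fails: BeautifulSoup treats an unknown attribute as a tag lookup, so soup.findall searches for a <findall> tag and yields None, and calling None raises the "'NoneType' object is not callable" error.

```python
from bs4 import BeautifulSoup


def frame_links(html):
    """Return the href of every <a class="frame"> element in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns a list-like ResultSet, so index each element, not the list.
    # href=True skips any matching anchor that has no href attribute.
    return [a["href"] for a in soup.find_all("a", class_="frame", href=True)]


def page_urls(base_url, last_page):
    """Page 1 is the base URL itself; later pages are base_url + 'page/N/'."""
    return [base_url] + ["%spage/%d/" % (base_url, n) for n in range(2, last_page + 1)]


# Hypothetical sample markup, used only so the example runs offline.
sample = (
    '<a class="frame" href="/gym-a/">A</a>'
    '<a class="other" href="/x/">X</a>'
    '<a class="frame" href="/gym-b/">B</a>'
)

links = frame_links(sample)
print(links)

soup = BeautifulSoup(sample, "html.parser")
first = soup.find("a", class_="frame")            # a single Tag: first["href"] works
every = soup.find_all("a", class_="frame")        # a list subclass: every["href"] fails
print(first["href"], isinstance(every, list))
print(soup.findall is None)                       # lowercase findall is a tag lookup
```

With the requests library from the question, the helpers could then be combined along the lines of: for url in page_urls("http://www.gym-directory.com/listing-category/gyms-fitness-centres/", 9): print(frame_links(requests.get(url).text)).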