BeautifulSoup find_all returns an empty list

Time: 2019-08-04 06:16:52

Tags: python-3.x beautifulsoup

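I am trying to scrape a web page for a Udemy course (video 334 of The Modern Python 3 Bootcamp).

The page I am looking at contains quotes, and each quote has an author, a link (a href), and the quote text. I need to collect all of these into a list.

.find_all returns nothing. If I use .select it works, but then I cannot use .find to get what I need, because of the error: AttributeError: 'list' object has no attribute 'find' (why?)

Please see my code below; the comments mark what works and what does not: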

url = "http://quotes.toscrape.com"
url_next = "/page/1"
ori_url = requests.get(f"{url}{url_next}").text
every_thang = []

soup = BeautifulSoup(ori_url, "html.parser")
#all_the_quotes = soup.select(".quote") # this actually works, but cant use .find on it later
all_the_quotes2 = soup.find_all(".quote")

for q in all_the_quotes2:
    every_thang.append({
    "text": all_the_quotes2.find(".text").get_text(),
    "author": all_the_quotes2.find(".author").get_text(),
    "linky": all_the_quotes2.find("a")["href"]
    }) 

#for q in all_the_quotes: # gives error trying to use find
#    every_thang.append({
#    "text": all_the_quotes.find(".text").get_text(),
#    "author": all_the_quotes.find(".author").get_text(),
#    "linky": all_the_quotes.find("a")["href"]
#    }) 

print(all_the_quotes2)

2 answers:

Answer 0 (score: 2)

The correct way to use findAll (spelled find_all in bs4) is:

all_the_quotes2 = soup.find_all("div", {"class": "quote"})
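
To see why the original call came back empty, here is a minimal sketch. It assumes a small hand-written HTML snippet (hypothetical, only imitating the structure of a quote block) so it runs without a network request. A plain string passed to find_all is matched against tag names, not treated as a CSS selector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates one quote block; not copied from the site.
html = """
<div class="quote">
  <span class="text">Sample quote</span>
  <small class="author">Sample Author</small>
  <a href="/author/Sample-Author">(about)</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string is compared with tag names, so this searches for a tag
# literally named ".quote" and finds nothing.
by_name = soup.find_all(".quote")

# Filtering on the class attribute matches the div.
by_attrs = soup.find_all("div", {"class": "quote"})

print(len(by_name))   # 0
print(len(by_attrs))  # 1
```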

Answer 1 (score: 1)

.select() and .find_all() have different interfaces. select() accepts CSS selectors (list of all CSS selectors that BeautifulSoup 4.7.1+ supports), while find_all() does not; it accepts bs4 filters instead (list of bs4 filters).
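
As a rough illustration of the two interfaces, the sketch below runs both against a tiny hypothetical snippet (the markup and variable names are assumptions, not taken from the page). The CSS selector and the keyword filter should pick out the same tags here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this comparison.
html = (
    '<div class="quote"><span class="text">First</span></div>'
    '<div class="quote"><span class="text">Second</span></div>'
    '<p class="other">Not a quote</p>'
)
soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector; the space means "descendant of".
css_texts = [t.get_text() for t in soup.select(".quote .text")]

# find_all() takes filters such as a tag name and the class_ keyword.
filter_texts = [
    q.find(class_="text").get_text()
    for q in soup.find_all("div", class_="quote")
]

print(css_texts)
print(filter_texts)
```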

To select all tags with the class "quote", you can use soup.find_all(class_="quote"):

import requests
from bs4 import BeautifulSoup

url = "http://quotes.toscrape.com"
url_next = "/page/1"
ori_url = requests.get(f"{url}{url_next}").text
every_thang = []

soup = BeautifulSoup(ori_url, "html.parser")
all_the_quotes2 = soup.find_all(class_="quote")

# Call find() on each individual quote tag (q), not on the whole result list.
for q in all_the_quotes2:
    every_thang.append({
        "text": q.find(class_="text").get_text(),
        "author": q.find(class_="author").get_text(),
        "linky": q.find("a")["href"],
    })

from pprint import pprint
pprint(every_thang)

Prints:

[{'author': 'Albert Einstein',
  'linky': '/author/Albert-Einstein',
  'text': '“The world as we have created it is a process of our thinking. It '
          'cannot be changed without changing our thinking.”'},
 {'author': 'J.K. Rowling',
  'linky': '/author/J-K-Rowling',
  'text': '“It is our choices, Harry, that show what we truly are, far more '
          'than our abilities.”'},
 {'author': 'Albert Einstein',
  'linky': '/author/Albert-Einstein',
  'text': '“There are only two ways to live your life. One is as though '
          'nothing is a miracle. The other is as though everything is a '
          'miracle.”'},
 {'author': 'Jane Austen',
  'linky': '/author/Jane-Austen',
  'text': '“The person, be it gentleman or lady, who has not pleasure in a '
          'good novel, must be intolerably stupid.”'},
 {'author': 'Marilyn Monroe',
  'linky': '/author/Marilyn-Monroe',
  'text': "“Imperfection is beauty, madness is genius and it's better to be "
          'absolutely ridiculous than absolutely boring.”'},
 {'author': 'Albert Einstein',
  'linky': '/author/Albert-Einstein',
  'text': '“Try not to become a man of success. Rather become a man of '
          'value.”'},
 {'author': 'André Gide',
  'linky': '/author/Andre-Gide',
  'text': '“It is better to be hated for what you are than to be loved for '
          'what you are not.”'},
 {'author': 'Thomas A. Edison',
  'linky': '/author/Thomas-A-Edison',
  'text': "“I have not failed. I've just found 10,000 ways that won't work.”"},
 {'author': 'Eleanor Roosevelt',
  'linky': '/author/Eleanor-Roosevelt',
  'text': '“A woman is like a tea bag; you never know how strong it is until '
          "it's in hot water.”"},
 {'author': 'Steve Martin',
  'linky': '/author/Steve-Martin',
  'text': '“A day without sunshine is like, you know, night.”'}]