Question

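I am trying to return the teaser titles of all the articles on a page. Whichever page I search with the code below, all I get is

Process finished with exit code 0

and nothing else.

Can someone tell me where I am going wrong? I am using PyCharm 2016.3.2 with Anaconda3.

Thanks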

import requests
from bs4 import BeautifulSoup

if __name__ == "__main__":
    # User agent to bypass scraping security
    agent = {'User-Agent': 'Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405'}
    # Pass the headers by keyword: the second positional argument of requests.get is params
    req = requests.get("http://www.zerohedge.com/", headers=agent)

    #req.content = html page source and we are using the html parser
    soup = BeautifulSoup(req.content, "html.parser")

    for i in soup.find_all("title teaser-title"):
        print(i.text)

Answer 1

You need to specify the tag to search for and, optionally, the class. Like this:

soup.find_all("h2", class_="title teaser-title")

Or use a CSS selector:

soup.select("h2[class='title teaser-title']")
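
To make the difference concrete, here is a minimal sketch that runs both lookups against a small inline HTML string instead of the live site. The sample markup and the helper names exact_string_titles and css_class_titles are made up for illustration; the real page may be structured differently. It also shows a third option, the class selector h2.title.teaser-title, which should match regardless of the order in which the classes are written, whereas the exact-string forms above only match the attribute value "title teaser-title" as written.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, NOT the real page source.
SAMPLE_HTML = """
<div>
  <h2 class="title teaser-title"><a href="/a">First headline</a></h2>
  <h2 class="teaser-title title"><a href="/b">Second headline</a></h2>
  <h2 class="title">Not a teaser</h2>
</div>
"""


def exact_string_titles(html):
    """Match h2 tags whose class attribute is exactly 'title teaser-title'."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("h2", class_="title teaser-title")
    return [tag.get_text(strip=True) for tag in tags]


def css_class_titles(html):
    """Match h2 tags that carry both classes, in any order."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.select("h2.title.teaser-title")
    return [tag.get_text(strip=True) for tag in tags]


if __name__ == "__main__":
    print(exact_string_titles(SAMPLE_HTML))
    print(css_class_titles(SAMPLE_HTML))
```

The original loop printed nothing because find_all("title teaser-title") looks for a tag with that name, and no such tag exists, so the loop body never runs and the script simply exits with code 0.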

Using requests.get and BeautifulSoup to return titles from a page
