Returning titles from a page using requests.get and BeautifulSoup

Time: 2017-02-28 11:39:16

Tags: python-3.x beautifulsoup python-requests anaconda

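I am trying to return the teaser titles of all the articles on a page. No matter which page I search, the code below prints only "Process finished with exit code 0" and nothing else.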

Can someone tell me where I am going wrong? I am using PyCharm 2016.3.2 with Anaconda3.

Thanks

import requests
from bs4 import BeautifulSoup

if __name__ == "__main__":
    # User agent to bypass scraping security
    agent = {'User-Agent': 'Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405'}
    req = requests.get("http://www.zerohedge.com/", agent)

    #req.content = html page source and we are using the html parser
    soup = BeautifulSoup(req.content, "html.parser")

    for i in soup.find_all("title teaser-title"):
        print(i.text)

1 Answer:

Answer 0 (score: 1)

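The first argument to find_all is a tag name, so "title teaser-title" looks for a tag with that name and finds nothing. You need to specify the tag to search for and, optionally, the class. Like this: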

soup.find_all("h2", class_="title teaser-title")

Or use a CSS selector:

soup.select("h2[class='title teaser-title']")
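To see the difference between the calls without hitting the network, here is a minimal sketch run against a small inline document. The markup in SAMPLE_HTML is hypothetical and only assumes the teaser titles are h2 elements with the classes "title teaser-title", as the answer suggests; the real page may be structured differently. The last selector, h2.title.teaser-title, is an alternative not mentioned above that matches the classes in any order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption, not the site's HTML).
SAMPLE_HTML = """
<html><head><title>Sample</title></head><body>
<h2 class="title teaser-title"><a href="/a">First article</a></h2>
<h2 class="title teaser-title"><a href="/b">Second article</a></h2>
<h2 class="teaser-title title">Reordered classes</h2>
<h2 class="title">Other heading</h2>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The question's call: the string is treated as a tag name, so nothing matches.
by_name = soup.find_all("title teaser-title")

# Tag name plus the exact string value of the class attribute.
by_class = soup.find_all("h2", class_="title teaser-title")

# CSS attribute selector: also an exact match on the class attribute string.
by_attr = soup.select("h2[class='title teaser-title']")

# CSS class selector: matches h2 elements carrying both classes, in any order.
by_css = soup.select("h2.title.teaser-title")

print(len(by_name))
for heading in by_class:
    print(heading.get_text(strip=True))
print(len(by_attr), len(by_css))
```

Because the exact-string forms depend on the order of the classes in the markup, the class selector may be the more robust choice if the page ever emits them in a different order. Separately, the second positional argument of requests.get is params, so the agent dictionary in the question is sent as query-string parameters; passing it as headers=agent would send it as a User-Agent header instead.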