How do I get the "title" attribute of an anchor tag using BeautifulSoup4?

Date: 2018-02-08 21:35:57

Tags: python beautifulsoup

I can't figure out how to get the title attribute of an anchor. Here is my code:

import requests
from bs4 import BeautifulSoup

laptops = 'http://webscraper.io/test-sites/e-commerce/allinone/computers/laptops'


def scrape():
    # Fetch the page and parse it with the lxml parser
    page = requests.get(laptops)
    soup = BeautifulSoup(page.content, "lxml")
    # Calling soup(...) is shorthand for soup.find_all(...)
    links = soup("a", {"class": "title"})

    for link in links:
        print(link.prettify())


scrape()

Sample output:

<a class="title" href="/test-sites/e-commerce/allinone/product/251" title="Asus VivoBook X441NA-GA190">
 Asus VivoBook X4...
</a>

<a class="title" href="/test-sites/e-commerce/allinone/product/252" title="Prestigio SmartBook 133S Dark Grey">
 Prestigio SmartB...
</a>

<a class="title" href="/test-sites/e-commerce/allinone/product/253" title="Prestigio SmartBook 133S Gold">
 Prestigio SmartB...
</a>

How do I get the "title"?

1 Answer:

Answer 0: (score: 2)

Attributes such as title can be accessed by subscripting the element, or through the element's .attrs dictionary:

for link in links:
    print(link['title'])

See the BeautifulSoup documentation on Attributes.

For the given URL, this produces:

Asus VivoBook X441NA-GA190
Prestigio SmartBook 133S Dark Grey
Prestigio SmartBook 133S Gold
Aspire E1-510
Lenovo V110-15IAP
Lenovo V110-15IAP
Hewlett Packard 250 G6 Dark Ash Silver
# ... etc
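
To compare the access styles side by side without a network request, here is a minimal sketch that parses an inline HTML string. The first two anchors are copied from the sample output in the question; the third anchor (with no title attribute) and the names HTML, titles and first_attrs are hypothetical, added only for illustration. It uses the built-in "html.parser" instead of "lxml" so that it runs without the extra dependency.

```python
from bs4 import BeautifulSoup

# Two anchors from the question's sample output, plus one hypothetical
# anchor that has no title attribute (an assumption for illustration).
HTML = """
<a class="title" href="/test-sites/e-commerce/allinone/product/251" title="Asus VivoBook X441NA-GA190">Asus VivoBook X4...</a>
<a class="title" href="/test-sites/e-commerce/allinone/product/252" title="Prestigio SmartBook 133S Dark Grey">Prestigio SmartB...</a>
<a class="title" href="/hypothetical/no-title">No title here</a>
"""

soup = BeautifulSoup(HTML, "html.parser")
links = soup.find_all("a", class_="title")

# link["title"] raises KeyError when the attribute is missing;
# link.get("title") returns None instead.
titles = [link.get("title") for link in links]
print(titles)

# .attrs is a plain dict holding every attribute of the tag.
first_attrs = links[0].attrs
print(first_attrs["title"])
print(first_attrs["href"])
```

Subscripting is the shortest form when every matched anchor is known to carry the attribute; .get() is the safer choice when some anchors might lack it.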