Question

The website I am trying to scrape is https://viewyourdeal-gabrielsimone.com

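The product name and the price both sit under each div with class="info-wrapper". I can extract the price without any problem, but when I try to extract the product title I cannot get it as text, because it is an href link. Each product name is inside a div, under the href. So my question is: how do I scrape the product name?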

import json
from bs4 import BeautifulSoup
import requests
import csv
from datetime import datetime

url = 'https://viewyourdeal-gabrielsimone.com'

gmaInfo=[]
response = requests.get(url, timeout=5)
content = BeautifulSoup(response.content, "html.parser")
for info in content.findAll('div', attrs={"class" : "wrapper ease-animation"}):
    gridObject = {
            # "title" holds the Tag returned by find() (or None), not its text
            "title" : info.find('div', attrs={"class" : "title animation allgrey"}),
            "price" : info.find('span', attrs={"class":"red-price"}).text
            }
    print(gridObject)
    # note: mode 'w' inside the loop rewrites index.csv on every iteration
    with open('index.csv', 'w') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow([gridObject])

Answer 1

With the code below, a few items return None. Just add an if condition (if the element exists, then get its text).

{{1}}
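
A minimal sketch of the None check described in this answer, not the answerer's own code. It reuses the class names from the question; the inline SAMPLE_HTML and the extract_items helper are hypothetical stand-ins so the example runs without fetching the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the second
# product deliberately has no title div.
SAMPLE_HTML = """
<div class="wrapper ease-animation">
  <div class="title animation allgrey"><a href="/p/1">Blue Shirt</a></div>
  <span class="red-price">$10</span>
</div>
<div class="wrapper ease-animation">
  <span class="red-price">$20</span>
</div>
"""

def extract_items(html):
    """Return one dict per product, with None where an element is missing."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for info in soup.find_all("div", attrs={"class": "wrapper ease-animation"}):
        title_tag = info.find("div", attrs={"class": "title animation allgrey"})
        price_tag = info.find("span", attrs={"class": "red-price"})
        items.append({
            # find() returns None when nothing matches, so check before reading text
            "title": title_tag.get_text(strip=True) if title_tag else None,
            "price": price_tag.get_text(strip=True) if price_tag else None,
        })
    return items

items = extract_items(SAMPLE_HTML)
print(items)
```

Calling get_text() on the div returns the text of the nested link as well, so the product name comes out as a plain string even though it is wrapped in an a tag.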

Answer 2

I was being too specific with the div class. I changed the class to simply title, and it works well.
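
A short sketch of why the looser class match can help. The markup here is hypothetical (the real page's class list is not shown in the question): when the class string contains several names, BeautifulSoup compares it against the full class attribute, so a product whose div lacks one of the names is skipped, while a single class name matches any element that carries it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: this title div has no "allgrey" class.
SAMPLE_HTML = """
<div class="info-wrapper">
  <div class="title animation"><a href="/p/2">Red Hat</a></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The multi-class string must equal the whole class attribute, so this finds nothing.
exact = soup.find("div", attrs={"class": "title animation allgrey"})

# A single class name matches any div that has "title" among its classes.
loose = soup.find("div", class_="title")

link = loose.find("a")
name = link.get_text(strip=True)  # visible link text, i.e. the product name
href = link["href"]               # link target, if that is needed as well

print(exact, name, href)
```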

How do I scrape hyperlink titles with BeautifulSoup?
