Question

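I have some Python 3.5 code that I want to use to scrape part of a web page, but instead of printing "Thick and Chewy Peanut Butter Chocolate Chip Bars" it prints "None". Do you know why? Thanks.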

import requests, bs4
import tkinter as tk
from tkinter import *
import pymysql
import pymysql.cursors

res = requests.get("http://www.foodnetwork.co.uk/article/traybake-recipes/thick-and-chewy-peanut-butter-chocolate-chip-bars/list-page-2.html")
res.raise_for_status()
recipeSoup = bs4.BeautifulSoup(res.text, "html.parser")
type(recipeSoup)
instructions = recipeSoup.find("div", itemprop="name")
try:
    method = str.replace(instructions.get_text(strip=True),". ",".")
    method = str.replace(method, ". ", ".")
    method = (str.replace(method, ".",".\n"))
except AttributeError:
    print(instructions)

Link to scraped page

Answer 1

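Change instructions = recipeSoup.find("div", itemprop="name") to instructions = recipeSoup.find("span", itemprop="name") to get the recipe title. find() returns None when no tag matches, so with "div" the call to get_text() raises AttributeError and the except branch prints None.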

For the instructions, you will have to search for the li tags with itemprop=ingredients.
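
As a rough sketch of both suggestions, the snippet below runs the same lookups against a small inline HTML string instead of the live page. The markup and the ingredient text are made up for illustration and only assume the structure described above (a span carrying itemprop="name" and li tags carrying itemprop="ingredients"); the real page may differ.

```python
import bs4

# Hypothetical markup for illustration only; the real page may be
# structured differently.
SAMPLE_HTML = """
<div class="recipe">
  <span itemprop="name">Thick and Chewy Peanut Butter Chocolate Chip Bars</span>
  <ul>
    <li itemprop="ingredients">peanut butter</li>
    <li itemprop="ingredients">chocolate chips</li>
  </ul>
</div>
"""

soup = bs4.BeautifulSoup(SAMPLE_HTML, "html.parser")

# No <div> carries itemprop="name", so find() returns None here.
missing = soup.find("div", itemprop="name")

# The <span> does match; guard against None before calling get_text().
title_tag = soup.find("span", itemprop="name")
title = title_tag.get_text(strip=True) if title_tag is not None else None

# find_all() returns every matching <li>, or an empty list if none match.
ingredients = [
    li.get_text(strip=True)
    for li in soup.find_all("li", itemprop="ingredients")
]

print(missing)
print(title)
print(ingredients)
```

Checking the result of find() for None explicitly, rather than catching AttributeError, makes it easier to see which lookup failed.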

BeautifulSoup scrape itemprop="name" in Python
