I am trying to get the individual URL of each product from this page: https://www.goodricketea.com/product/darjeeling-tea. How can I do this with BeautifulSoup? Can anyone help me?
Answer 0 (score: 3)
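To get the product links from this site, you can select every <a> element that has an <h2> as a direct child (on this page these appear to be the product title links) and prepend the site's domain to each href: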
import requests
from bs4 import BeautifulSoup

url = "https://www.goodricketea.com/product/darjeeling-tea"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Match each <a> whose direct child is an <h2>; the hrefs are site-relative.
for a in soup.select("a:has(> h2)"):
    print("https://www.goodricketea.com" + a["href"])
Prints:
https://www.goodricketea.com/product/darjeeling-tea/roasted-darjeeling-tea-250gm
https://www.goodricketea.com/product/darjeeling-tea/thurbo-darjeeling-tea-whole-leaf-250gm
https://www.goodricketea.com/product/darjeeling-tea/roasted-darjeeling-tea-organic-250gm
https://www.goodricketea.com/product/darjeeling-tea/roasted-darjeeling-tea-100gm
https://www.goodricketea.com/product/darjeeling-tea/thurbo-darjeeling-tea-whole-leaf-100gm
https://www.goodricketea.com/product/darjeeling-tea/thurbo-darjeeling-tea-fannings-250gm
https://www.goodricketea.com/product/darjeeling-tea/castleton-premium-muscatel-darjeeling-tea-100gm
https://www.goodricketea.com/product/darjeeling-tea/castleton-vintage-darjeeling-tea-250gm
https://www.goodricketea.com/product/darjeeling-tea/castleton-vintage-darjeeling-tea-100gm
https://www.goodricketea.com/product/darjeeling-tea/castleton-vintage-darjeeling-tea-bags-50-tea-bags
https://www.goodricketea.com/product/darjeeling-tea/castleton-vintage-darjeeling-tea-bags-100-tea-bags
https://www.goodricketea.com/product/darjeeling-tea/badamtam-exclusive-organic-darjeeling-tea-250gm
https://www.goodricketea.com/product/darjeeling-tea/badamtam-exclusive-organic-darjeeling-tea-100gm
https://www.goodricketea.com/product/darjeeling-tea/seasons-3-in-1-darjeeling-leaf-tea-150gm-first-flush-second-flush-pre-winter-flush
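If you want to reuse the same idea, here is a minimal sketch that wraps the selector in a function and builds absolute URLs with urljoin instead of string concatenation. The function name extract_product_links and the SAMPLE_HTML snippet are made up for illustration; the snippet only assumes the page wraps each product title in <a><h2>...</h2></a>, as the selector above suggests, and is not the site's real markup.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.goodricketea.com/product/darjeeling-tea"

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="product">
  <a href="/product/darjeeling-tea/roasted-darjeeling-tea-250gm"><h2>Roasted Darjeeling Tea 250gm</h2></a>
  <a href="/cart">Add to cart</a>
</div>
<div class="product">
  <a href="/product/darjeeling-tea/roasted-darjeeling-tea-100gm"><h2>Roasted Darjeeling Tea 100gm</h2></a>
</div>
"""


def extract_product_links(html, base_url):
    """Return absolute URLs of all <a> tags that directly contain an <h2>."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.select("a:has(> h2)"):
        href = a.get("href")  # skip anchors without an href instead of raising
        if not href:
            continue
        full_url = urljoin(base_url, href)  # handles relative and absolute hrefs
        if full_url not in links:  # keep order, drop duplicates
            links.append(full_url)
    return links


links = extract_product_links(SAMPLE_HTML, BASE_URL)
for link in links:
    print(link)
```

To run it against the live page, you could pass requests.get(BASE_URL).content as the html argument. Using a.get("href") rather than a["href"] avoids a KeyError if a matching anchor happens to have no href.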