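Beginner web-scraping code: problem with iteration

Time: 2020-10-30 06:27:42

Tags: python web-scraping

I'm new to Python and would really appreciate your help!

I've been trying to build a dictionary that maps books to their authors, but the result comes out messy and repeats itself.

How can I fix this?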

import requests
from bs4 import BeautifulSoup

url = "https://www.banyen.com/new-arrivals/index.html"
response = requests.get(url)
html = response.content
scraped = BeautifulSoup(html,'html.parser')
results = []

article = scraped.find("div", class_="block block-system block-odd clearfix")
for i in article.find_all():
    name = i.find("h2", "a href", class_="teaser-title")
    author = i.find("span", class_="price-amount")
    if name is not None:
        if author is not None:
          results.append({name:author})

print(results)

1 answer:

Answer 0 (score: 0)

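import re

import requests
from bs4 import BeautifulSoup

url = "https://www.banyen.com/new-arrivals/index.html"
response = requests.get(url)
html = response.content
scraped = BeautifulSoup(html, 'html.parser')
results = []

# Select one container per book (divs whose id contains "node-") instead of
# walking every descendant tag. Wrapper tags that enclose several books each
# report the first book below them, which is the likely source of the repeats.
articles = scraped.find_all("div", id=re.compile("node-"))
for i in articles:
    heading = i.find("h2")
    # Guard against a container without an <h2>, where .find('a') would raise.
    name = heading.find('a') if heading is not None else None
    author = i.find("span", class_="price-amount")
    if name is not None and author is not None:
        # Store the text rather than the Tag objects to keep the output readable.
        results.append({name.text.strip(): author.text})

print(results)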
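
To see why the original loop repeats entries, here is a minimal offline sketch. The HTML in `SAMPLE_HTML` is hypothetical: it only imitates the selectors used above (a `view` wrapper, `node-` ids, `teaser-title`, `price-amount`), and the real page may be structured differently. It also collects the pairs into a single dictionary instead of a list of one-entry dictionaries, which may be closer to what the question asked for.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not copied from the real page.
SAMPLE_HTML = """
<div class="block">
  <div class="view">
    <div id="node-1">
      <h2 class="teaser-title"><a href="/a">Book A</a></h2>
      <span class="price-amount">Author A</span>
    </div>
    <div id="node-2">
      <h2 class="teaser-title"><a href="/b">Book B</a></h2>
      <span class="price-amount">Author B</span>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
block = soup.find("div", class_="block")

# Approach from the question: visit every descendant tag. The "view" wrapper
# and the first book's own container both find "Book A" first.
walked = []
for tag in block.find_all():
    title = tag.find("h2", class_="teaser-title")
    author = tag.find("span", class_="price-amount")
    if title is not None and author is not None:
        walked.append(title.get_text(strip=True))

print(walked)  # ['Book A', 'Book A', 'Book B']

# Approach from the answer: visit one container per book, and build one dict.
books = {}
for node in soup.find_all("div", id=re.compile("node-")):
    heading = node.find("h2")
    link = heading.find("a") if heading is not None else None
    author = node.find("span", class_="price-amount")
    if link is not None and author is not None:
        books[link.get_text(strip=True)] = author.get_text(strip=True)

print(books)  # {'Book A': 'Author A', 'Book B': 'Author B'}
```

Note that a single dictionary keeps only the last author seen for a repeated title; if titles can repeat on the page, the list of pairs may be the safer choice.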