Why does my web scraping script produce no output?

Asked: 2020-01-21 01:11:52

Tags: python html web-scraping

The script below tries to scrape a book website, but it prints nothing. Any suggestions?

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("https://www.bookbub.com/ebook-deals/fantasy-ebooks")
soup_packtpage = BeautifulSoup(html,features="lxml")

all_book_titles = soup_packtpage.find_all("div",class_="views-field-title")

for book_title in all_book_titles:
    book_title_span = book_title.span
    print("Title Name is :"+book_title_span.a.string)
    published_date = book_title.find_next("div",class_="views-field-field-date-of-publication-value")
    print("Published Date is :"+published_date.span.string)
    price = book_title.find_next("div",class_="views-field-sell-price")
    print("Price is :"+price.span.string)

1 answer:

Answer 0 (score: 1)

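The book data is loaded dynamically via JavaScript, so it is not part of the HTML that urlopen() returns. As a result, find_all() most likely matches nothing and the loop body never runs, which is why nothing is printed. However, you can use the requests module to load the JSON feed and get the data from there.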

For example:

import json
import requests

category = 'fantasy'

url = 'https://www.bookbub.com/deals_api/books/latest?category={}&free=false&page=1'

data = requests.get(url.format(category)).json()

# print(json.dumps(data, indent=4)) # <-- uncomment this to see all data

for i, book in enumerate(data['books'], 1):
    print('{:<4} {} by {} ({})'.format(str(i)+'.', book['title'], book['authors'], book['dealPrice']))

Prints:

1.   The Dark Lord by Jack Heckel ($0.99)
2.   Fool Moon by Jim Butcher ($1.99)
3.   A Drop of Magic by L. R. Braden ($0.99)
4.   The Dothan Chronicles: Complete Box Set by Charissa Dufour ($0.99)
5.   The Queen of All Crows by Rod Duncan ($1.99)
6.   The Deepest Blue by Sarah Beth Durst ($1.99)
7.   The Vine Witch by Luanne G. Smith ($1.99)
8.   Magic Burns by Ilona Andrews ($1.99)
9.   Ash Princess by Laura Sebastian ($1.99)
10.  The Scrivener’s Tale by Fiona McIntosh ($1.99)
11.  Dragonslayer by Duncan M. Hamilton ($2.99)
12.  Frey by Melissa Wright (Free!)
13.  Modern Magick: The Road to Farringale by Charlotte E. English (Free!)
14.  Witch’s Bell: Book One by Odette C. Bell (Free!)
15.  Graveyard Shift by Angela Roquet (Free!)
16.  Hidden Blade by Pippa DaCosta (Free!)
17.  Rise of the Dragons by Morgan Rice (Free!)
18.  A Throne for Sisters by Morgan Rice (Free!)
19.  Legacy of Hunger by Christy Nicholas (Free!)
20.  Mistress of Masks by C. Greenwood (Free!)
21.  A Quest of Heroes by Morgan Rice (Free!)
22.  Dragonlands: Volume 1–3 by Megg Jensen (Free!)
23.  Cobweb Bride by Vera Nazarian (Free!)
24.  Born of Water by Autumn M. Birt (Free!)
25.  The Second Sister by Rae D. Magdon (Free!)
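
The same approach can be made a little more defensive. The following is only a minimal sketch, not part of the original answer: it swaps requests for the standard library's urllib.request (the module the question already imports), and the helper names build_url, format_book and fetch_books, the User-Agent header, and the 10-second timeout are all assumptions made for illustration. The endpoint and its query parameters are taken from the answer above; they are undocumented and may change.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the answer above; it is undocumented and may change.
API_URL = "https://www.bookbub.com/deals_api/books/latest"


def build_url(category, page=1, free=False):
    """Build the feed URL with properly encoded query parameters (hypothetical helper)."""
    query = urlencode({"category": category, "free": str(free).lower(), "page": page})
    return "{}?{}".format(API_URL, query)


def format_book(index, book):
    """Format one book entry; missing keys fall back to '?' instead of raising KeyError."""
    return "{:<4} {} by {} ({})".format(
        "{}.".format(index),
        book.get("title", "?"),
        book.get("authors", "?"),
        book.get("dealPrice", "?"),
    )


def fetch_books(category, page=1, timeout=10):
    """Download one page of the feed and return its list of books (empty if absent)."""
    # Assumption: the site may reject urllib's default User-Agent, so send a generic one.
    request = Request(build_url(category, page), headers={"User-Agent": "Mozilla/5.0"})
    # urlopen raises HTTPError/URLError on failure instead of returning silently.
    with urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return data.get("books", [])
```

Looping over enumerate(fetch_books("fantasy"), 1) and printing format_book(i, book) for each entry should give a listing in the same layout as above, provided the feed still returns a "books" list with these keys. Keeping the formatting separate from the download means it can be checked against a hand-written dictionary without touching the network.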