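Selector not working with Scrapy and Selenium

Asked: 2016-11-24 22:56:15

Tags: python selenium scrapy selector

I'm trying to scrape rankings, for example from https://www.shazam.com/charts/top-100/united-states

I'm using Python with Selenium and Scrapy, and the following code doesn't print anything. Why?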

sel = Selector(response)
rank = sel.xpath('//span[@class="number"]/text()').extract()
print(rank)

2 answers:

Answer 0: (score: 0)

I'm not sure about Scrapy, but you can use Selenium + BeautifulSoup.

from selenium import webdriver
from bs4 import BeautifulSoup
import time

# Load the page in a real browser so that its JavaScript runs
driver = webdriver.Chrome()
driver.maximize_window()
baseurl = "https://www.shazam.com/charts/top-100/united-states"
driver.get(baseurl)
time.sleep(2)  # crude wait for the chart to render
content = driver.page_source.encode('utf-8').strip()
soup = BeautifulSoup(content, "html.parser")
rank = soup.find_all("span", {"class": "number"})
title = soup.find_all("a", {"class": "ellip"})
d = [x.text for x in rank]
t = [y.text for y in title]
# Pair each rank with the link text at the same position
for c, v in zip(d, t):
    print(c, v)

driver.quit()

It prints:

01 Black Beatles
02 Rae Sremmurd Feat. Gucci Mane
03 Starboy
04 The Weeknd Feat. Daft Punk
05 Don't Wanna Know
06 Maroon 5 Feat. Kendrick Lamar
07 Bad Things
08 Closer
09 The Chainsmokers Feat. Halsey
10 Love On The Brain
11 Rihanna
12 i hate u, i love u
13 Scars To Your Beautiful
14 Alessia Cara
15 24k Magic
16 Bruno Mars
17 Fake Love
18 Drake
19 Caroline
20 Aminé
21 All Time Low
22 Jon Bellion
23 Let Me Love You
24 DJ Snake Feat. Justin Bieber
25 Gold
26 Kiiara
27 This Town
28 Niall Horan
29 Unsteady
30 X Ambassadors
31 Side To Side
32 Ariana Grande Feat. Nicki Minaj
33 Heathens
34 Twenty One Pilots
35 Starving
36 Hailee Steinfeld & Grey Feat. Zedd
37 In The Name Of Love
38 Martin Garrix Feat. Bebe Rexha
39 The Greatest
40 Sia
41 Do You Mind
42 DJ Khaled
43 Chill Bill
44 Rob $tone Feat. J. Davi$ & Spooks
45 Broccoli
46 D.R.A.M. Feat. Lil Yachty
47 Bounce Back
48 Big Sean
49 Ooouuu
50 Young M.A.
51 Fade
52 Kanye West
53 No Problem
54 Juju On The Beat (TZ Anthem)
55 Love Me Now
56 John Legend
57 What They Want
58 Russ
59 Blue Ain't Your Color
60 Keith Urban
61 Cheap Thrills
62 Sia
63 Pick Up The Phone
64 Young Thug & Travis Scott Feat. Quavo
65 Come And See Me
66 PARTYNEXTDOOR Feat. Drake
67 Mercy
68 Shawn Mendes
69 Bad And Boujee
70 Migos
71 Capsize
72 FRENSHIP & Emily Warren
73 Ain't My Fault
74 Zara Larsson
75 Bailar
76 Deorro Feat. Elvis Crespo
77 Luv
78 Tory Lanez
79 You Was Right
80 Lil Uzi Vert
81 Girlfriend
82 Kap G
83 Fresh Eyes
84 Andy Grammer
85 Way Down We Go
86 Kaleo
87 Key To The Streets
88 YFN Lucci Feat. Migos & Trouble
89 X
90 21 Savage & Metro Boomin Feat. Future
91 All Eyez
92 Better Man
93 Little Big Town
94 Play That Song
95 Train
96 Now And Later
97 Sage The Gemini
98 Used To This
99 Safari
100 Otw

Answer 1: (score: 0)

The data is loaded with some JavaScript, but you don't necessarily need Selenium to get it.

Using the "Network" tab of your browser's dev tools, you should see a request to https://www.shazam.com/shazam/v2/en/FR/web/-/tracks/web_chart_us (or similar; the FR part may be different for you). The response to this request contains all the data you need as JSON.

Example scrapy shell session:

$ scrapy shell https://www.shazam.com/charts/top-100/united-states -s USER_AGENT='Mozilla'
2016-11-25 16:33:51 [scrapy] INFO: Scrapy 1.2.1 started (bot: scrapybot)
(...)
2016-11-25 16:33:51 [scrapy] DEBUG: Crawled (200) <GET https://www.shazam.com/charts/top-100/united-states> (referer: None)
(...)
>>> fetch('https://www.shazam.com/shazam/v2/en/FR/web/-/tracks/web_chart_us')
2016-11-25 16:33:59 [scrapy] DEBUG: Crawled (200) <GET https://www.shazam.com/shazam/v2/en/FR/web/-/tracks/web_chart_us> (referer: None)
>>> import json
>>> from pprint import pprint
>>> data = json.loads(response.text)
>>> len(data['chart'])
100
>>> pprint(data['chart'])
[{u'alias': u'black-beatles',
  u'artists': [{u'alias': u'rae-sremmurd',
                u'follow': {u'followkey': u'A_43974610'},
                u'id': u'43974610'}],
  u'heading': {u'subtitle': u'Rae Sremmurd Feat. Gucci Mane',
               u'title': u'Black Beatles'},
  u'images': {u'default': u'https://images.shazam.com/coverart/t326182348_s400.jpg'},
  u'key': u'326182348',
  u'properties': {u'numberOfShazams': u'172270'},
  u'share': {u'href': u'https://shz.am/t326182348',
             u'image': u'https://images.shazam.com/coverart/t326182348_s400.jpg',
             u'subject': u'Black Beatles - Rae Sremmurd Feat. Gucci Mane',
             u'text': u'I just used Shazam to discover Black Beatles by Rae Sremmurd Feat. Gucci Mane.',
             u'twitter': u'I just used Shazam to discover Black Beatles by Rae Sremmurd Feat. Gucci Mane.'},
  u'stores': {u'apple': {u'actions': [{u'type': u'uri',
                                       u'uri': u'https://itunes.apple.com/fr/album/black-beatles-feat.-gucci/id1104984456?i=1104984917&uo=5&at=1001l4DI&ct=5348615A-616D-3235-3830-44754D6D5973&app=music&upsell=true'}],
                         u'coverarturl': u'https://images.shazam.com/coverart/t326182348-i1104984917_s400.jpg',
                         u'explicit': True,
                         u'previewurl': u'http://audio.itunes.apple.com/apple-assets-us-std-000001/AudioPreview60/v4/37/fb/e5/37fbe552-71d3-22d1-2472-3183d9488eb8/mzaf_909765324228899658.plus.aac.p.m4a',
                         u'productid': u'1104984456',
                         u'trackid': u'1104984917'},
              u'google': {u'actions': [{u'type': u'intent',
                                        u'uri': u'intent://play.google.com/store/music/album?id=Bivnnumjemykgrzbu4poevnlyte&tid=song-Tsrneq2ggvev7qtvnms5qdkpsxq&PAffiliateID=100l3pk#Intent;scheme=https;action=android.intent.action.VIEW;package=com.android.vending;end'},
                                       {u'type': u'uri',
                                        u'uri': u'https://play.google.com/store/music/album?id=Bivnnumjemykgrzbu4poevnlyte&tid=song-Tsrneq2ggvev7qtvnms5qdkpsxq&PAffiliateID=100l3pk'}],
                          u'coverarturl': u'https://images.shazam.com/coverart/t326182348-gTsrneq2ggvev7qtvnms5qdkpsxq_s400.jpg',
                          u'previewurl': u'https://redirector.googlevideo.com/videoplayback?id=2b318fcf59107a39&itag=25&source=skyjam&begin=48000&len=28000&ratebypass=yes&ip=0.0.0.0&ipbits=0&expire=1484996802&sparams=id,itag,source,begin,len,ratebypass,ip,ipbits,expire&signature=7E52D544C5C5BEB3EF3EC984DBD3772660896DB0.BF855DD3332F2FBEB3CD1CB5AAE6DB84B98DF1C9&key=sj3',
                          u'productid': u'Bivnnumjemykgrzbu4poevnlyte',
                          u'trackid': u'Tsrneq2ggvev7qtvnms5qdkpsxq'},
              u'itunes': {u'actions': [{u'type': u'uri',
                                        u'uri': u'https://itunes.apple.com/fr/album/black-beatles-feat.-gucci/id1104984456?i=1104984917&uo=5&at=11l3eE&ct=5348615A-616D-3235-3830-44754D6D5973&app=itunes'}],
                          u'coverarturl': u'https://images.shazam.com/coverart/t326182348-i1104984917_s400.jpg',
                          u'explicit': True,
                          u'previewurl': u'http://audio.itunes.apple.com/apple-assets-us-std-000001/AudioPreview60/v4/37/fb/e5/37fbe552-71d3-22d1-2472-3183d9488eb8/mzaf_909765324228899658.plus.aac.p.m4a',
                          u'productid': u'1104984456',
                          u'trackid': u'1104984917'},
              u'xboxmusic': {u'actions': [{u'type': u'uri',
                                           u'uri': u'http://clkde.tradedoubler.com/click?p=213961&a=2529806&g=0&url=http%3A%2F%2Fmusic.microsoft.com%2FTrack%2F8D6KGX0SHQR8%3Faction%3Dbuy'}],
                             u'coverarturl': u'https://images.shazam.com/coverart/t326182348-xmusic.8D6KGX0SHQR8_s400.jpg',
                             u'previewurl': u'http://progdownload.zune.net/165/990/909/170/audio.mp3?rid=xWwZn16cCkiqXZv0WtFQ6w.2.2',
                             u'productid': u'music.8D6KGX0SHQRF',
                             u'trackid': u'music.8D6KGX0SHQR8'}},
  u'streams': {},
  u'type': u'MUSIC',
  u'url': u'http://www.shazam.com/track/326182348/black-beatles',
  u'urlparams': {u'{trackartist}': u'Rae+Sremmurd',
                 u'{tracktitle}': u'Black+Beatles'}},
 (...)
 {u'alias': u'rivals',
  u'artists': [{u'alias': u'usher',
                u'follow': {u'followkey': u'A_14843'},
                u'id': u'14843'}],
  u'heading': {u'subtitle': u'Usher Feat. Future', u'title': u'Rivals'},
  u'images': {u'default': u'https://images.shazam.com/coverart/t328809516_s400.jpg'},
  u'key': u'328809516',
  u'properties': {u'numberOfShazams': u'12225'},
  u'share': {u'href': u'https://shz.am/t328809516',
             u'image': u'https://images.shazam.com/coverart/t328809516_s400.jpg',
             u'subject': u'Rivals - Usher Feat. Future',
             u'text': u'I just used Shazam to discover Rivals by Usher Feat. Future.',
             u'twitter': u'I just used Shazam to discover Rivals by Usher Feat. Future.'},
  u'stores': {u'amazon': {u'actions': [{u'type': u'intent',
                                        u'uri': u'intent:#Intent;action=com.amazon.mp3.action.EXTERNAL_EVENT;S.com.amazon.mp3.extra.ALBUM_ASIN=B01KYREITM;S.com.amazon.mp3.extra.TRACK_ASIN=B01KYRF0NK;S.com.amazon.mp3.extra.EXTERNAL_EVENT_TYPE=com.amazon.mp3.type.SHOW_ALBUM_DETAIL;end'},
                                       {u'type': u'uri',
                                        u'uri': u'http://www.amazon.fr/dp/B01KYRF0NK/?tag=shazaenterl09-21'}],
                          u'coverarturl': u'https://images.shazam.com/coverart/t328809516-a0886445974379_s400.jpg',
                          u'previewurl': u'http://www.amazon.fr/gp/dmusic/aws/sampleTrack.html?clientid=Shazam&ASIN=B01KYRF0NK',
                          u'productid': u'B01KYREITM',
                          u'trackid': u'B01KYRF0NK'},
              u'apple': {u'actions': [{u'type': u'uri',
                                       u'uri': u'https://itunes.apple.com/fr/album/rivals-feat.-future/id1147225416?i=1147225579&uo=5&at=1001l4DI&ct=5348615A-616D-3235-3830-44754D6D5973&app=music&upsell=true'}],
                         u'coverarturl': u'https://images.shazam.com/coverart/t328809516-i1147225579_s400.jpg',
                         u'explicit': True,
                         u'previewurl': u'http://audio.itunes.apple.com/apple-assets-us-std-000001/AudioPreview71/v4/09/25/92/0925928f-6cbb-261d-4799-86964cdabe3c/mzaf_9178904988224868382.plus.aac.p.m4a',
                         u'productid': u'1147225416',
                         u'trackid': u'1147225579'},
              u'google': {u'actions': [{u'type': u'intent',
                                        u'uri': u'intent://play.google.com/store/music/album?id=Bmxtaeubdrwmky5q5t7bbn2hypi&tid=song-Tpxlav7onrvukibcoc3pmhpc4ge&PAffiliateID=100l3pk#Intent;scheme=https;action=android.intent.action.VIEW;package=com.android.vending;end'},
                                       {u'type': u'uri',
                                        u'uri': u'https://play.google.com/store/music/album?id=Bmxtaeubdrwmky5q5t7bbn2hypi&tid=song-Tpxlav7onrvukibcoc3pmhpc4ge&PAffiliateID=100l3pk'}],
                          u'coverarturl': u'https://images.shazam.com/coverart/t328809516-gTpxlav7onrvukibcoc3pmhpc4ge_s400.jpg',
                          u'previewurl': u'https://redirector.googlevideo.com/videoplayback?id=db65ad4299ce0a70&itag=25&source=skyjam&begin=48000&len=28000&ratebypass=yes&ip=0.0.0.0&ipbits=0&expire=1482570527&sparams=id,itag,source,begin,len,ratebypass,ip,ipbits,expire&signature=99BE695BCFB2DAEC278AC1D024B93D85C48B7AD9.2B7B88C2191ED8B780728DF564436F7021C3426C&key=sj3',
                          u'productid': u'Bmxtaeubdrwmky5q5t7bbn2hypi',
                          u'trackid': u'Tpxlav7onrvukibcoc3pmhpc4ge'},
              u'itunes': {u'actions': [{u'type': u'uri',
                                        u'uri': u'https://itunes.apple.com/fr/album/rivals-feat.-future/id1147225416?i=1147225579&uo=5&at=11l3eE&ct=5348615A-616D-3235-3830-44754D6D5973&app=itunes'}],
                          u'coverarturl': u'https://images.shazam.com/coverart/t328809516-i1147225579_s400.jpg',
                          u'explicit': True,
                          u'previewurl': u'http://audio.itunes.apple.com/apple-assets-us-std-000001/AudioPreview71/v4/09/25/92/0925928f-6cbb-261d-4799-86964cdabe3c/mzaf_9178904988224868382.plus.aac.p.m4a',
                          u'productid': u'1147225416',
                          u'trackid': u'1147225579'},
              u'xboxmusic': {u'actions': [{u'type': u'uri',
                                           u'uri': u'http://clkde.tradedoubler.com/click?p=213961&a=2529806&g=0&url=http%3A%2F%2Fmusic.microsoft.com%2FTrack%2F8D6KGX0RCKRN%3Faction%3Dbuy'}],
                             u'coverarturl': u'https://images.shazam.com/coverart/t328809516-xmusic.8D6KGX0RCKRN_s400.jpg',
                             u'previewurl': u'http://progdownload.zune.net/167/158/853/170/audio.mp3?rid=PswuU8LtZ0mb7KgC5z0WlQ.2.2',
                             u'productid': u'music.8D6KGX0RCKTM',
                             u'trackid': u'music.8D6KGX0RCKRN'}},
  u'streams': {},
  u'type': u'MUSIC',
  u'url': u'http://www.shazam.com/track/328809516/rivals',
  u'urlparams': {u'{trackartist}': u'Usher', u'{tracktitle}': u'Rivals'}}]
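Outside the shell, the same idea can be sketched with the standard library alone. This assumes the JSON keeps the shape shown in the session above (a `chart` list whose items carry `heading.title` and `heading.subtitle`) and that the list order is the chart order, which the session does not state explicitly. `chart_rows` and `fetch_chart` are hypothetical helper names, and the endpoint may have changed or may differ by region.

```python
import json
from urllib.request import Request, urlopen

# Endpoint taken from the shell session above; the "FR" part may differ for you
CHART_URL = "https://www.shazam.com/shazam/v2/en/FR/web/-/tracks/web_chart_us"


def chart_rows(data):
    """Turn the parsed JSON into (rank, title, artist) tuples.

    Assumption: the position in data['chart'] is the chart rank.
    """
    rows = []
    for rank, track in enumerate(data["chart"], start=1):
        heading = track.get("heading", {})
        rows.append((rank, heading.get("title"), heading.get("subtitle")))
    return rows


def fetch_chart(url=CHART_URL):
    """Download and parse the chart JSON, sending a browser-like User-Agent."""
    request = Request(url, headers={"User-Agent": "Mozilla"})
    with urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `chart_rows(fetch_chart())` would then give one tuple per track, with the title and artist already paired because both come from the same `heading` object.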
