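How do I scrape the other tabs?

Date: 2019-12-22 21:01:39

Tags: python web-scraping beautifulsoup

I am trying to scrape the Sportium odds with bs4.

The problem is that bs4 only picks up the football odds tab. I would like to get the odds for all sports, and I hope someone can help me solve this.

Here is my code: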

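import urllib.error
import urllib.request
from bs4 import BeautifulSoup

url = "https://sports.sportium.es/es/apuestasparahoy"
try:
    page = urllib.request.urlopen(url)
except urllib.error.URLError as exc:
    # Stop here: without a response there is nothing to parse.
    raise SystemExit(f"An error occurred: {exc}")
soup = BeautifulSoup(page, "html.parser")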

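1 Answer:

Answer 0: (score: 2)

The page loads its data dynamically via JavaScript/Ajax. However, if you open the Network tab of the Firefox/Chrome developer tools, you can see where and how the page makes its requests.

This example prints the data from every tab: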

import requests
from bs4 import BeautifulSoup

main_url = 'https://sports.sportium.es/es/apuestasparahoy'
url = 'https://sports.sportium.es/web_nr'

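# The "UPCOMING" container on the main page carries the fragment descriptor
# that is sent to the Ajax endpoint below.
main_soup = BeautifulSoup(requests.get(main_url).text, 'html.parser')
frag = main_soup.select_one(
    '.fragment.inplay.expander[data-frag_desc][data-src_code="UPCOMING"]'
)['data-frag_desc']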

data = {
    "key":"CMS.web.cms_handlers.update_fragments",
    "frag": frag,
    "play_mode":"F",
}

soup = BeautifulSoup(requests.post(url, data=data).json()[0], 'html.parser')

sports = {
    'FOOT': 'Football',
    'BASK': 'Basketball',
    'BASE': 'Baseball',
    'CRIC': 'Cricket',
    'DART': 'Darts',
    'ESPS': 'E-Sports',
    'AMFO': 'American Football',
    'ICEH': 'Ice Hockey',
    'VOLL': 'Volleyball',
}

for div in soup.select('div[id^="upcoming-tab"]'):

    code = div['id'].replace('upcoming-tab-', '')
    # Fall back to the raw code for sports missing from the mapping.
    print('Sport :', sports.get(code, code))

    for tr in div.select('tr'):
        print(tr.get_text(separator='|', strip=True).split('|')[1:])

    print('-' * 80)

Prints:

Sport : Football
['23:00', '22 Dic', 'Humble Lions', '6/5', '2.20', '+120', 'X', '9/5', '2.80', '+180', 'Harbour View', '9/4', '3.25', '+225', '+34', 'st']
['00:30', '23 Dic', 'Blooming Santa Cruz', '4/11', '1.36', '-275', 'X', '7/2', '4.60', '+350', 'Guabira Montero', '11/2', '6.50', '+550', '+40', 'st']
['01:00', '23 Dic', 'Mount Pleasant FA', '4/6', '1.66', '-150', 'X', '9/4', '3.25', '+225', 'Cavalier', '7/2', '4.50', '+350', '+35', 'st']
['02:00', '23 Dic', 'Arnett Gardens', '5/6', '1.83', '-120', 'X', '21/10', '3.10', '+210', 'Dunbeholden FC', '3/1', '4.00', '+300', '+35', 'st']
['17:00', '23 Dic', 'PAOK', '2/9', '1.22', '-450', 'X', '17/4', '5.25', '+425', 'Atromitos Athinon', '11/1', '12.00', '+1100', '+38', 'st']
['17:00', '23 Dic', 'Giresunspor', '10/11', '1.90', '-110', 'X', '21/10', '3.10', '+210', 'Altinordu', '13/5', '3.60', '+260', '+33', 'st']
['18:00', '23 Dic', 'Atiker Konyaspor 1922', '17/10', '2.70', '+170', 'X', '2/1', '3.00', '+200', 'Trabzonspor', '7/5', '2.40', '+140', '+151', 'st']
['18:00', '23 Dic', 'Denizlispor', '23/10', '3.30', '+230', 'X', '21/10', '3.10', '+210', 'Alanyaspor', '21/20', '2.05', '+105', '+130', 'st']
['20:45', '23 Dic', 'Blackburn', '4/5', '1.80', '-125', 'X', '5/2', '3.50', '+250', 'Wigan', '10/3', '4.40', '+333', '+152', 'st']
--------------------------------------------------------------------------------
Sport : Basketball
['23:00', '22 Dic', 'TCU', '13/20', '1.65', '-154', 'Xavier', '21/20', '2.05', '+105', '+3']
['23:00', '22 Dic', 'San Jose State', '13/10', '2.30', '+130', 'Cal Riverside', '8/15', '1.53', '-188', '+3']
['23:00', '22 Dic', 'Boise State', '8/11', '1.72', '-138', 'Georgia Tech', '19/20', '1.95', '-106', '+3']
['23:00', '22 Dic', 'Fuerza Regia de Monterrey', '1/10', '1.10', '-1000', 'Correcaminos UAT Victoria', '5/1', '6.00', '+500', '+7', 'st']

... and so on.
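
If you want structured records rather than flat lists, the rows printed above could be grouped roughly as follows. This is only a sketch: `parse_row` is a hypothetical helper, not part of the original answer, and it assumes each outcome occupies four consecutive cells (name, fractional odds, decimal odds, American odds), which is inferred from the sample output and may not hold for every sport or after a site change. The meaning of the trailing cells (such as '+34' and 'st') is not known here, so they are kept untouched under `extra`.

```python
import re

# Matches a fractional price such as '6/5' or '21/10'.
FRACTIONAL = re.compile(r'^\d+/\d+$')


def parse_row(cells):
    """Group one printed row into a dict (hypothetical helper).

    Assumes the layout seen in the sample output: time, date, then
    repeating groups of name / fractional / decimal / American odds.
    """
    time, date, rest = cells[0], cells[1], cells[2:]
    outcomes = []
    i = 0
    while i + 4 <= len(rest):
        name, fractional, decimal, american = rest[i:i + 4]
        if not FRACTIONAL.match(fractional):
            # Layout differs from the assumption; stop grouping here.
            break
        outcomes.append({
            'name': name,
            'fractional': fractional,
            'decimal': float(decimal),
            'american': american,
        })
        i += 4
    # Anything that did not fit the four-cell pattern is kept as-is.
    return {'time': time, 'date': date, 'outcomes': outcomes, 'extra': rest[i:]}


# Usage with the first football row from the output above.
row = ['23:00', '22 Dic', 'Humble Lions', '6/5', '2.20', '+120',
       'X', '9/5', '2.80', '+180',
       'Harbour View', '9/4', '3.25', '+225', '+34', 'st']
parsed = parse_row(row)
print([o['name'] for o in parsed['outcomes']])  # ['Humble Lions', 'X', 'Harbour View']
```

Because the answer's code splits the row text on '|', a team name that itself contains '|' would shift the cells; in that case it may be safer to read the individual table cells of each row instead of splitting the joined text.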