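Hello, for some reason I can't seem to scrape any results data from https://www.gbgb.org.uk/ with BeautifulSoup. I can print the results page I want with prettify, but when I call find_all, for example, it returns 0 matches. Can anyone see what I am doing wrong? The same code works fine on other sites. Below is a quick example of what I mean. Many thanks.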
import urllib.request
import urllib.parse
from requests import get
url = 'https://www.gbgb.org.uk/meeting/?meetingId=355490&raceId=577749'
response = get(url)
#print(response.text[:500])
headers = {}
headers['User-Agent'] ="Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
req = urllib.request.Request(url, headers = headers)
from bs4 import BeautifulSoup
html_soup = BeautifulSoup(response.text, 'html.parser')
type(html_soup)
#print(html_soup.prettify())
info_container = html_soup.find_all('div', class_ = 'MeetingRaceTrap')
print(type(info_container))
print(len(info_container))
Answer 0 (score: 1)
If you open the Network tab in your browser's developer tools, you will find the following API, which returns the results in JSON format.
https://api.gbgb.org.uk/api/results/meeting/355490?meeting=355490
You don't need BeautifulSoup here.
import requests
import json
# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'
}
url = 'https://api.gbgb.org.uk/api/results/meeting/355490?meeting=355490'
response = requests.get(url, headers=headers)
# Decode the JSON body into Python lists and dicts
data = json.loads(response.text)
print(data)
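In case it helps, here is a minimal sketch of how the API URL could be built from the page URL in the question. It assumes the pattern seen above (the meetingId query parameter of the page reused in the API path and in the meeting parameter) holds for other meetings too, which I have not verified. The helper name api_url_for_meeting_page and the API_TEMPLATE constant are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: every meeting follows the same URL pattern as meeting 355490.
API_TEMPLATE = "https://api.gbgb.org.uk/api/results/meeting/{mid}?meeting={mid}"


def api_url_for_meeting_page(page_url):
    """Hypothetical helper: map a gbgb.org.uk meeting page URL to the API URL."""
    query = parse_qs(urlparse(page_url).query)
    ids = query.get("meetingId")
    if not ids:
        raise ValueError("no meetingId parameter in " + page_url)
    return API_TEMPLATE.format(mid=ids[0])


page_url = 'https://www.gbgb.org.uk/meeting/?meetingId=355490&raceId=577749'
print(api_url_for_meeting_page(page_url))
# prints https://api.gbgb.org.uk/api/results/meeting/355490?meeting=355490
```

The returned string can then be passed to requests.get in place of the hard-coded url.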
Now, suppose you want the races entry. Just print it:
print(data[0]['races'])
Or perhaps you want the race prizes:
# Each entry in 'races' is one race; print its prize summary
for race in data[0]['races']:
    print(race['racePrizes'])
Your output will be:
1st £95 | Others £40 | Race Total £95
1st £95 | Others £40 | Race Total £295
1st £105 | Others £40 | Race Total £305
1st £100 | Others £40 | Race Total £300
1st £120 | Others £40 | Race Total £320
1st £110 | Others £40 | Race Total £310
1st £110 | Others £40 | Race Total £310
1st £115 | Others £40 | Race Total £315
1st £120 | Others £40 | Race Total £320
1st £105 | Others £40 | Race Total £305
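If you need the amounts as numbers rather than text, a rough sketch like the following could split one of these prize strings. It assumes every string has exactly the shape shown above (labels and whole-pound amounts separated by " | "); anything else, such as pence or thousands separators, would need extra handling. The function name parse_prizes is made up for illustration.

```python
def parse_prizes(text):
    """Hypothetical helper: turn '1st £95 | Others £40' into {'1st': 95, 'Others': 40}."""
    prizes = {}
    for part in text.split("|"):
        # Split on the last ' £' so labels containing spaces stay intact
        label, _, amount = part.strip().rpartition(" £")
        prizes[label] = int(amount)  # assumes whole pounds, as in the output above
    return prizes


print(parse_prizes("1st £105 | Others £40 | Race Total £305"))
# prints {'1st': 105, 'Others': 40, 'Race Total': 305}
```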
To get all the dog names, iterate over each race and then over its traps:
import requests
import json
# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'
}
url = 'https://api.gbgb.org.uk/api/results/meeting/355490?meeting=355490'
response = requests.get(url, headers=headers)
data = json.loads(response.text)
# Outer loop: races in the meeting; inner loop: traps (dogs) in each race
for race in data[0]['races']:
    for dog in race['traps']:
        print(dog['dogName'])
This prints all 60 names:
Talking Lulu
Demolition Dolly
Holycross Jo Jo
Fieldview Gramps
Fieldview Darcie
Blackrose Frog
Kilbreedy Gaga
Yorkstreet Milly
Blackrose Angus
Greencroft Snowy
Marcos Veggera
Ramors Flash
Dan The Tail
Killinan Fairy
Knockalton Bella
Howl At The Moon
Westmead Boss
Rockhill Romeo
Fieldview Gem
Only One Ding
Fieldview Jet
Leazes Samuel
Glassmoss Sally
Fieldview Franky
Talamh Dochais
Greencroft Spot
Greencroft Jed
Footfield Bee
Hather Pixie
Makeit My Dog
Makeit Mos Bro
Droopys Cristina
Puckane Panda
Hollywood Coco
Fieldview Dolly
Ballyphilip Bill
Bees Charm
Crossfield Hal
Savana Jody
Savana Hottie
Greencroft Briny
Savana Dan Dan
Savana Diamond
Savana Schnappes
Savana Pegasus
Millroad Captian
Savana Pimms
Ballyhoe Vouga
Fieldview Myles
Hollander
Savana Tequila
Ballygibba Chip
Rockburst Tess
All About Will
Clockwork Girl
Roma Lady
Fieldview Pancho
Harry Boy
Rahyvira Lady
Cobblers Girl
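If you would rather keep the names grouped per race instead of one flat list, something along these lines might work. It is only a sketch: the sample data below is invented to mirror the keys used above ('races', 'traps', 'dogName'), the function name dogs_by_race is made up, and the real response may contain races without a 'traps' key, which is why .get is used.

```python
def dogs_by_race(data):
    """Hypothetical helper: return one list of dog names per race."""
    return [
        [trap["dogName"] for trap in race.get("traps", [])]
        for race in data[0]["races"]
    ]


# Invented sample with the same nesting as the API response used above
sample = [{
    "races": [
        {"traps": [{"dogName": "Example Dog A"}, {"dogName": "Example Dog B"}]},
        {"traps": [{"dogName": "Example Dog C"}]},
    ]
}]

for number, names in enumerate(dogs_by_race(sample), start=1):
    print(f"Race {number}: {', '.join(names)}")
# prints:
# Race 1: Example Dog A, Example Dog B
# Race 2: Example Dog C
```

With the real response you would pass data instead of sample.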