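BeautifulSoup doesn't return the real text from the page source

Time: 2019-12-04 14:02:32

Tags: python web-scraping beautifulsoup python-3.7

I'm trying to scrape football match results from livescore.com using requests and BeautifulSoup. For some reason, instead of the team names and scores, it returns the following: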

03-12-2019 - __home_team__ - __home_score__ - __away_team__ - __away_score__

My code:

import requests
from bs4 import BeautifulSoup
from datetime import date, timedelta

yesterday = date.today() - timedelta(days=1)
checkDate = yesterday.strftime('%Y-%m-%d')
url = 'https://www.livescore.com/soccer/' + checkDate
playDate = yesterday.strftime('%d-%m-%Y')

response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

home = soup.find_all('div', class_='ply tright name')
away = soup.find_all('div', class_='ply name')
hScore = soup.find_all('span', class_='hom')
aScore = soup.find_all('span', class_='awy')

with open('Scores.csv', 'a') as f:
    for h, a, hs, aws in zip(home, away, hScore, aScore):
        f.write(playDate + ',' + h.text + ',' + hs.text + ',' + a.text + ',' + aws.text + '\n')
        print(playDate + ' - ' + h.text + ' ' + hs.text + ' - ' + a.text + ' ' + aws.text)

Page source:

<a href="/soccer/england/premier-league/crystal-palace-vs-afc-bournemouth/6-18427820/" class="match-row scorelink even " data-type="evt" data-id="soccer-6-18427820" data-stg-id="159">
    <div class="min ">
        <div>
            <span>FT</span>
            <span class="ico-alert tt hidden">
                <svg class="inc icon-warning">
                    <use xlink:href="#icon-warning"></use>
                </svg>
                <span class="tip" data-type="tooltip">Limited coverage</span>
            </span>
        </div>
    </div>
    <div class="ply tright name"><span>Crystal Palace</span></div>
    <div class="sco">
        <span class="hom">1</span><span> - </span><span class="awy">0</span>
    </div>
    <div class="ply name"><span>AFC Bournemouth</span></div>
    <div class="star-container" data-type="star-container">
        <div class=" " data-type="star">
            <svg>
                <use xlink:href="#icon-star"></use>
            </svg>
        </div>
    </div>
</a>

Things I've tried:

1.) Getting the 'a' tags (returns nothing)

2.) Using find_all('span', class_=None) (returns a single whitespace character)

Expected output (e.g., with random names), for the CSV file and, in the corresponding format, for the print() function:

04-12-2019,Chelsea,1,1,Liverpool

1 answer:

Answer 0: (score: 0)

You have to use Selenium so that the page can render before you parse it.
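The placeholders in the question suggest that requests receives the page before any script has filled in the match rows, so BeautifulSoup is faithfully returning what is in that raw HTML. A minimal sketch of the effect, assuming the unrendered markup looks roughly like a match row with placeholder text (`raw_html` is a hypothetical stand-in, not the site's actual response):

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the unrendered HTML that requests receives;
# the placeholder names are taken from the output shown in the question.
raw_html = """
<a class="match-row">
  <div class="ply tright name"><span>__home_team__</span></div>
  <div class="sco"><span class="hom">__home_score__</span><span> - </span><span class="awy">__away_score__</span></div>
  <div class="ply name"><span>__away_team__</span></div>
</a>
"""

soup = BeautifulSoup(raw_html, 'html.parser')

# The same lookups as in the question find the elements,
# but their text is still the template placeholder
home = soup.find('div', class_='ply tright name').text
score = soup.find('span', class_='hom').text
print(home, score)
```

The selectors themselves work; the data is simply not in the document yet. The Selenium version below lets a real browser render the page first and hands the rendered HTML to BeautifulSoup.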

from bs4 import BeautifulSoup
from datetime import date, timedelta
from selenium import webdriver

yesterday = date.today() - timedelta(days=1)
# Format the dates from the full date instead of hardcoding the year
checkDate = yesterday.strftime('%Y-%m-%d')
url = 'https://www.livescore.com/soccer/' + checkDate
playDate = yesterday.strftime('%d-%m-%Y')

# Selenium 3 style: the chromedriver path is the first argument.
# In Selenium 4, pass service=Service(path) instead
# (from selenium.webdriver.chrome.service import Service).
driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
driver.get(url)

# Parse the HTML after the browser has rendered it
soup = BeautifulSoup(driver.page_source, 'html.parser')

home = soup.find_all('div', class_='ply tright name')
away = soup.find_all('div', class_='ply name')
hScore = soup.find_all('span', class_='hom')
aScore = soup.find_all('span', class_='awy')

with open('Scores.csv', 'a') as f:
    for h, a, hs, aws in zip(home, away, hScore, aScore):
        f.write(playDate + ',' + h.text + ',' + hs.text + ',' + a.text + ',' + aws.text + '\n')
        print(playDate + ' - ' + h.text + ' ' + hs.text + ' - ' + a.text + ' ' + aws.text)

# quit() closes the browser and also stops the chromedriver process
driver.quit()

Output:

03-12-2019 - Crystal Palace 1 - AFC Bournemouth 0
03-12-2019 - Burnley 1 - Manchester City 4
03-12-2019 - Burton Albion 1 - Southend United 1
03-12-2019 - Eastleigh 0 - Wrexham 2
03-12-2019 - Farsley Celtic 1 - Brackley Town 1
03-12-2019 - Hereford 2 - York City  2
03-12-2019 - Kidderminster Harriers 1 - Gateshead 1
03-12-2019 - Leamington 3 - Darlington 0
03-12-2019 - Hungerford Town 1 - Tonbridge Angels 0
03-12-2019 - Brighton & Hove Albion U21 0 - Newport County * 0
03-12-2019 - Colchester United 1 - Stevenage 2
03-12-2019 - Shrewsbury Town 1 - Manchester City Academy * 1
03-12-2019 - Milton Keynes Dons 2 - Coventry City 0
03-12-2019 - Port Vale * 2 - Mansfield Town 2
03-12-2019 - Portsmouth 2 - Northampton Town 1
03-12-2019 - Salford City 3 - Wolverhampton Wanderers Academy 0
03-12-2019 - Walsall 3 - Chelsea U21 2
03-12-2019 - Cremonese 1 - Empoli 0
03-12-2019 - Genoa 3 - Ascoli 2
03-12-2019 - Fiorentina 2 - Cittadella 0
03-12-2019 - Angers 0 - Marseille 2
03-12-2019 - Bordeaux 6 - Nimes 0
03-12-2019 - Brest 5 - Strasbourg 0
03-12-2019 - Lyon 0 - Lille 1
03-12-2019 - Le Havre 2 - Le Mans 0
03-12-2019 - Auxerre 1 - Valenciennes 1
03-12-2019 - Niort 0 - AC Ajaccio 1
03-12-2019 - Troyes 1 - Rodez 0
03-12-2019 - Grenoble 1 - Clermont Foot 1
03-12-2019 - Chateauroux 1 - Sochaux 1
03-12-2019 - Paris FC 0 - Guingamp 3
03-12-2019 - Lens 3 - Chambly 0
03-12-2019 - Orleans 0 - Lorient 4
03-12-2019 - Royal Antwerp * 3 - Genk 3
03-12-2019 - Sporting Covilha 1 - Benfica 1
03-12-2019 - Brora Rangers 1 - Greenock Morton 3
03-12-2019 - Ayr United 0 - Dunfermline Athletic 1
03-12-2019 - Stenhousemuir 2 - Elgin City 2
03-12-2019 - Panetolikos 5 - Ialysos 1
03-12-2019 - Ergotelis 0 - Trikala 1
03-12-2019 - Fatih Karagumruk SK 1 - Goztepe 2
03-12-2019 - Yeni Malatyaspor 3 - Keciorengucu 1
03-12-2019 - Alanyaspor 5 - Adanaspor 1
03-12-2019 - Esenler Erokspor 0 - Sivasspor 2
03-12-2019 - Fenerbahce 4 - Istanbulspor AS 0
03-12-2019 - Cefn Druids AFC 2 - Cardiff Met University 1
03-12-2019 - TNS 1 - Carmarthen 0
03-12-2019 - Glentoran ? - Glenavon ?
03-12-2019 - Legia Warszawa II 0 - Piast Gliwice 2
03-12-2019 - Gornik Leczna 0 - Legia Warszawa 2
03-12-2019 - Sibenik 0 - NK Lokomotiva 4
03-12-2019 - MTK Budapest 0 - Diosgyori VTK 0
03-12-2019 - Szeged-Grosics Akademia 0 - Fehervar FC 1
03-12-2019 - Gaz Metan Medias 1 - FC Voluntari 0
03-12-2019 - CSM Politehnica Iasi 1 - FC FCSB 2
03-12-2019 - Beroe 3 - CSKA 1948 4
03-12-2019 - Slavia Sofia 1 - Botev Plovdiv 2
03-12-2019 - Bnei Yehuda Tel Aviv FC 1 - Hapoel Raanana FC 1
03-12-2019 - Maccabi Netanya FC 1 - Hapoel Ironi Kiryat Shmona 0
03-12-2019 - Hapoel Kfar Saba FC 0 - Hapoel Beer Sheva FC 1
03-12-2019 - Union 1 - Huracan 0
03-12-2019 - Club Atletico Platense 2 - Atlanta 1
03-12-2019 - Club Atletico Mitre 0 - Independiente Rivadavia 0
03-12-2019 - San Martin San Juan 1 - CA Alvarado 1
03-12-2019 - Santamarina 2 - Villa Dalmine 1
03-12-2019 - Atletico Rafaela 2 - Chacarita Juniors 0
03-12-2019 - Quilmes 1 - Brown de Adrogue 1
03-12-2019 - Gimnasia Mendoza 0 - San Martin de Tucuman 3
03-12-2019 - CR Vasco DA Gama RJ 1 - Cruzeiro 0
03-12-2019 - Royal Pari 0 - San Jose 1
03-12-2019 - Luqueno 0 - General Diaz 5
03-12-2019 - CD Motagua 5 - CD Vida 2
03-12-2019 - Laos U23 0 - Thailand U23 2
03-12-2019 - Indonesia U23 8 - Brunei U23 0
03-12-2019 - Singapore U23 0 - Vietnam U23 1
03-12-2019 - Al Riffa ? - Al Hidd ?
03-12-2019 - Al-Najma Manama ? - Busaiteen ?
03-12-2019 - East Riffa ? - Manama Club ?
03-12-2019 - PSS Sleman 5 - Perseru Badak Lampung 1
03-12-2019 - Persib Bandung 0 - Persela Lamongan 2
03-12-2019 - Al Akhdoud 1 - Ohod 2
03-12-2019 - Al-Wehda 1 - Al Khaleej 0
03-12-2019 - FC Masr * 0 - El Gounah 0
03-12-2019 - Al Ahly 3 - Bani Sweef 1
03-12-2019 - AS Slimane ? - Esperance ?
03-12-2019 - Etoile Metlaoui ? - Etoile du Sahel ?
03-12-2019 - __home_team__ __home_score__ - __away_team__ __away_score__
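
The last line of the output is still the unfilled template row, and zipping four separate lists can drift out of step if any row is missing an element. A minimal sketch of one way to handle both: read each match row as a unit and skip rows whose text is still a placeholder. The helper `parse_matches` and `SAMPLE_HTML` are hypothetical; the first sample row is trimmed from the page source in the question, and the template row's markup is an assumption.

```python
import csv
import io

from bs4 import BeautifulSoup

# Illustrative HTML: the first row is trimmed from the page source in the
# question; the second imitates the unfilled template row (assumed markup).
SAMPLE_HTML = """
<a class="match-row scorelink even ">
  <div class="ply tright name"><span>Crystal Palace</span></div>
  <div class="sco"><span class="hom">1</span><span> - </span><span class="awy">0</span></div>
  <div class="ply name"><span>AFC Bournemouth</span></div>
</a>
<a class="match-row">
  <div class="ply tright name"><span>__home_team__</span></div>
  <div class="sco"><span class="hom">__home_score__</span><span> - </span><span class="awy">__away_score__</span></div>
  <div class="ply name"><span>__away_team__</span></div>
</a>
"""


def parse_matches(html):
    """Return one (home, home_score, away, away_score) tuple per match row."""
    soup = BeautifulSoup(html, 'html.parser')
    matches = []
    for row in soup.select('a.match-row'):
        cells = (
            row.select_one('div.ply.tright.name'),
            row.select_one('span.hom'),
            row.select_one('div.ply.name:not(.tright)'),
            row.select_one('span.awy'),
        )
        if any(cell is None for cell in cells):
            continue  # skip rows that lack a team or a score element
        values = tuple(cell.get_text(strip=True) for cell in cells)
        if any(v.startswith('__') and v.endswith('__') for v in values):
            continue  # skip the unfilled template row
        matches.append(values)
    return matches


matches = parse_matches(SAMPLE_HTML)

# csv.writer quotes fields when needed, e.g. a team name containing a comma
buffer = io.StringIO()
writer = csv.writer(buffer)
for match in matches:
    writer.writerow(('03-12-2019',) + match)

print(buffer.getvalue(), end='')
```

With Selenium, `driver.page_source` would be passed in place of `SAMPLE_HTML`. When writing to a real file instead of a `StringIO`, the csv module expects the file to be opened with `newline=''`.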