How to scrape esoccer results using Python

Time: 2020-05-11 18:48:20

Tags: python web-scraping

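I am trying to scrape the following site: https://bsportsfan.com/le/22724/Esoccer-Pro-Player-Cup--12-mins-play, but I am having some difficulty because each row contains two "a" tags that have no distinguishing attributes. I wrote the code below, but the results are printed one above the other, and I need them on the same line. I am using Python 3.7.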

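import re

import bs4
import requests as rq

url = "https://bsportsfan.com/le/22724/Esoccer-Pro-Player-Cup--12-mins-play"
response = rq.get(url)
# Parse the downloaded page (the original referenced an undefined page_html)
soup = bs4.BeautifulSoup(response.text, 'html.parser')
tables = soup.find_all(attrs={"class": re.compile(r"table table-sm")})

df_dict = dict()

# Print the contents of every date cell in the first table
for i in tables[0].find_all(attrs={"class": "dt_n"}):
    for rows in i:
        df_dict["Data"] = rows
        print(df_dict)

# Print every link in every table; each link ends up on its own line
for i in tables:
    for rows in i.find_all('a'):
        print(rows)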


1 answer:

Answer 0: (score: 1)

import requests
from bs4 import BeautifulSoup

url = "https://bsportsfan.com/le/22724/Esoccer-Pro-Player-Cup--12-mins-play"

soup = BeautifulSoup(requests.get(url).content, 'html.parser')

all_data = []
for tr in soup.table.select('tr'):
    row = [td.get_text(strip=True, separator='|').split('|') for td in tr.select('td')]
    if not row:
        continue
    all_data.append([data for sublist in row for data in sublist if data not in ('v', '-', )])

# pretty-print the data to the screen:
for row in all_data:
    print('{:<15}{:<35}{:<35}{:<5}'.format(*row))

Prints:

05/11 18:20    MatheusTracz99 Esports             MLobaoJr (NSE) Esports             1-0  
05/11 18:00    MLobaoJr (NSE) Esports             MatheusTracz99 Esports             0-0  
05/11 17:40    iDantee (MGCF) Esports             Klinger (R10) Esports              0-3  
05/11 17:40    Gabrielpn (R10) Esports            Brenner (Bundled) Esports          2-3  
05/11 17:20    Brenner (Bundled) Esports          Gabrielpn (R10) Esports            1-1  
05/11 17:20    Klinger (R10) Esports              iDantee (MGCF) Esports             1-2  
05/11 17:00    Nunes21 (STRM) Esports             xPHzin (R10) Esports               1-1  
05/11 17:00    Vitor (WLB) Esports                STRM Solo (G10) Esports            3-1  
05/11 16:40    xPHzin (R10) Esports               Nunes21 (STRM) Esports             3-1  
05/11 16:40    STRM Solo (G10) Esports            Vitor (WLB) Esports                1-2  
05/11 16:20    MatheusTracz99 Esports             Soares (STRM) Esports              5-0  
05/11 16:20    MLobaoJr (NSE) Esports             Gabriel (STRM) Esports             4-0  
05/11 16:00    Soares (STRM) Esports              MatheusTracz99 Esports             2-4  
05/11 16:00    Gabriel (STRM) Esports             MLobaoJr (NSE) Esports             1-1  
05/11 15:40    Brenner (Bundled) Esports          Rampazzo (STRM) Esports            3-3  
05/11 15:40    Pedro7 (SMAYS) Esports             Gabrielpn (R10) Esports            1-2  
05/11 15:20    Rampazzo (STRM) Esports            Brenner (Bundled) Esports          2-4  
05/11 15:20    Gabrielpn (R10) Esports            Pedro7 (SMAYS) Esports             3-1  
05/11 15:00    Bezerra (STRM) Esports             Klinger (R10) Esports              1-2  
05/11 15:00    BitFrank16 (INTZ) Esports          iDantee (MGCF) Esports             1-3  
05/11 14:40    Klinger (R10) Esports              Bezerra (STRM) Esports             2-0  
05/11 14:40    iDantee (MGCF) Esports             BitFrank16 (INTZ) Esports          1-0  
05/11 02:30    Abrucio (R10) Esports              xVega (R10) Esports                0-3  
05/11 02:30    CarlosH10 (RDT) Esports            OdloT_Br Esports                   7-8  
05/11 02:10    xVega (R10) Esports                Abrucio (R10) Esports              2-1  
05/11 02:10    OdloT_Br Esports                   CarlosH10 (RDT) Esports            0-2  
05/11 01:30    OdloT_Br Esports                   xVega (R10) Esports                1-3  
05/11 01:30    CarlosH10 (RDT) Esports            Abrucio (R10) Esports              2-2  
05/11 01:10    xVega (R10) Esports                OdloT_Br Esports                   1-2  
05/11 01:10    Abrucio (R10) Esports              CarlosH10 (RDT) Esports            5-1
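
The question asks specifically about keeping the two "a" tags of a row together. One way to do that, sketched below, is to iterate over each `tr` and collect its links before printing, rather than looping over every `a` in the table. The markup in `SAMPLE_HTML` is a hypothetical stand-in and not copied from the site, so the assumed layout (a `dt_n` date cell, two links in one cell, and the score in the last cell) would need checking against the real page. The `rows_from_table` helper and the dictionary keys are also illustrative names.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the results table (assumed layout).
SAMPLE_HTML = """
<table class="table table-sm">
  <tr>
    <td class="dt_n">05/11 18:20</td>
    <td><a href="/t/1">Home Player</a> v <a href="/t/2">Away Player</a></td>
    <td>1-0</td>
  </tr>
  <tr>
    <td class="dt_n">05/11 18:00</td>
    <td><a href="/t/2">Away Player</a> v <a href="/t/1">Home Player</a></td>
    <td>0-0</td>
  </tr>
</table>
"""


def rows_from_table(html):
    """Return one dict per table row, keeping both links of a row together."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("tr"):
        date_cell = tr.find("td", class_="dt_n")
        links = [a.get_text(strip=True) for a in tr.find_all("a")]
        cells = tr.find_all("td")
        # Skip header rows or rows that do not match the assumed layout.
        if date_cell is None or len(links) < 2 or not cells:
            continue
        rows.append({
            "date": date_cell.get_text(strip=True),
            "home": links[0],
            "away": links[1],
            "score": cells[-1].get_text(strip=True),
        })
    return rows


rows = rows_from_table(SAMPLE_HTML)
for row in rows:
    # Both players of a match are printed on the same line.
    print("{date:<15}{home:<20}{away:<20}{score:<5}".format(**row))
```

Because each row is a dictionary with fixed keys, the resulting list can also be passed to a tabular structure later if one is needed, which seems to be what the `df_dict` in the question was aiming for.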