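I'm trying to scrape the following site: https://bsportsfan.com/le/22724/Esoccer-Pro-Player-Cup--12-mins-play, but I'm running into trouble because each row contains two "a" tags that have no distinguishing attributes. I wrote the code below, but it prints the two values one above the other, and I need them on the same line. I'm using Python 3.7.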
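import re

import bs4
import requests as rq

url = "https://bsportsfan.com/le/22724/Esoccer-Pro-Player-Cup--12-mins-play"
response = rq.get(url)
# Parse the downloaded HTML
soup = bs4.BeautifulSoup(response.text, 'html.parser')
tables = soup.find_all(attrs={"class": re.compile(r"table table-sm")})
df_dict = dict()
# Print the contents of every element with class "dt_n" in the first table
for i in tables[0].find_all(attrs={"class": "dt_n"}):
    for rows in i:
        df_dict["Data"] = rows
        print(df_dict)
# Print every <a> tag; each one ends up on its own line
for i in tables:
    for rows in i.find_all('a'):
        print(rows)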
Answer 0 (score: 1)
import requests
from bs4 import BeautifulSoup

url = "https://bsportsfan.com/le/22724/Esoccer-Pro-Player-Cup--12-mins-play"
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

all_data = []
for tr in soup.table.select('tr'):
    row = [td.get_text(strip=True, separator='|').split('|') for td in tr.select('td')]
    if not row:
        continue
    all_data.append([data for sublist in row for data in sublist if data not in ('v', '-')])

# pretty print the data to screen:
for row in all_data:
    print('{:<15}{:<35}{:<35}{:<5}'.format(*row))
Prints:
05/11 18:20    MatheusTracz99 Esports             MLobaoJr (NSE) Esports             1-0
05/11 18:00    MLobaoJr (NSE) Esports             MatheusTracz99 Esports             0-0
05/11 17:40    iDantee (MGCF) Esports             Klinger (R10) Esports              0-3
05/11 17:40    Gabrielpn (R10) Esports            Brenner (Bundled) Esports          2-3
05/11 17:20    Brenner (Bundled) Esports          Gabrielpn (R10) Esports            1-1
05/11 17:20    Klinger (R10) Esports              iDantee (MGCF) Esports             1-2
05/11 17:00    Nunes21 (STRM) Esports             xPHzin (R10) Esports               1-1
05/11 17:00    Vitor (WLB) Esports                STRM Solo (G10) Esports            3-1
05/11 16:40    xPHzin (R10) Esports               Nunes21 (STRM) Esports             3-1
05/11 16:40    STRM Solo (G10) Esports            Vitor (WLB) Esports                1-2
05/11 16:20    MatheusTracz99 Esports             Soares (STRM) Esports              5-0
05/11 16:20    MLobaoJr (NSE) Esports             Gabriel (STRM) Esports             4-0
05/11 16:00    Soares (STRM) Esports              MatheusTracz99 Esports             2-4
05/11 16:00    Gabriel (STRM) Esports             MLobaoJr (NSE) Esports             1-1
05/11 15:40    Brenner (Bundled) Esports          Rampazzo (STRM) Esports            3-3
05/11 15:40    Pedro7 (SMAYS) Esports             Gabrielpn (R10) Esports            1-2
05/11 15:20    Rampazzo (STRM) Esports            Brenner (Bundled) Esports          2-4
05/11 15:20    Gabrielpn (R10) Esports            Pedro7 (SMAYS) Esports             3-1
05/11 15:00    Bezerra (STRM) Esports             Klinger (R10) Esports              1-2
05/11 15:00    BitFrank16 (INTZ) Esports          iDantee (MGCF) Esports             1-3
05/11 14:40    Klinger (R10) Esports              Bezerra (STRM) Esports             2-0
05/11 14:40    iDantee (MGCF) Esports             BitFrank16 (INTZ) Esports          1-0
05/11 02:30    Abrucio (R10) Esports              xVega (R10) Esports                0-3
05/11 02:30    CarlosH10 (RDT) Esports            OdloT_Br Esports                   7-8
05/11 02:10    xVega (R10) Esports                Abrucio (R10) Esports              2-1
05/11 02:10    OdloT_Br Esports                   CarlosH10 (RDT) Esports            0-2
05/11 01:30    OdloT_Br Esports                   xVega (R10) Esports                1-3
05/11 01:30    CarlosH10 (RDT) Esports            Abrucio (R10) Esports              2-2
05/11 01:10    xVega (R10) Esports                OdloT_Br Esports                   1-2
05/11 01:10    Abrucio (R10) Esports              CarlosH10 (RDT) Esports            5-1
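
If you only need the two link texts side by side, the same row-by-row idea can be tried offline. This is a minimal sketch, not the site's actual markup: the `html` string below is made up for illustration (the real page's HTML is not shown in this thread), and `rows_with_links` is a hypothetical helper name. The point is to gather every "a" tag inside one `tr` before printing, rather than printing each tag as it is found.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: it only imitates a table row that holds two <a> tags.
html = """
<table>
  <tr><th>Time</th><th>Match</th><th>Score</th></tr>
  <tr>
    <td>05/11 18:20</td>
    <td><a href="/t/1">MatheusTracz99 Esports</a> v <a href="/t/2">MLobaoJr (NSE) Esports</a></td>
    <td>1-0</td>
  </tr>
  <tr>
    <td>05/11 18:00</td>
    <td><a href="/t/2">MLobaoJr (NSE) Esports</a> v <a href="/t/1">MatheusTracz99 Esports</a></td>
    <td>0-0</td>
  </tr>
</table>
"""


def rows_with_links(markup):
    """Return one list per table row holding the text of each <a> tag in it."""
    soup = BeautifulSoup(markup, "html.parser")
    result = []
    for tr in soup.find_all("tr"):
        names = [a.get_text(strip=True) for a in tr.find_all("a")]
        if names:  # rows without links (such as the header) are skipped
            result.append(names)
    return result


for names in rows_with_links(html):
    # Both link texts from the same row are joined onto a single line
    print(" v ".join(names))
```

Grouping by `tr` first is what keeps the pair together; calling `print` once per "a" tag, as in the question, is what puts each name on its own line.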