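Extracting additional data with Beautiful Soup

Date: 2020-10-18 17:31:44

Tags: python-3.x web-scraping

I wrote a simple script that finds the players who were booked in a match. I need to go a step further and build two lists (home team and away team) containing each player's name, the card color, and the time of the booking.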

import requests
from bs4 import BeautifulSoup

import warnings
warnings.simplefilter(action='ignore')

url = 'https://www.fcf.cat/acta/2021/futbol-11/preferent-infantil/grup-1/pi/atletic-sant-just-f-c-a/pi/barcelona-fc-b'


soup = BeautifulSoup(requests.get(url, verify=False).text, 'html.parser')

# Yellow cards ("groga") and red cards ("vermella")
targeta_g = soup.find_all(class_="groga-s")
targeta_v = soup.find_all(class_="vermella-s")

print(targeta_g)
print(targeta_v)

Thanks

1 answer:

Answer 0 (score: 0)

This script collects the booked players for each team:

import requests
from bs4 import BeautifulSoup


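def get_players(column):
    players = []
    # Only tables whose header contains "Targetes" (cards) list bookings.
    # ':-soup-contains' needs soupsieve 2.1+; older versions use the deprecated ':contains'.
    for table in column.select('table:has(th:-soup-contains("Targetes"))'):
        for row in table.select('tr:has(td)'):  # skip rows without data cells
            tds = [td.get_text(strip=True) for td in row.select('td')]
            # Row layout: number, player name, minute, card color
            players.append([row.span.text, *tds[1:], 'Yellow' if row.select_one('.groga-s') else 'Red'])
    return players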


url = 'https://www.fcf.cat/acta/2021/futbol-11/preferent-infantil/grup-1/pi/atletic-sant-just-f-c-a/pi/barcelona-fc-b'
soup = BeautifulSoup(requests.get(url, verify=False).content, 'html.parser')

main_columns = soup.select('.col-md-4.p-0_ml')
players = {'Team Home': get_players(main_columns[0]), 'Team Away': get_players(main_columns[2])}
print(players)

For the match above, it prints:

{'Team Home': [['17', 'KOLOMIETS , FYODOR', "22'", 'Red'], ['18', 'RUGGIERO , ANTONIO', "60'", 'Yellow']], 'Team Away': [['11', 'SO DELGADO PINTO, SIDNEY JOSE', "64'", 'Yellow']]}

For a match in which one of the teams has not received a card, for example:

url = 'https://www.fcf.cat/acta/2021/futbol-11/preferent-infantil/grup-1/pi/castelldefels-ue-a/pi/rapitenca-ue-b'
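
Returning to the first match: the question asks for two separate lists (home and away) with name, card color and time. Below is a minimal sketch of one way to split the printed dictionary into such lists. The names `Booking` and `to_bookings` are hypothetical helpers, not part of the answer, and the sketch assumes every row has exactly four fields and that the minute is plain digits followed by an apostrophe (something like "45+2'" would need extra handling).

import typing


class Booking(typing.NamedTuple):
    number: str
    name: str
    minute: int
    color: str


def to_bookings(rows):
    """Turn [number, name, "22'", color] rows into Booking records sorted by minute."""
    bookings = [
        # Assumption: the minute looks like "22'", so stripping the quote leaves digits.
        Booking(number, name, int(minute.rstrip("'")), color)
        for number, name, minute, color in rows
    ]
    return sorted(bookings, key=lambda b: b.minute)


# Sample data copied from the output shown for the first match.
players = {
    'Team Home': [['17', 'KOLOMIETS , FYODOR', "22'", 'Red'],
                  ['18', 'RUGGIERO , ANTONIO', "60'", 'Yellow']],
    'Team Away': [['11', 'SO DELGADO PINTO, SIDNEY JOSE', "64'", 'Yellow']],
}

home = to_bookings(players['Team Home'])
away = to_bookings(players['Team Away'])

for booking in home + away:
    print(booking.name, booking.color, booking.minute)

A named tuple keeps the fields readable (`booking.name` instead of `row[1]`) while still behaving like the plain lists that `get_players` returns.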