How to extract only certain fields with BeautifulSoup

Time: 2019-02-02 14:56:39

Tags: python python-3.x beautifulsoup

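I'm trying to print only the fields that contain England. My current code writes every nationality to a txt file, but I only want the England entries. The page I'm pulling from is https://www.premierleague.com/players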

import requests
from bs4 import BeautifulSoup

# Download the players page and parse it
r = requests.get("https://www.premierleague.com/players")
c = r.content
soup = BeautifulSoup(c, "html.parser")
players = open("playerslist.txt", "w+")

# Writes every nationality found on the page, one per line
for playerCountry in soup.findAll("span", {"class": "playerCountry"}):
    players.write(playerCountry.text.strip())
    players.write("\n")

players.close()

2 answers:

Answer 0: (score: 1)

Just check whether the text is not equal to "England", and if so, skip to the next item in the list:

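for playerCountry in soup.findAll("span", {"class": "playerCountry"}):
    country = playerCountry.text.strip()
    # Skip every entry that is not England
    if country != "England":
        continue
    players.write(country)
    players.write("\n")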
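
As a self-contained illustration of the check described in this answer, here is a minimal sketch that runs against an inline HTML sample instead of the live page. Only the "playerCountry" class comes from the question; the table layout, the "playerName" class, the sample rows, and the helper name players_from are assumptions made up for the example, so the selectors may need adjusting for the real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the "playerCountry" class is taken from the question.
# The table layout and the "playerName" class are assumptions for illustration.
SAMPLE_HTML = """
<table>
  <tr><td><a class="playerName">Player A</a></td>
      <td><span class="playerCountry"> England </span></td></tr>
  <tr><td><a class="playerName">Player B</a></td>
      <td><span class="playerCountry">Spain</span></td></tr>
  <tr><td><a class="playerName">Player C</a></td>
      <td><span class="playerCountry">England</span></td></tr>
</table>
"""


def players_from(html, country="England"):
    """Return the names of players whose country span matches `country`."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for span in soup.find_all("span", class_="playerCountry"):
        # Skip to the next item when the country does not match
        if span.get_text(strip=True) != country:
            continue
        # Walk up to the enclosing row to find the matching name cell
        row = span.find_parent("tr")
        name = row.find("a", class_="playerName") if row else None
        if name is not None:
            names.append(name.get_text(strip=True))
    return names


print(players_from(SAMPLE_HTML))
```

Comparing the stripped text keeps stray whitespace around the country name from breaking the match.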

Answer 1: (score: 1)

Alternatively, you can just use pandas.read_html() and a few lines of code:

import pandas as pd

df = pd.read_html("https://www.premierleague.com/players")[0]
print(df.loc[df['Nationality'] != 'England'])

Prints:

               Player    Position                       Nationality
2        Charlie Adam  Midfielder                          Scotland
3              Adrián  Goalkeeper                             Spain
4        Adrien Silva  Midfielder                          Portugal
5     Ibrahim Afellay  Midfielder                       Netherlands
6         Benik Afobe     Forward  The Democratic Republic Of Congo
7       Sergio Agüero     Forward                         Argentina
9    Soufyan Ahannach  Midfielder                       Netherlands
10       Ahmed Hegazi    Defender                             Egypt
11         Nathan Aké    Defender                       Netherlands
14  Toby Alderweireld    Defender                           Belgium
15       Aleix García  Midfielder                             Spain
17           Ali Gabr    Defender                             Egypt
18         Allan Nyom    Defender                          Cameroon
19        Allan Souza  Midfielder                            Brazil
20          Joe Allen  Midfielder                             Wales
22      Marcos Alonso    Defender                             Spain
23        Paulo Alves  Midfielder                          Portugal
24     Daniel Amartey  Midfielder                             Ghana
25         Jordi Amat    Defender                             Spain
27       Ethan Ampadu    Defender                             Wales
28     Nordin Amrabat     Forward                           Morocco
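
Note that the comparison in this answer uses !=, so the output lists players who are not from England, which is why the printed index has gaps (it starts at 2 and skips 8, for example). To keep only the England rows, as the question asks, the comparison can be flipped to ==. A minimal sketch on a small made-up DataFrame (the rows are invented; only the column names match the printed table):

```python
import pandas as pd

# Made-up rows for illustration; the column names match the printed table above.
df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C"],
    "Position": ["Defender", "Forward", "Midfielder"],
    "Nationality": ["England", "Spain", "England"],
})

# Boolean mask: keep rows whose Nationality is exactly "England"
england = df.loc[df["Nationality"] == "England"]

# The inverse mask, equivalent to the filter used in the answer
others = df.loc[df["Nationality"] != "England"]

print(england)
```

The filtered frame keeps the original row labels; call reset_index(drop=True) on it if a fresh 0-based index is preferred.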