我正在尝试从包含多个表的this URL中删除内容。所需的输出是:
NAME FG% FT% 3PM REB AST STL BLK TO PTS SCORE
Team Jackson (0-8) .4313 .7500 21 71 34 11 12 15 189 1-8-0
Team Keyrouze (4-4) .4441 .8090 31 130 71 18 13 45 373 8-1-0
Nutz Vs. Draymond Green (4-4) .4292 .8769 30 86 66 15 9 28 269 3-6-0
Team Pauls 2 da Wall (3-5) .4784 .8438 40 123 64 18 20 30 316 6-3-0
Team Noey (2-6) .4350 .7679 21 125 62 20 9 33 278 7-2-0
YOU REACH, I TEACH (2-5-1) .4810 .7432 20 114 56 30 7 50 277 2-7-0
Kris Kaman His Pants (5-3) .4328 .8000 20 74 59 20 5 27 238 3-6-0
Duke's Balls In Daniels Face (3-4-1) .5000 .7045 42 139 38 27 22 30 303 6-3-0
Knicks Tape (5-3) .5000 .8152 34 143 92 12 9 47 397 4-5-0
Suck MyDirk (5-3) .4734 .8814 29 106 86 22 17 40 435 5-4-0
In Porzingod We Trust (4-4) .4928 .7222 27 180 95 16 16 46 423 7-2-0
Team Aguilar (6-1-1) .4718 .7053 28 177 65 12 35 48 413 2-7-0
Team Li (7-0-1) .4714 .8118 35 134 74 17 17 47 368 6-3-0
Team Iannetta (4-4) .4527 .7302 22 125 90 20 13 44 288 3-6-0
如果像这样格式化表格太困难了,我想知道如何刮掉所有表格?我刮掉所有行的代码是这样的:
tableStats = soup.find('table', {'class': 'tableBody'})
rows = tableStats.findAll('tr')
for row in rows:
print(row.string)
但它只打印值“TEAM”而没有别的......为什么它不包含表中的所有行?
感谢。
答案 0 :(得分:3)
您应该直接使用更可靠的table
查找行,而不是查找class
标记,例如linescoreTeamRow
。这段代码可以解决问题,
from bs4 import BeautifulSoup
import requests
a = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
soup = BeautifulSoup(a.text, 'lxml')
# searching for the rows directly
rows = soup.findAll('tr', {'class': 'linescoreTeamRow'})
# you will need to isolate elements in the row for the table
for row in rows:
print row.text
答案 1 :(得分:0)
找到了一种方法来准确获取我在问题中指定的二维矩阵。它存储为列表团队。
代码:
from bs4 import BeautifulSoup
import requests
source_code = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
plain_text = source_code.text
soup = BeautifulSoup(plain_text, 'lxml')
teams = []
rows = soup.findAll('tr', {'class': 'linescoreTeamRow'})
# Creates a 2-D matrix.
for row in range(len(rows)):
team_row = []
columns = rows[row].findAll('td')
for column in columns:
team_row.append(column.getText())
print(team_row)
# Add each team to a teams matrix.
teams.append(team_row)
输出:
['Team Jackson (0-10)', '', '.4510', '.8375', '41', '135', '101', '23', '11', '50', '384', '', '5-4-0']
['YOU REACH, I TEACH (3-6-1)', '', '.4684', '.7907', '22', '169', '103', '22', '10', '32', '342', '', '4-5-0']
['Nutz Vs. Draymond Green (4-6)', '', '.4552', '.8372', '30', '157', '68', '15', '16', '39', '356', '', '2-7-0']
["Jesse's Blue Balls (4-5-1)", '', '.4609', '.7576', '47', '158', '71', '30', '20', '38', '333', '', '7-2-0']
['Team Noey (4-6)', '', '.4763', '.8261', '42', '164', '70', '25', '29', '44', '480', '', '5-4-0']
['Suck MyDirk (6-3-1)', '', '.4733', '.8403', '54', '160', '132', '23', '11', '47', '544', '', '4-5-0']
['Kris Kaman His Pants (5-5)', '', '.4569', '.8732', '53', '138', '105', '27', '21', '53', '465', '', '6-3-0']
['Team Aguilar (6-3-1)', '', '.4433', '.7229', '40', '202', '68', '30', '22', '54', '452', '', '3-6-0']
['Knicks Tape (6-3-1)', '', '.4406', '.8824', '52', '172', '108', '24', '13', '49', '513', '', '6-3-0']
['Team Iannetta (4-6)', '', '.5321', '.6923', '24', '146', '94', '32', '16', '60', '428', '', '3-6-0']
['In Porzingod We Trust (6-4)', '', '.4694', '.6364', '37', '216', '133', '31', '21', '77', '468', '', '4-5-0']
['Team Keyrouze (6-4)', '', '.4705', '.8854', '51', '135', '108', '25', '17', '43', '550', '', '5-4-0']
['Team Li (8-1-1)', '', '.4369', '.8182', '57', '203', '130', '34', '22', '54', '525', '', '6-3-0']
['Team Pauls 2 da Wall (5-5)', '', '.4780', '.5970', '27', '141', '47', '19', '25', '28', '263', '', '3-6-0']