soup = soup.find_all('tr'):
[<tr data-row="0"><th class="left " csk="Murray,Jamal"
data-append-csv="murraja01" data-stat="player" scope="row"><a
href="/players/m/murraja01.html">Jamal Murray</a></th><td class="right "
csk="2713" data-stat="mp">45:13</td><td class="right "
data-stat="fg">5</td><td class="right " data-stat="fga">12</td><td class="right "
data-stat="fg_pct">.417</td><td class="right " data-stat="fg3">3</td><td
class="right " data-stat="fg3a">6</td><td class="right "
data-stat="fg3_pct">.500</td><td class="right " data-stat="ft">2</td><td
class="right " data-stat="fta">2</td><td class="right "
data-stat="ft_pct">1.000</td><td class="right " data-stat="orb">1</td><td
class="right " data-stat="drb">3</td><td class="right "
data-stat="trb">4</td><td class="right " data-stat="ast">5</td><td class="right "
data-stat="stl">1</td><td class="right " data-stat="blk">1</td><td
class="right " data-stat="tov">5</td><td class="right "
data-stat="pf">1</td><td class="right " data-stat="pts">15</td><td class="right "
data-stat="plus_minus">+6</td></tr>]
[x.text for x in soup.find_all('tr', {'data-row': 0})]:
['Jamal Murray45:13512.41736.500221.0001345115115+6']
Expected list:
['Jamal Murray', '45:13', '5', '12', '.417', '3', '6', '.500', '2', '2', '1.000', '1', '3', '4', '5', '1', '1', '5', '1', '15', '+6']
How can I add a comma after the text of each cell (the th and td tags) so that the list looks like the expected list above?
Answer 0 (score: 1)
from bs4 import BeautifulSoup as bs
html = '''<tr data-row="0"><th class="left " csk="Murray,Jamal"
data-append-csv="murraja01" data-stat="player" scope="row"><a
href="/players/m/murraja01.html">Jamal Murray</a></th><td class="right "
csk="2713" data-stat="mp">45:13</td><td class="right "
data-stat="fg">5</td><td class="right " data-stat="fga">12</td><td class="right "
data-stat="fg_pct">.417</td><td class="right " data-stat="fg3">3</td><td
class="right " data-stat="fg3a">6</td><td class="right "
data-stat="fg3_pct">.500</td><td class="right " data-stat="ft">2</td><td
class="right " data-stat="fta">2</td><td class="right "
data-stat="ft_pct">1.000</td><td class="right " data-stat="orb">1</td><td
class="right " data-stat="drb">3</td><td class="right "
data-stat="trb">4</td><td class="right " data-stat="ast">5</td><td class="right "
data-stat="stl">1</td><td class="right " data-stat="blk">1</td><td
class="right " data-stat="tov">5</td><td class="right "
data-stat="pf">1</td><td class="right " data-stat="pts">15</td><td class="right "
data-stat="plus_minus">+6</td></tr>'''
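data = []
page = bs(html, 'html.parser')
# The player name sits in the row's <th> header cell
data.append(page.find('th').text.strip())
# Each stat is in its own <td>; collect the text cell by cell
for item in page.find_all('td'):
    data.append(item.text)
print(data)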
Output:
['Jamal Murray', '45:13', '5', '12', '.417', '3', '6', '.500', '2', '2', '1.000', '1', '3', '4', '5', '1', '1', '5', '1', '15', '+6']
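The same idea can be applied to every row at once. Calling `.text` on a whole `tr` joins the text of all its cells into one string, whereas reading each `th`/`td` separately keeps the values apart. The following is only a sketch: the `html` string is a shortened, hypothetical stand-in for the row in the question, and `rows` is an illustrative name.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical stand-in for the table row in the question
html = (
    '<table><tr data-row="0">'
    '<th data-stat="player"><a href="/players/m/murraja01.html">Jamal Murray</a></th>'
    '<td data-stat="mp">45:13</td>'
    '<td data-stat="fg">5</td>'
    '<td data-stat="plus_minus">+6</td>'
    '</tr></table>'
)

soup = BeautifulSoup(html, 'html.parser')

# One inner list per <tr>; find_all accepts a list of tag names,
# so the <th> and <td> cells come back together in document order
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(['th', 'td'])]
    for tr in soup.find_all('tr')
]
print(rows)  # [['Jamal Murray', '45:13', '5', '+6']]
```

For a row like this one, `list(tr.stripped_strings)` should give the same values, but it skips empty cells, so the per-cell version is safer when column positions matter.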