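I need to parse an HTML document containing sports odds. The parsing works and I get the data I need, but at the end of the script I get this error: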
line 36, in <module>
print(line[0].text.split(' ',1), line[1].text.split(' ',1))
IndexError: list index out of range
Sample HTML:
<div class="Section">
<div id="ctl00_Main_ctl00_ctl78_ctl00_SctDescDiv" class="Header">Natsuho Arakawa v Mandy Wagemaker</div><div id="ctl00_Main_ctl00_ctl78_ctl01_Lines" class="Lines">
<div id="ctl00_Main_ctl00_ctl78_ctl01_ctl00_LineDiv" class="Line"><input type="checkbox" id="77" name="77a" value="pt=N#o=5/2#f=46064591#fp=549590497#so=0#ln=#c=13#" onclick="javascript:stopInPlayRefresh();">3.50 Natsuho Arakawa</div>
<div id="ctl00_Main_ctl00_ctl78_ctl01_ctl01_LineDiv" class="Line"><input type="checkbox" id="78" name="78a" value="pt=N#o=2/7#f=46064591#fp=549590498#so=0#ln=#c=13#" onclick="javascript:stopInPlayRefresh();">1.28 Mandy Wagemaker</div>
My code:
for x in soup.find_all('div', ['Lines']):
    line = x.find_all('div')
    print(line[0].text.split(' ',1), line[1].text.split(' ',1))
Result:
['1.33', 'Elena Vesnina'] ['3.25', 'Bojana Jovanovski']
['2.75', 'Irina-Camelia Begu'] ['1.40', 'Kaia Kanepi']
['2.75', 'Polona Hercog'] ['1.40', 'Lucie Safarova']
['1.44', 'Svetlana Kuznetsova'] ['2.62', 'Maria Teresa Torro-Flor']
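Everything works, but I don't understand the error. What is going wrong? Please help. Thanks.
Answer 0 (score: 0)
The error appears only after the valid rows have been printed, so a later 'Lines' block probably contains fewer than two div children, and line[1] (or even line[0]) does not exist for it. This modified version of the code should show that, in some cases, your line list does not have 2 elements: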
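for x in soup.find_all('div', ['Lines']):
    line = x.find_all('div')
    # Index into the list only when both entries are present.
    if len(line) >= 2:
        print(line[0].text.split(' ',1), line[1].text.split(' ',1))
    else:
        # Show the offending block instead of raising IndexError.
        print("line is too short", line)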
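
To see the length check in action without the real page, here is a minimal, self-contained sketch. The SAMPLE_HTML string is made up for illustration (its second 'Lines' block deliberately has only one entry), and extract_odds is a hypothetical helper name, not something from the question. It also assumes the odds entries always carry class "Line", as in the sample HTML above, and uses Python's built-in "html.parser" backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the second "Lines" block has only one "Line" child,
# which is the situation that would raise IndexError in the original loop.
SAMPLE_HTML = """
<div class="Section">
  <div class="Header">Natsuho Arakawa v Mandy Wagemaker</div>
  <div class="Lines">
    <div class="Line"><input type="checkbox">3.50 Natsuho Arakawa</div>
    <div class="Line"><input type="checkbox">1.28 Mandy Wagemaker</div>
  </div>
</div>
<div class="Section">
  <div class="Header">Incomplete market</div>
  <div class="Lines">
    <div class="Line"><input type="checkbox">2.10 Only One Player</div>
  </div>
</div>
"""


def extract_odds(html):
    """Return (complete, skipped): parsed pairs and the blocks that were too short."""
    soup = BeautifulSoup(html, "html.parser")
    complete, skipped = [], []
    for block in soup.find_all("div", class_="Lines"):
        # Match only the child divs with class "Line", not every nested div.
        lines = block.find_all("div", class_="Line")
        if len(lines) >= 2:
            # Split each entry once: odds first, then the player name.
            complete.append(
                [line.get_text(strip=True).split(" ", 1) for line in lines[:2]]
            )
        else:
            # Keep the short block for inspection instead of indexing into it.
            skipped.append([line.get_text(strip=True) for line in lines])
    return complete, skipped


complete, skipped = extract_odds(SAMPLE_HTML)
for pair in complete:
    print(*pair)
print("skipped:", skipped)
```

Collecting the short blocks in a separate list, rather than only printing them, makes it easier to check afterwards which part of the page caused the problem.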