Question

import glob
from bs4 import BeautifulSoup
f = open('csvfile.csv','w')
for file in glob.glob('*.htm'):
    print 'Processing', file
    for y in range(0,3):          
        for x in range(0, 6): 
            soup = BeautifulSoup(open(file).read())
            all_string=soup.find_all("h2")[x].get_text()
            #stack=[]
            #acct.write(", ".join(stack) + '\n')
            f.write(all_string) 
            f.write('\n')
            print(all_string)
    x=0  
f.close()

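Output -

Base-Controlled C-H Cleavage or N-C Bond Formation by Treatment of N2-Derived Iron Nitrides and Imides - Journal of the American Chemical Society (ACS Publications).htm
Abstract
Supporting Information
Vanadium-Catalyzed Reduction of Molecular Dinitrogen into Silylamine under Ambient Reaction Conditions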

Traceback (most recent call last):

  File "", line 1, in
    runfile('/Users/ROXX/Desktop/project/csv1.py', wdir='/Users/ROXX/Desktop/project')

  File "/Users/ROXX/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 880, in runfile
    execfile(filename, namespace)

  File "/Users/ROXX/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
    builtins.execfile(filename, *where)

  File "/Users/ROXX/Desktop/project/csv1.py", line 17, in
    all_string = soup.find_all("h2")[x].get_text()

IndexError: list index out of range

Answer 1

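The error is most likely caused by a file that contains fewer than 6 h2 elements: the inner loop indexes soup.find_all("h2") at positions 0 through 5, so any page with fewer headings raises IndexError. Iterating over the h2 results directly, instead of over a fixed range, should solve the problem.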
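
A minimal sketch of that suggestion, not a drop-in replacement for the question's script. It assumes Python 3 and the built-in "html.parser" backend; the names first_heading_texts and write_headings and the limit parameter are hypothetical, introduced only for illustration. It also parses each file once and drops the outer repeat loop from the question.

```python
import glob


def first_heading_texts(headings, limit=6):
    """Return the text of at most `limit` heading tags, however many exist."""
    # Slicing never raises IndexError: a page with only 3 <h2> tags yields 3 items.
    return [tag.get_text() for tag in headings[:limit]]


def write_headings(pattern="*.htm", out_path="csvfile.csv"):
    """Write up to six <h2> texts per matching file, one per line."""
    # Imported lazily so first_heading_texts can be tried without bs4 installed.
    from bs4 import BeautifulSoup

    with open(out_path, "w", encoding="utf-8") as out:
        for path in glob.glob(pattern):
            print("Processing", path)
            # Parse each file once; opening in binary mode lets
            # BeautifulSoup detect the page encoding itself.
            with open(path, "rb") as fh:
                soup = BeautifulSoup(fh, "html.parser")
            for text in first_heading_texts(soup.find_all("h2")):
                out.write(text + "\n")
```

Calling write_headings() in the folder that holds the .htm files would then write whatever h2 headings each page actually has, up to the limit, instead of failing on the first page with fewer than six.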
