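Python "continue" not properly in loop

Date: 2013-10-25 16:56:11

Tags: python

I'm using BeautifulSoup, and I keep getting a "'continue' not properly in loop" error. So I removed the continue, and then my print statement raised an invalid syntax error. I'm running BS4 and Python 2.7.5. Any help is much appreciated. Here is my code.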

from bs4 import BeautifulSoup

soup = BeautifulSoup (open("43rd-congress.html"))

final_link = soup.p.a
final_link.decompose()

trs = soup.find_all('tr')

for tr in trs:
for link in tr.find_all('a'):
    fulllink = link.get('href')
    print fulllink #print in terminal to verify results

tds = tr.find_all("td")


try: #we are using "try" because the table is not well formatted. 
   names = str(tds[0].get_text()) 
   years = str(tds[1].get_text())
   positions = str(tds[2].get_text())
   parties = str(tds[3].get_text())
   states = str(tds[4].get_text())
   congress = tds[5].get_text()

except:
  print "bad tr string"
  continue 

print names, years, positions, parties, states, congress

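2 Answers:

Answer 0 (score: 1)

Since you are getting these errors, I believe the indentation in your file really is wrong. Your code should probably look like this: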

from bs4 import BeautifulSoup

soup = BeautifulSoup (open("43rd-congress.html"))

final_link = soup.p.a
final_link.decompose()

trs = soup.find_all('tr')

for tr in trs:

    for link in tr.find_all('a'):
        fulllink = link.get('href')
        print fulllink #print in terminal to verify results

    tds = tr.find_all("td")


    try:  # we are using "try" because the table is not well formatted.
        names = str(tds[0].get_text())
        years = str(tds[1].get_text())
        positions = str(tds[2].get_text())
        parties = str(tds[3].get_text())
        states = str(tds[4].get_text())
        congress = tds[5].get_text()

        print names, years, positions, parties, states, congress

    except Exception:  # "except exc" would fail, since exc is never defined
        print "bad tr string"

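In Python, every block of code is nested by indenting it with tabs or spaces. Mixing the two is a bad idea.

In your code, you have a first for loop that iterates over all the tr elements and a second one that prints all the URLs.

But you forgot to indent the blocks that belong inside the first for loop.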
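To see why the indentation matters here, the following minimal sketch compiles two small snippets from strings. The snippets and the helper name `compiles` are made up for illustration and are not taken from the question's file. It shows that a `continue` sitting in a try/except at the top level is rejected at compile time, while the same code indented under the for loop is accepted.

```python
# Hypothetical snippets, kept as strings so that the failing one
# cannot stop this example from running.
misindented = (
    "for n in [1, 2, 3]:\n"
    "    pass\n"
    "try:\n"
    "    value = int('x')\n"
    "except ValueError:\n"
    "    continue\n"  # not inside any loop
)

indented = (
    "for n in [1, 2, 3]:\n"
    "    try:\n"
    "        value = int('x')\n"
    "    except ValueError:\n"
    "        continue\n"  # inside the for loop
)


def compiles(source):
    """Return True if source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(misindented))  # False
print(compiles(indented))     # True
```

Because the check happens when the file is compiled, the error appears before any line of the script runs.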

Edit

Also, you don't need continue in your case. Check my code edit.
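The same pattern can be sketched without BeautifulSoup. In the example below, plain lists stand in for the td cells of each row; the sample rows and the function name `format_rows` are invented for illustration. Because the output step sits inside the try block, a short row simply falls through to the except branch and the loop moves on by itself, so no continue is required.

```python
def format_rows(rows):
    """Return one line per row; rows with fewer than six cells are reported as bad."""
    lines = []
    for cells in rows:
        try:
            # Indexing cells 0..5 raises IndexError for a short row,
            # much like tds[5] would on a malformed table row.
            fields = [cells[i] for i in range(6)]
            lines.append(" ".join(fields))
        except IndexError:
            lines.append("bad tr string")
    return lines


# Made-up sample data: one complete row and one malformed row.
sample_rows = [
    ["Name A", "1873-1875", "Representative", "Party X", "State Y", "43"],
    ["header only"],
]

for line in format_rows(sample_rows):
    print(line)
```

A continue would only be needed if more statements followed the try/except inside the loop body and had to be skipped for bad rows.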

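Answer 1 (score: 0)

The indentation of the print/continue looks off. If it is off, then the except: looks like it is empty, and I'm not sure Python is happy with that.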
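As a quick check of that suspicion, this small sketch compiles a made-up snippet whose except: clause has no indented body. The snippet and the helper name `compile_error_name` are hypothetical. Python rejects such code with an IndentationError, which is a subclass of SyntaxError.

```python
# Hypothetical snippet: the print is not indented under "except:",
# so the except clause has no body.
empty_except = (
    "try:\n"
    "    value = 1\n"
    "except:\n"
    "print('after')\n"
)


def compile_error_name(source):
    """Return the name of the SyntaxError subclass raised, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return type(err).__name__
    return None


print(compile_error_name(empty_except))  # IndentationError
```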

Try commenting out everything unrelated to the try/except and see whether it still raises the error.