Breaking out of nested loops in Python

Time: 2015-03-09 21:01:21

Tags: python

I am running the following code in Python 2.7.x:

def captureAlbumLinks():
    for page in index_pages: # index_pages is a list of URLs
        resp = connect_tor(page)
        soup = BeautifulSoup(resp.read(), from_encoding=resp.info().getparam('charset'))
        try:
            # The below two variables relate to URLs of type string
            x = pickle.load(open("last_passworded_album.p", "rb"))
            y = pickle.load(open("last_accessible_album.p", "rb"))
        except:
            print "There is no pickle file"
        for a in soup.find_all('a', href=True):
            if (root_url + a['href']) == x or (root_url) + a['href'] == y:
                break
            elif "passchk.php" in a['href']:
                passworded_albums.append(root_url + a['href'])
            elif "search.php" in a['href'] or "switch.php" in a['href']:
                pass
            else:
                if ".html" in a['href']:
                    accessible_albums.append(root_url + a['href'])

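Essentially, when "if (root_url + a['href']) == x or (root_url) + a['href'] == y:" is true, I don't want any of the elifs to run; I want to get out of the 'for' loop. However, even when the if statement evaluates to true (verified with a print statement), my code still seems to carry on to the next 'elif'. I think that at the moment I am only breaking out of the 'if' rather than out of the 'for' loop.

I feel this is an indentation problem, but I have tried moving the 'break' around with no joy.

Can anyone help?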

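4 answers:

Answer 0: (score: 1)

Sometimes the pragmatic approach is to wrap the loops in a function and return. In your case you could simply "return" from the existing function, but in general you would write an inner function (called 'loops' here):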

def captureAlbumLinks():
    def loops():
        for page in index_pages: # index_pages is a list of URLs
            resp = connect_tor(page)
            soup = BeautifulSoup(resp.read(), from_encoding=resp.info().getparam('charset'))
            try:
                # The below two variables relate to URLs of type string
                x = pickle.load(open("last_passworded_album.p", "rb"))
                y = pickle.load(open("last_accessible_album.p", "rb"))
            except:
                print "There is no pickle file"
            for a in soup.find_all('a', href=True):
                if (root_url + a['href']) == x or (root_url) + a['href'] == y:
                    # return leaves the inner and the outer for loop at once
                    return
                elif "passchk.php" in a['href']:
                    passworded_albums.append(root_url + a['href'])
                elif "search.php" in a['href'] or "switch.php" in a['href']:
                    pass
                else:
                    if ".html" in a['href']:
                        accessible_albums.append(root_url + a['href'])
    return loops()
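
To see the pattern in isolation, here is a minimal sketch on toy data. The names `collect_with_return`, `pages` and `stop_at` are hypothetical stand-ins for the scraping code above and are not part of the original question; the nested lists play the role of the pages and the links found on them. The sketch is written so that it should run unchanged on Python 2.7 and Python 3.

```python
def collect_with_return(pages, stop_at):
    """Collect links page by page, abandoning everything at the first stop_at."""
    collected = []

    def loops():
        for links in pages:        # outer loop: one list of links per page
            for link in links:     # inner loop: the links on that page
                if link == stop_at:
                    return         # leaves BOTH loops at once
                collected.append(link)

    loops()
    # Code placed here still runs after the early exit from the loops.
    return collected


pages = [["a.html", "b.html"], ["c.html", "STOP", "d.html"], ["e.html"]]
result = collect_with_return(pages, "STOP")
```

The inner function is what makes this useful: the enclosing function can keep working with `collected` after the loops have been abandoned, which a plain `return` from the outer function would not allow.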

Answer 1: (score: 1)

You can simply return to end the function:

import pickle
def capture_album_links():
    for page in index_pages: # index_pages is a list of URLs
        resp = connect_tor(page)
        soup = BeautifulSoup(resp.read(), from_encoding=resp.info().getparam('charset'))
        try:
            # with will automatically close your files
            with open("last_passworded_album.p", "rb") as f1, open("last_accessible_album.p", "rb") as f2:
                x = pickle.load(f1)
                y = pickle.load(f2)
        # catch specific errors
        except (pickle.UnpicklingError, IOError) as e:
            print(e)
            print "There is no pickle file"
            # continue on error or x and y won't be defined
            continue
        for a in soup.find_all('a', href=True):
            if root_url + a['href'] in {x, y}:
                return # just return to end both loops
            elif "passchk.php" in a['href']:
                passworded_albums.append(root_url + a['href'])
            elif "search.php" in a['href'] or "switch.php" in a['href']:
                continue 
            else:
                if ".html" in a['href']:
                    accessible_albums.append(root_url + a['href'])
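
The file handling in this answer can also be tried on its own. The sketch below is one possible variation, not the answer's exact approach: instead of skipping the page with `continue`, a hypothetical helper `load_marker` returns `None` when the pickle cannot be read, so the two markers are always defined. The helper name, the temporary directory and the example URL are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def load_marker(path):
    """Return the object pickled at path, or None if it cannot be read."""
    try:
        with open(path, "rb") as f:    # the file is closed even if load() fails
            return pickle.load(f)
    except (IOError, EOFError, pickle.UnpicklingError):
        # IOError: file missing or unreadable; EOFError: empty file;
        # UnpicklingError: the contents are not a valid pickle.
        return None


# Use a temporary directory so the sketch does not touch real files.
tmp_dir = tempfile.mkdtemp()
marker_path = os.path.join(tmp_dir, "last_accessible_album.p")

missing = load_marker(marker_path)     # the file does not exist yet

with open(marker_path, "wb") as f:
    pickle.dump("http://example.com/album1.html", f)

loaded = load_marker(marker_path)      # now the stored URL comes back
```

With this variation a missing marker simply never matches any link, whereas the answer's `continue` skips the whole page. Which behaviour is preferable depends on what the scraper should do on its first run.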

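Answer 2: (score: 1)

I like refactoring into a function and returning. You could also put the code in a try/except block and raise an exception when you want to break out of all the loops.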
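
As a rough illustration of the try/except idea, here is a minimal sketch on toy data. `StopCollecting`, `collect_with_exception`, `pages` and `stop_at` are hypothetical names chosen for this example; the nested lists stand in for the pages and their links.

```python
class StopCollecting(Exception):
    """Raised to abandon all nested loops at once (hypothetical name)."""


def collect_with_exception(pages, stop_at):
    collected = []
    try:
        for links in pages:
            for link in links:
                if link == stop_at:
                    raise StopCollecting()   # unwinds both loops
                collected.append(link)
    except StopCollecting:
        pass    # nothing to clean up; execution continues after the try block
    return collected


pages = [["a.html", "b.html"], ["c.html", "STOP", "d.html"], ["e.html"]]
result = collect_with_exception(pages, "STOP")
```

Using a dedicated exception class rather than a built-in one keeps the `except` clause from accidentally swallowing unrelated errors raised inside the loops.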

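Answer 3: (score: 0)

Edit: removed a useless comment. Added alternatives:

Besides using return, if you want to do something after the loops, you can use an exception: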

class MyException(Exception):
    pass

def captureAlbumLinks():
    try:
        for page in index_pages: # index_pages is a list of URLs
            resp = connect_tor(page)
            soup = BeautifulSoup(resp.read(), from_encoding=resp.info().getparam('charset'))
            try:
                # The below two variables relate to URLs of type string
                x = pickle.load(open("last_passworded_album.p", "rb"))
                y = pickle.load(open("last_accessible_album.p", "rb"))
            except:
                print "There is no pickle file"
            for a in soup.find_all('a', href=True):
                if (root_url + a['href']) == x or (root_url + a['href']) == y:
                    raise MyException()
                elif "passchk.php" in a['href']:
                    passworded_albums.append(root_url + a['href'])
                elif "search.php" in a['href'] or "switch.php" in a['href']:
                    pass
                else:
                    if ".html" in a['href']:
                        accessible_albums.append(root_url + a['href'])
    except MyException as e:
        pass

Another, perhaps less intuitive, approach is to use the else clause of the for loop, which is executed only when the for loop finishes normally (rather than being exited with break):

def captureAlbumLinks():
    for page in index_pages: # index_pages is a list of URLs
        resp = connect_tor(page)
        soup = BeautifulSoup(resp.read(), from_encoding=resp.info().getparam('charset'))
        try:
            # The below two variables relate to URLs of type string
            x = pickle.load(open("last_passworded_album.p", "rb"))
            y = pickle.load(open("last_accessible_album.p", "rb"))
        except:
            print "There is no pickle file"
        for a in soup.find_all('a', href=True):
            if (root_url + a['href']) == x or (root_url + a['href']) == y:
                break
            elif "passchk.php" in a['href']:
                passworded_albums.append(root_url + a['href'])
            elif "search.php" in a['href'] or "switch.php" in a['href']:
                pass
            else:
                if ".html" in a['href']:
                    accessible_albums.append(root_url + a['href'])
        else:
            continue
        break
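
The for/else idiom is easier to follow next to the behaviour it corrects. The sketch below uses toy data and hypothetical function names (`collect_inner_break_only`, `collect_for_else`); the first function reproduces the situation in the question, where break leaves only the inner loop and the outer loop carries on with the next page.

```python
def collect_inner_break_only(pages, stop_at):
    """Plain break: only the inner loop stops, the next page is still processed."""
    collected = []
    for links in pages:
        for link in links:
            if link == stop_at:
                break              # leaves the inner loop only
            collected.append(link)
    return collected


def collect_for_else(pages, stop_at):
    """for/else: an inner break is propagated to the outer loop."""
    collected = []
    for links in pages:
        for link in links:
            if link == stop_at:
                break              # leaves the inner loop only
            collected.append(link)
        else:
            continue               # inner loop ended without break: next page
        break                      # reached only after an inner break
    return collected


pages = [["a.html", "b.html"], ["c.html", "STOP", "d.html"], ["e.html"]]
inner_only = collect_inner_break_only(pages, "STOP")
both_loops = collect_for_else(pages, "STOP")
```

With this data the first function still collects "e.html" from the last page, while the second stops for good at "STOP".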