Question

for tag in tags:
    Ulist1.append(tag.get('href', None))

if len(Ulist1) > 2:
    print Ulist1[2]
    html = urllib.urlopen(Ulist1[2]).read()
    soup = BeautifulSoup(html)
    tags = soup('a')
    Ulist2 = list ()

    for tag in tags:
        Ulist2.append(tag.get('href', None))

    if len(Ulist2) > 2:
        print Ulist2[2]
        html = urllib.urlopen(Ulist2[2]).read()
        soup = BeautifulSoup(html)
        tags = soup('a')
        Ulist3 = list ()

        for tag in tags:
            Ulist3.append(tag.get('href', None))

        if len(Ulist3) > 2:
            print Ulist3[2]
            html = urllib.urlopen(Ulist3[2]).read()
            soup = BeautifulSoup(html)
            tags = soup('a')
            Ulist4 = list ()

            for tag in tags:
                Ulist4.append(tag.get('href', None))

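This uses Beautiful Soup to parse the HTML and find the link at position 3 (the first link is position 1), then follows that link. The process is repeated 4 times. Is there a more efficient way to do this than using nested loops?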

Answer 1

As Peter Wood said, you can factor this out into a function. Here is one possible implementation that shows the basic idea.

def print_third_recursive(tags, iterations):
    Ulist = [tag.get('href', None) for tag in tags]  # more pythonic
    # Stop when fewer than three links remain or no iterations are left.
    if len(Ulist) > 2 and iterations:
        print Ulist[2]
        html = urllib.urlopen(Ulist[2]).read()
        soup = BeautifulSoup(html)
        new_tags = soup('a')
        print_third_recursive(new_tags, iterations - 1)

print_third_recursive(tags, 3)

If you would like the function to be simpler, this can certainly be done without recursion.

def print_third(tags):
    Ulist = [tag.get('href', None) for tag in tags] # more pythonic
    new_tags = []
    if len(Ulist) > 2:
        print Ulist[2]
        html = urllib.urlopen(Ulist[2]).read()
        soup = BeautifulSoup(html)
        new_tags = soup('a')
    return new_tags

print_third(
    print_third(
        print_third(tags)
    )
)

Neither implementation has a problem if one of the tag lists contains fewer than 3 items: each call simply returns back up through the layers.
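Since the same step is repeated a fixed number of times, a plain loop is a third option. The following is only a sketch under a few assumptions: it is written for Python 3 rather than the Python 2 used above, it swaps Beautiful Soup for the standard library's `html.parser` so that it has no third-party dependency, and the names `HrefCollector`, `fetch`, and `follow_third_link` are hypothetical. Like the original code, it does not resolve relative URLs or handle anchors without an `href`.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Mirrors tag.get('href', None): anchors without href give None.
            self.hrefs.append(dict(attrs).get("href"))


def fetch(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def follow_third_link(html, hops, fetch=fetch):
    """Follow the third link up to `hops` times; return the URLs visited."""
    visited = []
    for _ in range(hops):
        parser = HrefCollector()
        parser.feed(html)
        if len(parser.hrefs) <= 2:
            break  # fewer than three links on this page: stop early
        url = parser.hrefs[2]
        visited.append(url)
        html = fetch(url)
    return visited
```

Passing `fetch` as a parameter is a design choice made here so the loop can be exercised against canned pages without touching the network; calling `follow_third_link(start_html, 3)` with the default would download real pages.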

Answer 2

As Peter and Anthony said, you can extract a method and make things much simpler.

However, as a general rule of thumb, instead of nesting ifs you can change each condition to its complement and return early.

In your example:

if len(Ulist1) > 2:
    do_stuff()
    if len(Ulist1) > 2:
        do_more_stuff()

Instead, you can write it as follows:

if len(Ulist1) <= 2:  # the complement of the original condition
    return
do_stuff()
if len(Ulist1) <= 2:  # the complement of the original condition
    return
do_more_stuff()

So your code can be written as follows:

if len(Ulist1) <= 2:  # complement of len(Ulist1) > 2
    return

print Ulist1[2]
html = urllib.urlopen(Ulist1[2]).read()
soup = BeautifulSoup(html)
tags = soup('a')
Ulist2 = list ()

for tag in tags:
    Ulist2.append(tag.get('href', None))

if len(Ulist2) <= 2:  # complement of len(Ulist2) > 2
    return

print Ulist2[2]
html = urllib.urlopen(Ulist2[2]).read()
soup = BeautifulSoup(html)
tags = soup('a')
Ulist3 = list ()

for tag in tags:
    Ulist3.append(tag.get('href', None))

if len(Ulist3) <= 2:  # complement of len(Ulist3) > 2
    return

print Ulist3[2]
html = urllib.urlopen(Ulist3[2]).read()
soup = BeautifulSoup(html)
tags = soup('a')
Ulist4 = list ()

for tag in tags:
    Ulist4.append(tag.get('href', None))
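Note that a bare `return` is only valid inside a function, so the flattened code above has to live in one. As a small illustration of the early-return idea, here is a sketch in Python 3 (the snippets above are Python 2) that puts a nested version and a guard-clause version side by side. The function names `describe_nested` and `describe_flat` and the extra `None` check are hypothetical, chosen only to show two conditions being flattened.

```python
def describe_nested(links):
    """Nested form: each condition adds one level of indentation."""
    if len(links) > 2:
        if links[2] is not None:
            return "follow " + links[2]
    return "stop"


def describe_flat(links):
    """Guard-clause form: test the complement and return early."""
    if len(links) <= 2:      # complement of len(links) > 2
        return "stop"
    if links[2] is None:     # complement of links[2] is not None
        return "stop"
    return "follow " + links[2]
```

Both functions give the same result for every input; the flat version simply keeps the main path at a single indentation level.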

Of course, I would still recommend extracting a method, as Anthony wrote above.

Hope it helps.

Alternative to using nested ifs
