Question

I have a YAML file that I parse. I use a simple recursive function, but it does not work the way I expect. If I call parse(content, 'mydrive/home/sample/aaaaa1.html'), I get the result home/sample/aaaaa1.html. However, parse(content, 'mydrive/home/sample/sample3.html') returns None.

What am I doing wrong?

from ruamel.yaml import YAML as yaml

content = yaml().load(open(r'/home/doc/sample.yaml', 'r'))

def parse(content, path):
    """
    Parse the YAML.
    """
    for i in content:
        if isinstance(i, dict):
            for item in i:
                if item == 'href':
                    if i[item] in path:
                        return i[item]
                elif item == 'topics':
                    return parse(i[item], path)
                elif item == 'placeholder':
                    pass
                else:
                    print("I did not recognize", item)
        else:
            print("---- not a dictionary ----")

Here is the sample YAML:

- placeholder: Sample
- topics:

    - placeholder: Sample
    - topics:
        - placeholder: Sample
        - topics:
            - href: home/sample/aaaaa1.html
            - href: home/sample/aaaaa2.html

    - placeholder: Sample
    # Comment
    - topics:     
        - href: home/sample/sample1.html
        - href: home/sample/sample2.html
        - href: home/sample/sample3.html

Answer 1

The problem:

Your parse() function never reaches the last topics branch in the tree. Specifically, the trouble starts when it hits these lines:

elif item == 'topics':
    return parse(i[item], path)

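It only dives further into the inner levels and has no way to back out, because you always return whatever the inner parse() call returns. To demonstrate, add this else branch here: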

if item == 'href':
    if i[item] in path:
        return i[item]
    else:  #Add this
        return "I can't get out" #Add this

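You will see that your second call, the one for sample3.html, now returns "I can't get out", because that is where your return chain ends. Without the else, the function returns None when no item in that first branch matches the path.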
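To see the early return in isolation, here is a minimal sketch that swaps the YAML for a hypothetical nested list. The names find_buggy and nested are made up for illustration and are not part of the original code.

```python
def find_buggy(items, target):
    """Search nested lists, but with the same early-return mistake."""
    for entry in items:
        if isinstance(entry, list):
            # Returns the inner result unconditionally, even when it is None,
            # so later siblings of this sublist are never visited.
            return find_buggy(entry, target)
        if entry == target:
            return entry
    # Falling off the end of the loop returns None implicitly.


nested = [["a", "b"], ["c", "d"]]
print(find_buggy(nested, "a"))  # a
print(find_buggy(nested, "c"))  # None
```

The second call fails for the same reason as the sample3.html lookup: the search commits to the first sublist and never comes back for the second.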

The fix:

A simple fix is to change the topics handling as follows:

elif item == 'topics':
    result = parse(i[item], path)
    if result: return result

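This way you always check whether the inner parse() returned something. If it did not, you do not return, and the loop continues with the next item at the outer level.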

Output:

home/sample/aaaaa1.html
home/sample/sample3.html
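For anyone who wants to reproduce that output without the file on disk, the following is a rough, self-contained sketch. It assumes the sample YAML loads into the nested lists and dictionaries written out below by hand (the comment in the YAML is dropped), and it applies the topics fix to the original function with the print() calls left out.

```python
# Hand-written equivalent of the sample YAML (an assumption, not loaded from a file).
content = [
    {"placeholder": "Sample"},
    {"topics": [
        {"placeholder": "Sample"},
        {"topics": [
            {"placeholder": "Sample"},
            {"topics": [
                {"href": "home/sample/aaaaa1.html"},
                {"href": "home/sample/aaaaa2.html"},
            ]},
        ]},
        {"placeholder": "Sample"},
        {"topics": [
            {"href": "home/sample/sample1.html"},
            {"href": "home/sample/sample2.html"},
            {"href": "home/sample/sample3.html"},
        ]},
    ]},
]


def parse(content, path):
    """Return the first href contained in path, or None if there is none."""
    for i in content:
        if isinstance(i, dict):
            for item in i:
                if item == "href":
                    if i[item] in path:
                        return i[item]
                elif item == "topics":
                    # Only return when the inner call actually found a match.
                    result = parse(i[item], path)
                    if result:
                        return result
    return None


print(parse(content, "mydrive/home/sample/aaaaa1.html"))
print(parse(content, "mydrive/home/sample/sample3.html"))
```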

My 2 cents:

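I debugged this by adding 1/2/3 to your topics/placeholder values and stepping through with the debugger to see where the iteration stopped. That helps to visualize the problem. IMHO (and I am still a beginner), a cleaner approach is to assign a result in each check and return it in only one place at the end, which avoids this kind of debugging confusion. By the way, thank you for asking this question. It was a learning process for me too, and I learned about a pitfall of recursive functions. This is how I would write parse():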

def parse(content, path):
    """
    Parse the YAML.
    """
    for i in content:
        result = None  # Default result to None so the return won't trigger.
        if isinstance(i, dict):
            for item in i:
                if item == 'href':
                    if i[item] in path:
                        result = i[item]  # Assign value instead of returning
                elif item == 'topics':
                    result = parse(i[item], path)  # Assign value instead of returning
                elif item == 'placeholder':
                    pass
                else:
                    print("I did not recognize", item)
        else:
            print("---- not a dictionary ----") 
        if result: return result    # only return something if it has found a match.

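I would also update the two print() statements to actually handle those conditions, if that makes sense for your program. Unless you are debugging or want to watch your console, I find prints almost useless. I would either log them or return something to your program, so that the condition does not go unnoticed.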
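As one possible way to do the logging part, here is a sketch that uses the standard logging module in place of print(). The logger name, the message wording, and the small sample list are assumptions chosen for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def parse(content, path):
    """Return the first href contained in path, logging anything unexpected."""
    for i in content:
        if not isinstance(i, dict):
            # Logged instead of printed, so the caller decides where it goes.
            logger.warning("Skipping non-dictionary entry: %r", i)
            continue
        for item in i:
            if item == "href":
                if i[item] in path:
                    return i[item]
            elif item == "topics":
                result = parse(i[item], path)
                if result:
                    return result
            elif item != "placeholder":
                logger.warning("Unrecognized key: %r", item)
    return None


# Hypothetical input with one unknown key and one stray non-dictionary entry.
sample = [{"unknown": 1}, "stray", {"topics": [{"href": "home/a.html"}]}]
print(parse(sample, "mydrive/home/a.html"))  # home/a.html
```

With no logging configuration, Python still writes the two warnings to stderr, while a real program can route them to a file or raise the level to silence them.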

Cannot recurse deeper when parsing YAML
