Ruby: How to write a "DRY" / dynamic / flexible tree-like loop structure

Time: 2011-11-01 13:11:25

Tags: ruby loops dry

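I'm looking for the best way to solve the following structural/logic problem in Ruby:

A website needs to be crawled completely, collecting the title of every page.

However:

  • The tree structure of the website is unknown (how many "levels", "branches", etc.)
  • The code should be "DRY" (= "Don't Repeat Yourself")

The following (simplified) example is, of course, completely stupid: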

url = some_root_url
@title_collection = Array.new

go_to_page(url)
@title_collection << find_all_titles_on_page
urls = find_all_urls_on_page

urls.each do |url|
    go_to_page(url)
    @title_collection << find_all_titles_on_page
    urls = find_all_urls_on_page

    urls.each do |url|
        go_to_page(url)
        @title_collection << find_all_titles_on_page
        urls = find_all_urls_on_page

        urls.each do |url|
            go_to_page(url)
            @title_collection << find_all_titles_on_page
            urls = find_all_urls_on_page

            urls.each do |url|
                go_to_page(url)
                @title_collection << find_all_titles_on_page
                urls = find_all_urls_on_page

                urls.each do |url|
                    go_to_page(url)
                    @title_collection << find_all_titles_on_page
                    urls = find_all_urls_on_page

                    [...]
                end
            end
        end
    end
end

So how would you achieve this in a "DRY", flexible, and efficient way?

Many thanks!

Tom

1 Answer:

Answer 0 (score: 2)

Recursion is your friend:

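def walk_tree(url)
  go_to_page(url)
  # Collect into a local array; Array() also copes with a single title or nil.
  titles = Array(find_all_titles_on_page)
  # Read the links before recursing, because go_to_page changes the current page.
  urls = find_all_urls_on_page

  urls.each do |child_url|
    # concat keeps the result flat; << would nest one array per page.
    titles.concat(walk_tree(child_url))
  end

  # Note: pages that link back to each other would recurse forever here,
  # so a real crawler also needs to remember which URLs it has visited.
  titles
end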
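
Here is a minimal runnable sketch of the same idea, with one addition: a set of already-visited URLs, so that pages linking back to each other do not cause endless recursion. To keep it self-contained, the site is faked with an in-memory hash. `SITE`, `fetch_page` and `collect_titles` are hypothetical names made up for this illustration; in a real crawler `fetch_page` would be replaced by whatever you use to load and parse a page.

```ruby
require "set"

# Hypothetical in-memory "site": URL => page title and outgoing links.
SITE = {
  "/"    => { title: "Home", links: ["/a", "/b"] },
  "/a"   => { title: "A",    links: ["/a/1", "/"] }, # links back to the root
  "/b"   => { title: "B",    links: ["/a/1"] },
  "/a/1" => { title: "A1",   links: [] }
}.freeze

# Hypothetical stand-in for "go to the page and parse it".
def fetch_page(url)
  SITE.fetch(url)
end

# Depth-first walk that returns a flat array of titles.
def collect_titles(url, visited = Set.new)
  # Set#add? returns nil when the URL is already in the set.
  return [] unless visited.add?(url)

  page = fetch_page(url)
  titles = [page[:title]]

  page[:links].each do |child_url|
    titles.concat(collect_titles(child_url, visited))
  end

  titles
end

titles = collect_titles("/")
p titles # prints ["Home", "A", "A1", "B"]
```

The same `visited` set is passed down through every call, so each page is processed only once no matter how many other pages link to it. On a very deep site the recursion could run into Ruby's stack limit (`SystemStackError`); in that case the same logic can be written as a loop over an explicit queue of pending URLs.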