Question

I have a file like this:

$urls = [
      {name:'Venture Capitals',
       sites: [
           'http://blog.ycombinator.com/posts.atom',
           'http://themacro.com/feed.xml',
           'http://a16z.com/feed/',
           'http://firstround.com/review/feed.xml',
           'http://www.kpcb.com/blog.rss',
           'https://library.gv.com/feed',
           'http://theaccelblog.squarespace.com/blog?format=RSS',
           'https://medium.com/feed/accel-insights',
           'http://500.co/blog/posts/feed/',
           'http://feeds.feedburner.com/upfrontinsights?format=xml',
           'http://versionone.vc/feed/',
           'http://nextviewventures.com/blog/feed/',
       ]},

      {name:'Companies and Groups',
       sites: [
           {name:'Product Companies',
            sites: [
              'https://m.signalvnoise.com/feed',
              'http://feeds.feedburner.com/insideintercom',
              'http://www.kickstarter.com/blog.atom',
              'http://blog.invisionapp.com/feed/',
              'http://feeds.feedburner.com/bufferapp',
              'https://open.buffer.com/feed/',
              'https://blog.asana.com/feed/',
              'http://blog.drift.com/rss.xml',
              'https://www.groovehq.com/blog/feed',]},
           {name:'Consulting Groups, Studios',
            sites: [
              'http://svpg.com/articles/rss',
              'http://www.thoughtworks.com/rss/insights.xml',
              'http://zurb.com/blog/rss',]},
           {name:'Communities',
            sites: [
              'http://alistapart.com/main/feed',
              'https://www.mindtheproduct.com/feed/',]},
       ]},


  ]

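I have organized $urls into different groups. Now I want to extract all of the URLs (the links under sites). How should I do that?

The main problem is that there are sites inside sites, as shown above.

My questions are:

Am I using an appropriate structure to store these links (arrays within arrays)? If not, what is a good way to store and group them?
How can I extract all of the URLs into a flat array, so that I can iterate over the list later?

I can do this manually, as in the code below.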

  require 'yaml' # needed for Array#to_yaml below

  sites = []
  $urls.each do |item|
    item[:sites].each do |sub_item|
      if sub_item.is_a?(Hash)
        sites.concat sub_item[:sites]
      else
        sites.append sub_item
      end
    end
  end

  File.open('lib/flatten_sites.yaml', 'w') { |fo| fo.puts sites.to_yaml }

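But I feel this is bad code.

Another option in this specific case would be to collect all of the sites attributes, but I feel that is also very limited and might not help in other cases.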

Answer 1

If you have a Hash, you can use this recursive method.

Input

urls = [
  {
    :name => 'Venture Capitals',
    :sites => [
      'http://blog.ycombinator.com/posts.atom',
      'http://themacro.com/feed.xml',
      'http://a16z.com/feed/',
      'http://firstround.com/review/feed.xml',
      'http://www.kpcb.com/blog.rss',
      'https://library.gv.com/feed',
      'http://theaccelblog.squarespace.com/blog?format=RSS',
      'https://medium.com/feed/accel-insights',
      'http://500.co/blog/posts/feed/',
      'http://feeds.feedburner.com/upfrontinsights?format=xml',
      'http://versionone.vc/feed/',
      'http://nextviewventures.com/blog/feed/',
    ]
  },
  {
    :name => 'Companies and Groups',
    :sites => [
      {
        :name => 'Product Companies',
        :sites => [
          'https://m.signalvnoise.com/feed',
          'http://feeds.feedburner.com/insideintercom',
          'http://www.kickstarter.com/blog.atom',
          'http://blog.invisionapp.com/feed/',
          'http://feeds.feedburner.com/bufferapp',
          'https://open.buffer.com/feed/',
          'https://blog.asana.com/feed/',
          'http://blog.drift.com/rss.xml',
          'https://www.groovehq.com/blog/feed',]
      },
      {
        :name => 'Consulting Groups, Studios',
        :sites => [
          'http://svpg.com/articles/rss',
          'http://www.thoughtworks.com/rss/insights.xml',
          'http://zurb.com/blog/rss',]
      },
      {
        :name => 'Communities',
        :sites => [
          'http://alistapart.com/main/feed',
          'https://www.mindtheproduct.com/feed/',]
      }
    ]
  }
]

Method

def get_all_sites(data)
  data[:sites].map { |r| Hash === r ? get_all_sites(r) : r }
end

urls.map { |r| get_all_sites(r) }.flatten

Output

[ "http://blog.ycombinator.com/posts.atom", "http://themacro.com/feed.xml", "http://a16z.com/feed/", "http://firstround.com/review/feed.xml", "http://www.kpcb.com/blog.rss", "https://library.gv.com/feed", "http://theaccelblog.squarespace.com/blog?format=RSS", "https://medium.com/feed/accel-insights", "http://500.co/blog/posts/feed/", "http://feeds.feedburner.com/upfrontinsights?format=xml", "http://versionone.vc/feed/", "http://nextviewventures.com/blog/feed/", "https://m.signalvnoise.com/feed", "http://feeds.feedburner.com/insideintercom", "http://www.kickstarter.com/blog.atom", "http://blog.invisionapp.com/feed/", "http://feeds.feedburner.com/bufferapp", "https://open.buffer.com/feed/", "https://blog.asana.com/feed/", "http://blog.drift.com/rss.xml", "https://www.groovehq.com/blog/feed", "http://svpg.com/articles/rss", "http://www.thoughtworks.com/rss/insights.xml", "http://zurb.com/blog/rss", "http://alistapart.com/main/feed", "https://www.mindtheproduct.com/feed/" ]
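Because the method calls itself whenever it meets a Hash, it should also cope with groups nested more deeply than in the question. The sketch below restates the method and runs it on a small made-up structure; the `sample` data and the example.com URLs are assumptions for illustration only, not part of the original input.

```ruby
# Same recursive method as in this answer, restated so the snippet runs on its own.
def get_all_sites(data)
  data[:sites].map { |r| Hash === r ? get_all_sites(r) : r }
end

# Hypothetical sample data with three levels of nesting (not from the question).
sample = [
  { name: 'Top', sites: [
    'http://example.com/a.xml',
    { name: 'Middle', sites: [
      'http://example.com/b.xml',
      { name: 'Bottom', sites: ['http://example.com/c.xml'] }
    ] }
  ] }
]

# map returns nested arrays that mirror the grouping; flatten removes all levels.
flat = sample.map { |group| get_all_sites(group) }.flatten
p flat
# => ["http://example.com/a.xml", "http://example.com/b.xml", "http://example.com/c.xml"]
```

Note that Array#flatten without an argument flattens every level, which is why the nested result of the recursion ends up as a single flat list.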

I hope this helps.

Answer 2

Similar to the solution proposed by Lukas Baliak, but using a more appropriate Proc instead of a redundant method (it works for any level of nesting):

deep_map = ->(data) do 
  data[:sites].flat_map { |r| r.is_a?(String) ? r : deep_map.(r) }
end
urls.flat_map(&deep_map)
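The question also worries that hard-coding the sites key is too limited. One possible generalisation, shown here only as a sketch, is to pass the target key as an argument and let the recursion handle both arrays and hashes. The helper name `collect_leaves` and the `sample` data are hypothetical and do not come from either answer.

```ruby
# Hypothetical helper: walks nested arrays and hashes and collects every
# non-container value found under `key`, at any depth.
def collect_leaves(node, key)
  case node
  when Hash
    # Descend only into the target key; a hash without it contributes nothing.
    collect_leaves(node.fetch(key, []), key)
  when Array
    node.flat_map { |child| collect_leaves(child, key) }
  else
    [node]
  end
end

# Made-up sample data shaped like the structure in the question.
sample = [
  { name: 'Group A', sites: ['http://example.com/a.xml'] },
  { name: 'Group B', sites: [
    { name: 'Sub B1', sites: ['http://example.com/b1.xml'] },
    { name: 'Empty group' }
  ] }
]

all_sites = collect_leaves(sample, :sites)
p all_sites
# => ["http://example.com/a.xml", "http://example.com/b1.xml"]
```

The top-level array can be passed in directly, so no separate outer flat_map is needed, and a group with no sites key is simply skipped rather than raising an error.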

Ruby: collecting the values of a target key from a nested hash into an array