Searching a nested hash with groups using the Hashie gem

Time: 2016-02-23 20:44:09

Tags: ruby hash

I have a PDF represented as a nested hash (an array of page hashes) in this format:

[ { :page => 1, 
    :lines => [
      { :y => 774.0,
        :text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
      },
      # ...
    ]
  },
  { :page => 2, 
    :lines => [
      { :y => 774.0,
        :text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
      },
      # ...
    ],
    # ...
  }
]

I want to get the :x and :y for a given :text from all 4 pages.

I tried:

require 'hashie'

coordinates.extend(Hashie::Extensions::DeepLocate)
@hash_array = Hash.new
@hash_array = coordinates.deep_locate -> (key, value, object) { key == :text && value == "XXXX" }

This gave me:

[ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" },
  { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" },
  { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" },
  { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]

But what I need is the :x and the :y of each match, and the result above does not include :y.

I will use these values for further validation.

1 answer:

Answer 0: (score: 0)

I don't know whether you would accept a solution that doesn't use Hashie, but this is how I would do it:

data = [
  { :page => 1, 
    :lines => [
      { :y => 774.0,
        :text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
      },
      # ...
    ]
  },
  { :page => 2, 
    :lines => [
      { :y => 774.0,
        :text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
      },
      # ...
    ],
    # ...
  }
]

SEARCH_TEXT = "XXXX"

coords = data.each_with_object([]) do |page, res|
  page[:lines].each do |line|
    line[:text_groups].each do |group|
      next unless group[:text] == SEARCH_TEXT
      res << { x: group[:x], y: line[:y] }
    end
  end
end

p coords
# => [ { :x => 18.0, :y => 774.0 },
#      { :x => 18.0, :y => 774.0 } ]
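If you need this in more than one place, the same traversal can be wrapped in a method. The following is only a sketch in plain Ruby: the method name find_coordinates, the sample variable, and the extra :page key in each result are my own additions for illustration, not something from the question. It also assumes that a page without :lines, or a line without :text_groups, should simply be skipped.

```ruby
# Hypothetical helper (the name is an assumption, not from the question):
# returns one hash per text group whose :text equals search_text.
def find_coordinates(pages, search_text)
  pages.flat_map do |page|
    # fetch with a default so a page without :lines yields no results
    page.fetch(:lines, []).flat_map do |line|
      line.fetch(:text_groups, [])
          .select { |group| group[:text] == search_text }
          # :y comes from the enclosing line, :x from the matching group;
          # :page is added here only to tell the matches apart
          .map { |group| { page: page[:page], x: group[:x], y: line[:y] } }
    end
  end
end

# Sample data in the same shape as the question, trimmed to two pages
sample = [
  { page: 1,
    lines: [{ y: 774.0,
              text_groups: [{ x: 18.0, width: 421.59599999999995, text: "XXXX" }] }] },
  { page: 2,
    lines: [{ y: 774.0,
              text_groups: [{ x: 18.0, width: 421.59599999999995, text: "XXXX" }] }] }
]

p find_coordinates(sample, "XXXX")
```

For the sample data this produces one entry per page, each carrying the page number, the :x of the matching group and the :y of its line. Searching only for the :text key cannot give you :y, because :y is stored on the parent line hash rather than on the text group itself, which is why the traversal keeps the line in scope.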