Ruby - Sorting hash values (strings) based on the order of an array

Time: 2016-10-19 04:05:33

Tags: arrays ruby

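I have an array of hashes in the format shown below, and I am trying to sort it by each hash's :book key according to a separate array. The order is not alphabetical, and for my use case it cannot be alphabetical.

I need to sort according to the following array: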

array = ['Matthew', 'Mark', 'Acts', '1John']

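Note that I have seen solutions that use Array#index to perform a custom sort (for example, Sorting an Array of hashes based on an Array of sorted values), but that does not work with strings.

I have tried various combinations of Array#sort and Array#sort_by, but they do not seem to accept a custom order. What am I missing? Thanks in advance for your help!

The array of hashes: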

[{:book=>"Matthew",
  :chapter=>"4",
  :section=>"new_testament"},
 {:book=>"Matthew",
  :chapter=>"22",
  :section=>"new_testament"},
 {:book=>"Mark",
  :chapter=>"6",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"9",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"17",
  :section=>"new_testament"}]

5 answers:

Answer 0 (score: 5)

Here is an example:

arr = [{a: 1}, {a: 3}, {a: 2}] 

order = [2,1,3]  

arr.sort { |a,b| order.index(a[:a]) <=> order.index(b[:a]) }                                           
# => [{:a=>2}, {:a=>1}, {:a=>3}]  

In your case, it would be:

order = ['Matthew', 'Mark', 'Acts', '1John']
# The hashes in the question are keyed by :book, so compare on that key.
result = list_of_hashes.sort do |a,b|
  order.index(a[:book]) <=> order.index(b[:book])
end

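There are two important concepts here:

  1. Using Array#index to find the position of an element in an array
  2. How the 'spaceship operator' <=> works with Array#sort - see What is the Ruby <=> (spaceship) operator?

You can make this a bit faster by building an index of the elements you are ordering by:

    # Map each element of order to its position, e.g. "Matthew" => 0.
    order_with_index = order.each.with_index.with_object({}) do |(elem, idx), memo|
      memo[elem] = idx
    end

Then, instead of order.index(<name>), use order_with_index[<name>].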
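A minimal sketch of how such a lookup table plugs into the sort, using a shortened version of the question's data. The names `books` and `sorted` are illustrative and not from the original answer, and `each_with_index.to_h` is just one of several equivalent ways to build the table.

```ruby
order = ['Matthew', 'Mark', 'Acts', '1John']

# Build the lookup once: book name => position in the desired order.
order_with_index = order.each_with_index.to_h

# `books` is a shortened stand-in for the question's array of hashes.
books = [
  { book: '1John',   chapter: '1' },
  { book: 'Acts',    chapter: '9' },
  { book: 'Matthew', chapter: '4' },
  { book: 'Mark',    chapter: '6' }
]

# Each comparison is now two hash lookups instead of two linear scans.
sorted = books.sort do |a, b|
  order_with_index[a[:book]] <=> order_with_index[b[:book]]
end

sorted.map { |h| h[:book] }
# => ["Matthew", "Mark", "Acts", "1John"]
```

With only four entries in `order` the difference is small; the table matters more as the list of names grows.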

Answer 1 (score: 3)

Since you know the desired order, there is no need to sort the array. Here is one way to do it. (I've called your array of hashes bible.)

bible.group_by { |h| h[:book] }.values_at(*array).flatten
  #=> [{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #    {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"},
  #    {:book=>"Mark", :chapter=>"6", :section=>"new_testament"},
  #    {:book=>"Acts", :chapter=>"9", :section=>"new_testament"},
  #    {:book=>"Acts", :chapter=>"17", :section=>"new_testament"},
  #    {:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #    {:book=>"1John", :chapter=>"1", :section=>"new_testament"}] 

Because Enumerable#group_by, Hash#values_at and Array#flatten each need only a single pass through the array bible, this may be faster than sorting when bible is large.

Here are the steps.

h = bible.group_by { |h| h[:book] }
  #=> {"Matthew"=>[{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #                {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"}],
  #    "Mark"   =>[{:book=>"Mark", :chapter=>"6", :section=>"new_testament"}],
  #    "1John"  =>[{:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #                {:book=>"1John", :chapter=>"1", :section=>"new_testament"}],
  #    "Acts"   =>[{:book=>"Acts", :chapter=>"9", :section=>"new_testament"}, 
  #                {:book=>"Acts", :chapter=>"17", :section=>"new_testament"}]
  #   } 

a = h.values_at(*array)
  #=> h.values_at('Matthew', 'Mark', 'Acts', '1John')
  #=> [[{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #     {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"}],
  #    [{:book=>"Mark", :chapter=>"6", :section=>"new_testament"}],
  #    [{:book=>"Acts", :chapter=>"9", :section=>"new_testament"},
  #     {:book=>"Acts", :chapter=>"17", :section=>"new_testament"}],
  #    [{:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #     {:book=>"1John", :chapter=>"1", :section=>"new_testament"}]] 

Finally, a.flatten returns the array shown earlier.

Let's run a benchmark.
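One caveat with the steps above: Hash#values_at returns nil for any key missing from the grouped hash, and Array#flatten keeps that nil. The sketch below uses a hypothetical input with no "Mark" entries and assumes that dropping the gap with compact is acceptable; the names `grouped`, `with_gap` and `result` are illustrative.

```ruby
order = ['Matthew', 'Mark', 'Acts', '1John']

# Hypothetical input in which no hash has :book => "Mark".
bible = [
  { book: 'Acts',    chapter: '9' },
  { book: 'Matthew', chapter: '4' },
  { book: '1John',   chapter: '1' },
  { book: 'Matthew', chapter: '22' }
]

grouped = bible.group_by { |h| h[:book] }

# values_at returns nil for "Mark", and flatten keeps that nil.
with_gap = grouped.values_at(*order).flatten

# compact drops the nil so only real hashes remain.
result = grouped.values_at(*order).flatten.compact

result.map { |h| [h[:book], h[:chapter]] }
# => [["Matthew", "4"], ["Matthew", "22"], ["Acts", "9"], ["1John", "1"]]
```

The reverse case also deserves attention: a book that appears in `bible` but not in `order` is silently left out of the result, because values_at only fetches the keys it is given.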

require 'fruity'

@bible = [
  {:book=>"Matthew",
   :chapter=>"4",
   :section=>"new_testament"},
  {:book=>"Matthew",
   :chapter=>"22",
   :section=>"new_testament"},
  {:book=>"Mark",
   :chapter=>"6",
   :section=>"new_testament"},
  {:book=>"1John",
   :chapter=>"1",
   :section=>"new_testament"},
  {:book=>"1John",
   :chapter=>"1",
   :section=>"new_testament"},
  {:book=>"Acts",
   :chapter=>"9",
   :section=>"new_testament"},
  {:book=>"Acts",
   :chapter=>"17",
   :section=>"new_testament"}]

@order = ['Matthew', 'Mark', 'Acts', '1John']

def bench_em(n)
  arr = (@bible*((n/@bible.size.to_f).ceil))[0,n].shuffle
  puts "arr contains #{n} elements"
  compare do 
    _sort       { arr.sort { |h1,h2| @order.index(h1[:book]) <=>
                  @order.index(h2[:book]) }.size }
    _sort_by    { arr.sort_by { |h| @order.find_index(h[:book]) }.size }
    _sort_by_with_hash {ord=@order.each.with_index.to_h;
                        arr.sort_by {|b| ord[b[:book]]}.size}    
    _values_at  { arr.group_by { |h| h[:book] }.values_at(*@order).flatten.size }
  end
end

@maxpleaner, @ChaitanyaKale and @Michael Kohl contributed _sort, _sort_by and _sort_by_with_hash, respectively.

bench_em    100
arr contains 100 elements
Running each test 128 times. Test will take about 1 second.
_sort_by is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _values_at
_values_at is faster than _sort by 2x ± 1.0

bench_em  1_000
arr contains 1000 elements
Running each test 16 times. Test will take about 1 second.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

bench_em 10_000
arr contains 10000 elements
Running each test once. Test will take about 1 second.
_values_at is faster than _sort_by_with_hash by 10.000000000000009% ± 10.0%
_sort_by_with_hash is faster than _sort_by by 10.000000000000009% ± 10.0%
_sort_by is faster than _sort by 2x ± 0.1

bench_em 100_000
arr contains 100000 elements
Running each test once. Test will take about 3 seconds.
_values_at is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

Here is a second round.

bench_em    100
arr contains 100 elements
Running each test 128 times. Test will take about 1 second.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

bench_em  1_000
arr contains 1000 elements
Running each test 8 times. Test will take about 1 second.
_values_at is faster than _sort_by_with_hash by 10.000000000000009% ± 10.0%
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2.2x ± 0.1

bench_em 10_000
arr contains 10000 elements
Running each test once. Test will take about 1 second.
_values_at is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2x ± 1.0

bench_em 100_000
arr contains 100000 elements
Running each test once. Test will take about 3 seconds.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1
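Timings are only meaningful if the candidates produce the same ordering. The following sketch checks that on a small sample; it is not part of the original benchmark, and the local names (`arr`, `by_sort`, and so on) are illustrative. Because Ruby does not guarantee a stable sort, only the sequence of book names is compared, not the full hashes.

```ruby
order = ['Matthew', 'Mark', 'Acts', '1John']

# Small shuffled sample standing in for the benchmark's arr.
arr = [
  { book: 'Acts',    chapter: '17' },
  { book: 'Matthew', chapter: '4' },
  { book: '1John',   chapter: '1' },
  { book: 'Mark',    chapter: '6' },
  { book: 'Matthew', chapter: '22' },
  { book: 'Acts',    chapter: '9' }
].shuffle

# The four approaches compared in the benchmark.
by_sort      = arr.sort { |h1, h2| order.index(h1[:book]) <=> order.index(h2[:book]) }
by_sort_by   = arr.sort_by { |h| order.find_index(h[:book]) }
ord          = order.each.with_index.to_h
by_hash      = arr.sort_by { |h| ord[h[:book]] }
by_values_at = arr.group_by { |h| h[:book] }.values_at(*order).flatten

# Reduce each result to its sequence of book names and compare.
sequences = [by_sort, by_sort_by, by_hash, by_values_at].map do |result|
  result.map { |h| h[:book] }
end
all_agree = sequences.uniq.size == 1
```

Here `all_agree` should be true for any shuffle, since every approach groups the hashes by the same book order.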

Answer 2 (score: 3)

As the documentation shows, Array#index does work with strings (they are even used in the example given there), so this works:

books.sort_by { |b| array.index(b[:book]) }

However, rather than searching array repeatedly, you can determine the order once and then just look it up:

order = array.each.with_index.to_h
#=> { "Matthew" => 0, "Mark" => 1, "Acts" => 2, "1John" => 3 }
books.sort_by { |b| order[b[:book]] }
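One thing to watch with the lookup hash: order[b[:book]] returns nil for a book that is not listed in array, and sort_by raises an ArgumentError when it has to compare nil with an Integer. The sketch below assumes unknown books should go to the end; the "Jude" entry and the choice of order.size as the fallback position are illustrative, not from the original answer.

```ruby
array = ['Matthew', 'Mark', 'Acts', '1John']
order = array.each.with_index.to_h

# Hypothetical input: "Jude" does not appear in array.
books = [
  { book: 'Jude',    chapter: '1' },
  { book: 'Acts',    chapter: '9' },
  { book: 'Matthew', chapter: '4' }
]

# fetch with a default places unknown books after every known one
# instead of handing nil to the comparison.
sorted = books.sort_by { |b| order.fetch(b[:book], order.size) }

sorted.map { |b| b[:book] }
# => ["Matthew", "Acts", "Jude"]
```

If an unknown book should be treated as an error instead, calling order.fetch(b[:book]) without a default raises KeyError and names the offending key.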

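Answer 3 (score: 2)

As its documentation describes, Array#sort_by takes a block. The block should return the value to sort by for each element (unlike the block given to Array#sort, which should return -1, 0, or +1 depending on the comparison between a and b). You can use find_index on array to produce that value.

array_of_hashes.sort_by {|a| array.find_index(a[:book]) } should do the trick.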
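To make the two block styles concrete: the block given to sort receives two elements and returns -1, 0, or +1, while the block given to sort_by receives one element and returns the key to compare on. A small sketch with shortened data; the names `via_sort` and `via_sort_by` are illustrative.

```ruby
array = ['Matthew', 'Mark', 'Acts', '1John']

array_of_hashes = [
  { book: 'Acts',    chapter: '9' },
  { book: '1John',   chapter: '1' },
  { book: 'Matthew', chapter: '4' },
  { book: 'Mark',    chapter: '6' }
]

# sort: the block compares two elements and returns -1, 0, or +1.
via_sort = array_of_hashes.sort do |a, b|
  array.find_index(a[:book]) <=> array.find_index(b[:book])
end

# sort_by: the block maps one element to its sort key.
via_sort_by = array_of_hashes.sort_by { |a| array.find_index(a[:book]) }

via_sort == via_sort_by
# => true
```

Both forms give the same result here because every book appears exactly once, so there are no ties for an unstable sort to reorder.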

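Answer 4 (score: 0)

Your mistake is thinking that you are sorting. In fact you are not: the order already exists, and you only need to put the elements in place. I am not proposing a compact or optimal solution, just a simple one. First convert the big array into a hash indexed by the :book key (which should have been your data structure in the first place), then simply map over the order array: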

array = ['Matthew', 'Mark', 'Acts', '1John']
elements = [{:book=>"Matthew",
  :chapter=>"4",
  :section=>"new_testament"},
 {:book=>"Matthew",
  :chapter=>"22",
  :section=>"new_testament"},
 {:book=>"Mark",
  :chapter=>"6",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"9",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"17",
  :section=>"new_testament"}]
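# Collect every hash under its book name; assigning by_name[e[:book]] = e
# directly would keep only the last hash for each book.
by_name = Hash.new { |hash, key| hash[key] = [] }
elements.each do |e|
  by_name[e[:book]] << e
end
# Walk the desired order and gather each book's hashes into one flat array.
final = array.flat_map { |x| by_name[x] }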