Ruby idiom for sorting on two fields

Time: 2014-10-22 17:15:59

Tags: ruby sorting

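I need the Ruby idiom for sorting on two fields. In Python, if you sort a list of two-element tuples, the sort is based on the first element, and if two first elements are equal, it falls back to the second element.

One example is the following sorting code in Python (it sorts words from longest to shortest and uses the second element to break ties), from http://www.pythonlearn.com/html-008/cfbook011.html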

txt = 'but soft what light in yonder window breaks'
words = txt.split()
t = list()
for word in words:
    t.append((len(word), word))

t.sort(reverse=True)

res = list()
for length, word in t:
    res.append(word)

print res

What I came up with in Ruby is the following code, which uses a Struct:

txt = 'but soft what light in yonder window breaks'
words = txt.split()
t = []

tuple = Struct.new(:len, :word)
for word in words
    tpl = tuple.new
    tpl.len = word.length
    tpl.word =  word
    t << tpl
end

t = t.sort {|a, b| a[:len] == b[:len] ?
    b[:word] <=> a[:word] : b[:len] <=> a[:len]
    }

res = []
for x in t
    res << x.word
end

puts res

I would like to know whether there is a better way (with less code) to achieve this stable sort.

3 answers:

Answer 0: (score: 5)

I think you're overthinking this.

txt = 'but soft what light in yonder window breaks'

lengths_words = txt.split.map {|word| [ word.size, word ] }
# => [ [ 3, "but" ], [ 4, "soft" ], [ 4, "what" ], [ 5, "light" ], ... ]

sorted = lengths_words.sort
# => [ [ 2, "in" ], [ 3, "but" ], [ 4, "soft" ], [ 4, "what" ], ... ]
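
The plain sort above works because Ruby arrays, like Python tuples, compare element by element through Array#<=>. A minimal sketch of that behavior (the variable names are only illustrative):

```ruby
# Array#<=> walks both arrays in parallel and stops at the first difference.
first_differs = [3, "but"] <=> [4, "soft"]   # lengths differ, words are never compared
tie_on_length = [4, "soft"] <=> [4, "what"]  # lengths tie, so the words decide
identical     = [4, "soft"] <=> [4, "soft"]  # every element is equal

p first_differs  # -1
p tie_on_length  # -1
p identical      # 0
```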

If you really want to use a Struct, you can:

tuple = Struct.new(:length, :word)

tuples = txt.split.map {|word| tuple.new(word.size, word) }
# => [ #<struct length=3, word="but">, #<struct length=4, word="soft">, ... ]

sorted = tuples.sort_by {|tuple| [ tuple.length, tuple.word ] }
# => [ #<struct length=2, word="in">, #<struct length=3, word="but">, ... ]

This is equivalent to:

sorted = tuples.sort {|tuple, other| tuple.length == other.length ?
                                       tuple.word <=> other.word : tuple.length <=> other.length }

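(Note that this time it's sort, not sort_by.)

...but since we're using a Struct, we can make this nicer by defining our own comparison operator (<=>), which sort will call (this works in any Ruby class):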
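
As a quick check of that equivalence, the following sketch runs both forms on the same data and compares the results. The name pair is a placeholder chosen for this example; it is not from the original answer.

```ruby
txt = 'but soft what light in yonder window breaks'
pair = Struct.new(:length, :word)  # placeholder name for this sketch
pairs = txt.split.map { |word| pair.new(word.size, word) }

# sort_by builds one key per element and compares the keys with Array#<=>
with_sort_by = pairs.sort_by { |item| [item.length, item.word] }

# sort receives two elements and must return -1, 0 or 1 itself
with_sort = pairs.sort do |a, b|
  a.length == b.length ? a.word <=> b.word : a.length <=> b.length
end

puts with_sort_by == with_sort            # true
puts with_sort_by.map(&:word).join(" ")   # in but soft what light breaks window yonder
```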

tuple = Struct.new(:length, :word) do
  def <=>(other)
    [ length, word ] <=> [ other.length, other.word ]
  end
end

tuples = txt.split.map {|word| tuple.new(word.size, word) }
tuples.sort
# => [ #<struct length=2, word="in">, #<struct length=3, word="but">, ... ]
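
To illustrate that this is not specific to Struct, here is a sketch with an ordinary class. WordEntry is a hypothetical class invented for this example; the only thing sort needs from it is the <=> method.

```ruby
# Hypothetical plain class (not a Struct): sort only requires <=>.
class WordEntry
  attr_reader :length, :word

  def initialize(word)
    @word = word
    @length = word.size
  end

  # Compare by length first, then by the word itself.
  def <=>(other)
    [length, word] <=> [other.length, other.word]
  end
end

entries = 'but soft what light in yonder window breaks'.split.map { |w| WordEntry.new(w) }
sorted_words = entries.sort.map(&:word)
p sorted_words
# => ["in", "but", "soft", "what", "light", "breaks", "window", "yonder"]
```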

There are other options for more complex sorts. If you want the longest words first, for example:

lengths_words = txt.split.map {|word| [ word.size, word ] }
sorted = lengths_words.sort_by {|length, word| [ -length, word ] }
# => [ [ 6, "breaks" ], [ 6, "window" ], [ 6, "yonder" ], [ 5, "light" ], ... ]

Or:

tuple = Struct.new(:length, :word) do
  def <=>(other)
    [ -length, word ] <=> [ -other.length, other.word ]
  end
end

txt.split.map {|word| tuple.new(word.size, word) }.sort
# => [ #<struct length=6, word="breaks">, #<struct length=6, word="window">, #<struct length=6, word="yonder">, ... ]
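
Negation only works for the numeric field, so in the examples above words of equal length still come out in ascending order. If both fields should be descending, as in the question's code, one option (a sketch, not the only way) is to swap the operands of <=>:

```ruby
txt = 'but soft what light in yonder window breaks'
words = txt.split

# Negating the length flips only the numeric field; ties stay alphabetical.
longest_first = words.sort_by { |word| [-word.size, word] }

# Strings cannot be negated, so swap a and b to reverse both fields.
both_descending = words.sort { |a, b| [b.size, b] <=> [a.size, a] }

p longest_first
# => ["breaks", "window", "yonder", "light", "soft", "what", "but", "in"]
p both_descending
# => ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]
```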

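As you can see, I'm relying heavily on Ruby's built-in ability to sort arrays by their contents, but you can also "roll your own" if you prefer, which may perform better with a large number of items. Here is a comparison method that is equivalent to your t.sort {|a, b| a[:len] == b[:len] ? ... } code, except that it sorts in ascending order (plus a bonus to_s method):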

tuple = Struct.new(:length, :word) do
  def <=>(other)
    return word <=> other.word if length == other.length
    length <=> other.length
  end

  def to_s
    "#{word} (#{length})"
  end
end

sorted = txt.split.map {|word| tuple.new(word.size, word) }.sort
puts sorted.join(", ")
# => in (2), but (3), soft (4), what (4), light (5), breaks (6), window (6), yonder (6)

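Finally, a few comments on your Ruby style:

  1. You almost never see for in idiomatic Ruby code. each is the idiomatic way to do almost all iteration in Ruby, and "functional" methods such as map, reduce, and select are also common. Never for.

  2. A big advantage of Struct is that you get accessor methods for each attribute, so you can write tuple.word instead of tuple[:word].

  3. Methods with no arguments are called without parentheses: txt.split.map, not txt.split().map.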
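
Applied to the code in the question, the three points above might look like the following sketch. The name entry is a placeholder, and the comparison keeps the question's descending order on both fields.

```ruby
txt = 'but soft what light in yonder window breaks'
entry = Struct.new(:len, :word)  # placeholder name; same fields as in the question

# Points 1 and 3: map replaces the for loop, and split is called without parentheses.
entries = txt.split.map { |word| entry.new(word.length, word) }

# Point 2: accessors (a.len, a.word) instead of a[:len], a[:word].
# Swapping a and b gives descending order on both fields.
sorted = entries.sort { |a, b| [b.len, b.word] <=> [a.len, a.word] }

res = sorted.map(&:word)
res.each { |word| puts word }
```

This prints yonder, window, breaks, light, what, soft, but, in, one word per line.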

Answer 1: (score: 1)

Ruby makes this easy with Enumerable#sort_by and Array#<=>.

def sort_on_two(arr, &proc)
  arr.sort_by { |e| [proc[e], e] }.reverse
end

txt = 'but soft what light in yonder window breaks'

sort_on_two(txt.split) { |e| e.size }
  #=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

sort_on_two(txt.split) { |e| e.count('aeiou') }
  #=> ["yonder", "window", "breaks", "what", "soft", "light", "in", "but"]

sort_on_two(txt.split) { |e| [e.count('aeiou'), e.size] }
  #=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

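Note that in recent versions of Ruby, proc.call(e) can also be written as proc[e], proc.yield(e), or proc.(e).

Answer 2: (score: 0)

Update: my first answer was wrong (this time!), thanks to @mu is too short for the comment.

Your code does work for sorting on two criteria, but if you just want the same result, the best approach is the following: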
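
A small sketch of those four call forms side by side (by_size is a name made up for this example):

```ruby
by_size = proc { |e| e.size }

results = [
  by_size.call("soft"),   # explicit call
  by_size["soft"],        # square-bracket form
  by_size.yield("soft"),  # yield form
  by_size.("soft")        # shorthand for call
]

p results  # [4, 4, 4, 4]
```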

txt.split.sort_by{|a| [a.size,a] }.reverse
=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

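The first comparison is made on the sizes, and if the result is zero (the sizes are equal), the second element is compared....

If you really want to keep your data structure, the same principle applies: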

t.sort_by{ |a| [a[:len],a[:word]] }.reverse
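
To check that this produces the same result as the sort block in the question, here is a self-contained sketch. The Struct is rebuilt under the placeholder name entry so the snippet runs on its own.

```ruby
txt = 'but soft what light in yonder window breaks'
entry = Struct.new(:len, :word)  # placeholder name; same fields as in the question
t = txt.split.map { |word| entry.new(word.length, word) }

# The comparison block from the question (descending on both fields)
original = t.sort { |a, b| a[:len] == b[:len] ? b[:word] <=> a[:word] : b[:len] <=> a[:len] }

# The shorter form: sort ascending on [len, word], then reverse everything
shorter = t.sort_by { |a| [a[:len], a[:word]] }.reverse

puts original == shorter        # true
p shorter.map(&:word)
# => ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]
```

Because reverse flips the whole ordering, words of equal length also end up in reverse alphabetical order, which matches what the question's block does.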