Question

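I need a Ruby idiom for sorting on two fields. In Python, if you sort a list of two-element tuples, the sort is based on the first element, and if two first elements are equal, it falls back to the second element.

One example is the following Python sorting code (it sorts words from longest to shortest and uses the second element to break ties), from http://www.pythonlearn.com/html-008/cfbook011.html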

txt = 'but soft what light in yonder window breaks'
words = txt.split()
t = list()
for word in words:
    t.append((len(word), word))

t.sort(reverse=True)

res = list()
for length, word in t:
    res.append(word)

print(res)

What I came up with in Ruby is the following code, which uses a Struct:

txt = 'but soft what light in yonder window breaks'
words = txt.split()
t = []

tuple = Struct.new(:len, :word)
for word in words
    tpl = tuple.new
    tpl.len = word.length
    tpl.word =  word
    t << tpl
end

t = t.sort {|a, b| a[:len] == b[:len] ?
    b[:word] <=> a[:word] : b[:len] <=> a[:len]
    }

res = []
for x in t
    res << x.word
end

puts res

I would like to know whether there is a better way (with less code) to achieve this stable sort.

Answer 1

I think you're overthinking this.

txt = 'but soft what light in yonder window breaks'

lengths_words = txt.split.map {|word| [ word.size, word ] }
# => [ [ 3, "but" ], [ 4, "soft" ], [ 4, "what" ], [ 5, "light" ], ... ]

sorted = lengths_words.sort
# => [ [ 2, "in" ], [ 3, "but" ], [ 4, "soft" ], [ 4, "what" ], ... ]
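To finish the job the way the Python version does, the sorted pairs still have to be reversed and reduced to bare words. The following is only a minimal sketch, assuming the same txt as above; the names pairs and words are chosen for illustration.

```ruby
txt = 'but soft what light in yonder window breaks'

# Build [length, word] pairs. Arrays compare element by element,
# so a plain sort orders by length first and then by word.
pairs = txt.split.map { |word| [word.size, word] }

# Reverse for longest-first (like Python's sort(reverse=True)),
# then keep only the word from each pair.
words = pairs.sort.reverse.map(&:last)
# => ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

puts words
```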

If you really want to use a Struct, you can:

tuple = Struct.new(:length, :word)

tuples = txt.split.map {|word| tuple.new(word.size, word) }
# => [ #<struct length=3, word="but">, #<struct length=4, word="soft">, ... ]

sorted = tuples.sort_by {|tuple| [ tuple.length, tuple.word ] }
# => [ #<struct length=2, word="in">, #<struct length=3, word="but">, ... ]

This is equivalent to:

sorted = tuples.sort {|tuple, other| tuple.length == other.length ?
                                       tuple.word <=> other.word : tuple.length <=> other.length }

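(Note that this time it's sort, not sort_by.)

...but since we're using a Struct, we can make this nicer by defining our own comparison operator (<=>), which sort will call (the same technique works in any Ruby class):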

tuple = Struct.new(:length, :word) do
  def <=>(other)
    [ length, word ] <=> [ other.length, other.word ]
  end
end

tuples = txt.split.map {|word| tuple.new(word.size, word) }
tuples.sort
# => [ #<struct length=2, word="in">, #<struct length=3, word="but">, ... ]
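Once <=> is defined, other built-in methods can use it too. As a sketch (the include Comparable line is an addition for illustration, not part of the code above), mixing in Comparable also provides operators such as <, while Enumerable#min and Enumerable#max already rely on <=> alone:

```ruby
txt = 'but soft what light in yonder window breaks'

# Hypothetical variant of the struct above that also mixes in Comparable.
tuple = Struct.new(:length, :word) do
  include Comparable

  def <=>(other)
    [length, word] <=> [other.length, other.word]
  end
end

tuples = txt.split.map { |word| tuple.new(word.size, word) }

shortest = tuples.min   # Enumerable#min uses <=>
longest  = tuples.max   # Enumerable#max uses <=>
soft_before_what = tuple.new(4, 'soft') < tuple.new(4, 'what')   # < comes from Comparable

puts shortest.word      # in
puts longest.word       # yonder
puts soft_before_what   # true
```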

There are other options for more complex sorts. If you want the longest words first, for example:

lengths_words = txt.split.map {|word| [ word.size, word ] }
sorted = lengths_words.sort_by {|length, word| [ -length, word ] }
# => [ [ 6, "breaks" ], [ 6, "window" ], [ 6, "yonder" ], [ 5, "light" ], ... ]

Or:

tuple = Struct.new(:length, :word) do
  def <=>(other)
    [ -length, word ] <=> [ -other.length, other.word ]
  end
end

txt.split.map {|word| tuple.new(word.size, word) }.sort
# => [ #<struct length=6, word="breaks">, #<struct length=6, word="window">, #<struct length=6, word="yonder">, ... ]
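Note that negating the length reverses only the numeric field: ties are still broken alphabetically, whereas the code in the question sorts both fields in descending order. A sketch of both variants follows, assuming plain words rather than structs; the names mixed and descending are illustrative.

```ruby
txt = 'but soft what light in yonder window breaks'
words = txt.split

# Mixed order: longest first, ties broken alphabetically (A to Z).
mixed = words.sort_by { |word| [-word.size, word] }

# Fully descending: swapping a and b reverses both fields at once,
# which also works when a key is a string and cannot be negated.
descending = words.sort { |a, b| [b.size, b] <=> [a.size, a] }

puts mixed.inspect
# ["breaks", "window", "yonder", "light", "soft", "what", "but", "in"]
puts descending.inspect
# ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]
```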

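As you can see, I'm relying heavily on Ruby's built-in ability to sort arrays by their contents, but you can also "roll your own" if you prefer, which might perform better with a very large number of items. Here is a comparison method that mirrors your t.sort {|a, b| a[:len] == b[:len] ? ... } code, in ascending order (plus a bonus to_s method):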

tuple = Struct.new(:length, :word) do
  def <=>(other)
    return word <=> other.word if length == other.length
    length <=> other.length
  end

  def to_s
    "#{word} (#{length})"
  end
end

sorted = txt.split.map {|word| tuple.new(word.size, word) }.sort
puts sorted.join(", ")
# => in (2), but (3), soft (4), what (4), light (5), breaks (6), window (6), yonder (6)
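Whether a hand-written comparison is actually faster depends on the data and the Ruby version, so it is worth measuring. The following is a rough sketch using Benchmark from the standard library. The word list is synthetic and made up for illustration, and it compares sort_by with array keys against sort with a hand-written block rather than the Struct version above; no particular timing is implied.

```ruby
require 'benchmark'

# Synthetic data, purely for illustration: 20,000 pseudo-random lowercase words.
rng = Random.new(42)
words = Array.new(20_000) do
  Array.new(rng.rand(1..8)) { ('a'.ord + rng.rand(26)).chr }.join
end

by_arrays = nil
by_hand = nil

# Array keys: builds one small array per word, then compares arrays.
array_time = Benchmark.realtime do
  by_arrays = words.sort_by { |word| [word.size, word] }
end

# Hand-written comparison: no intermediate arrays.
hand_time = Benchmark.realtime do
  by_hand = words.sort do |a, b|
    a.size == b.size ? a <=> b : a.size <=> b.size
  end
end

puts format('sort_by with arrays: %.4fs', array_time)
puts format('sort with a block:   %.4fs', hand_time)
puts by_arrays == by_hand   # both approaches produce the same order
```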

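Finally, a few comments on your Ruby style:

You almost never see for in idiomatic Ruby code. each is the idiomatic way to do almost all iteration in Ruby, and "functional" methods such as map, reduce, and select are also common. Never for.
A big advantage of Struct is that you get accessor methods for each attribute, so you can write tuple.word instead of tuple[:word].
Methods with no arguments are called without parentheses: txt.split.map, not txt.split().map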
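Applied to the code in the question, those three points might look like the sketch below. It keeps the original :len and :word fields and the fully descending order; the constant name Pair and the variable names are illustrative only.

```ruby
# Hypothetical name for the struct; the fields match the question's code.
Pair = Struct.new(:len, :word)

txt = 'but soft what light in yonder window breaks'

# map replaces the for loop and the manual << appends.
pairs = txt.split.map { |word| Pair.new(word.length, word) }

# Accessor methods (pair.len, pair.word) replace pair[:len] and pair[:word].
# Sorting ascending and reversing gives the same fully descending order.
sorted = pairs.sort_by { |pair| [pair.len, pair.word] }.reverse

result = sorted.map(&:word)
puts result
```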

Answer 2

Ruby makes this easy by combining Enumerable#sort_by with Array#<=> for the comparison.

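# Sorts by the block's value first and by the element itself second,
# then reverses so that the largest keys come first.
def sort_on_two(arr, &proc)
  arr.sort_by { |e| [proc[e], e] }.reverse
end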

txt = 'but soft what light in yonder window breaks'

sort_on_two(txt.split) { |e| e.size }
  #=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

sort_on_two(txt.split) { |e| e.count('aeiou') }
  #=> ["yonder", "window", "breaks", "what", "soft", "light", "in", "but"]

sort_on_two(txt.split) { |e| [e.count('aeiou'), e.size] }
  #=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

Note that in recent versions of Ruby, proc.call(e) can also be written as proc[e], proc.yield(e), or proc.(e).
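A quick sketch showing that the four spellings behave the same; length_of is a hypothetical proc made up for this illustration.

```ruby
length_of = proc { |e| e.size }   # hypothetical proc, for illustration only

results = [
  length_of.call('yonder'),    # explicit call
  length_of['yonder'],         # square-bracket form
  length_of.yield('yonder'),   # Proc#yield
  length_of.('yonder')         # dot-parentheses shorthand for call
]

puts results.inspect   # [6, 6, 6, 6]
```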

Answer 3

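Update: my first answer was wrong (this time!), thanks to @mu is too short for the comment.

Your code works for sorting on two criteria, but if you just want the same result, the best approach is the following: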

txt.split.sort_by { |a| [a.size, a] }.reverse
# => ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]

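The first comparison uses the size; if that comparison returns zero (a tie), the second element, the word itself, is compared.

If you really want to keep your data structure, the same principle applies: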
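The element-by-element behaviour of Array#<=> can be checked directly. A small sketch, with variable names chosen only for illustration:

```ruby
# The sizes differ, so the words are never compared.
first_differs = [3, 'but'] <=> [4, 'soft']

# The sizes tie, so the result comes from comparing the words.
tie_on_size = [4, 'what'] <=> [4, 'soft']

# Every element is equal.
identical = [4, 'soft'] <=> [4, 'soft']

puts first_differs   # -1
puts tie_on_size     # 1
puts identical       # 0
```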

t.sort_by{ |a| [a[:len],a[:word]] }.reverse

Ruby idiom for sorting on two fields
