Sorting a list of Doubles containing Double.NaN with a custom comparator

Asked: 2018-12-04 17:14:42

Tags: scala list sorting apache-spark comparator

I have the following list:

scala>  List(Double.NaN, 0.0, 99.9, 34.2, 10.98, 7.0, 6.0, Double.NaN, 5.0, 2.0, 0.56, Double.NaN, 0.0, 10.0)
res0: List[Double] = List(NaN, 0.0, 99.9, 34.2, 10.98, 7.0, 6.0, NaN, 5.0, 2.0, 0.56, NaN, 0.0, 10.0)

This is my comparator function:

scala> def sortAscendingDouble(d1:Double, d2:Double) = {
     | if(d1.isNaN && !d2.isNaN)
     | d1 < d2
     | else if(!d1.isNaN && d2.isNaN)
     | d2 < d1
     | else d1< d2
     | }
sortAscendingDouble: (d1: Double, d2: Double)Boolean

I am trying to use it with sortWith as follows:

scala> res0.sortWith((d1, d2)=> sortAscendingDouble(d1, d2))
res1: List[Double] = List(NaN, 0.0, 0.0, 0.56, 2.0, 5.0, 6.0, 7.0, 10.0, 10.98, 34.2, 99.9, NaN, NaN)

I don't understand why the first NaN does not end up at the end of the list.

My expected output for the list sorted in ascending order is:

List(0.0, 0.0, 0.56, 2.0, 5.0, 6.0, 7.0, 10.0, 10.98, 34.2, 99.9, NaN, NaN, NaN)

My expected output for the list sorted in descending order is:

List(99.9, 34.2, 10.98, 10.0, 7.0, 6.0, 5.0, 2.0, 0.56, 0.0, 0.0, NaN, NaN, NaN)

For both ascending and descending order, I want the NaN values to appear at the end.

I know that sortWith lets us write our own comparator. Can someone help me with this?

1 Answer:

Answer 0: (score: 3)

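The problem is that comparing any number (including NaN itself) with NaN always returns false. Your second branch is therefore wrong: when d1 is a number and d2 is NaN, d2 < d1 evaluates to false, but the function must return true there so that d1 is placed before the NaN. Your first branch only returns the correct value (false) by accident, for the same reason. You can fix this by returning a fixed value in each of the special cases instead of comparing against NaN.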
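To make the point above concrete, here is a minimal sketch that evaluates a few comparisons involving NaN and then calls a copy of the comparator from the question. The object name NanDemo and the value names are illustrative only and do not appear in the original post.

```scala
object NanDemo {
  val nan: Double = Double.NaN

  // Under IEEE 754, every ordered comparison that involves NaN is false.
  val nanLessThanOne: Boolean = nan < 1.0
  val oneLessThanNan: Boolean = 1.0 < nan
  val nanLessThanNan: Boolean = nan < nan

  // Copy of the comparator from the question: every branch ends in `<`,
  // so it can never report that a number comes before NaN.
  def original(d1: Double, d2: Double): Boolean =
    if (d1.isNaN && !d2.isNaN) d1 < d2
    else if (!d1.isNaN && d2.isNaN) d2 < d1
    else d1 < d2

  val numberBeforeNan: Boolean = original(1.0, nan) // should be true, but is false
  val nanBeforeNumber: Boolean = original(nan, 1.0) // false, which happens to be right
}
```

Because the comparator answers false in both directions, the sort treats NaN as equal to every other element, so where each NaN ends up depends on how the sort algorithm happens to visit the elements rather than on the intended order.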

/** Returns true if d1 must come strictly before d2 in ascending order, with NaN last. */
def sortAscendingDouble(d1: Double, d2: Double): Boolean =
  if (d1.isNaN && d2.isNaN)
    false // Two NaNs count as equal, so neither comes before the other.
  else if (d1.isNaN && !d2.isNaN)
    false // NaN always goes after any non-NaN double.
  else if (!d1.isNaN && d2.isNaN)
    true // Any non-NaN double goes before NaN.
  else
    d1 < d2 // Standard strict comparison. Equal values yield false both ways.

/** Returns true if d1 must come strictly before d2 in descending order, with NaN last. */
def sortDescendingDouble(d1: Double, d2: Double): Boolean =
  if (d1.isNaN && d2.isNaN)
    false // Two NaNs count as equal, so neither comes before the other.
  else if (d1.isNaN && !d2.isNaN)
    false // NaN always goes after any non-NaN double.
  else if (!d1.isNaN && d2.isNaN)
    true // Any non-NaN double goes before NaN.
  else
    d1 > d2 // Standard strict comparison. Equal values yield false both ways.

list.sortWith(sortAscendingDouble)
// List[Double] = List(0.0, 0.0, 0.56, 2.0, 5.0, 6.0, 7.0, 10.0, 10.98, 34.2, 99.9, NaN, NaN, NaN)

list.sortWith(sortDescendingDouble)
// List[Double] = List(99.9, 34.2, 10.98, 10.0, 7.0, 6.0, 5.0, 2.0, 0.56, 0.0, 0.0, NaN, NaN, NaN)
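Since the two functions differ only in their last branch, the NaN handling could be factored out. The following is a sketch of one possible way to do that, not part of the original answer; nanLast and NanLastSort are hypothetical names introduced here, and the input list is the one from the question.

```scala
object NanLastSort {
  /** Hypothetical helper: wraps a strict "comes before" test so that NaN sorts last.
    * `lt` is only consulted when both arguments are ordinary numbers. */
  def nanLast(lt: (Double, Double) => Boolean): (Double, Double) => Boolean =
    (d1, d2) =>
      if (d1.isNaN) false      // NaN never comes before anything, including another NaN
      else if (d2.isNaN) true  // any non-NaN value comes before NaN
      else lt(d1, d2)          // ordinary comparison for two numbers

  val input: List[Double] =
    List(Double.NaN, 0.0, 99.9, 34.2, 10.98, 7.0, 6.0, Double.NaN, 5.0, 2.0, 0.56, Double.NaN, 0.0, 10.0)

  val ascending: List[Double]  = input.sortWith(nanLast(_ < _))
  val descending: List[Double] = input.sortWith(nanLast(_ > _))
}
```

When checking the results, note that NaN is not equal to itself, so comparing a sorted list that contains NaN against an expected list with == is likely to fail; it is safer to compare the numeric prefix and test the tail with isNaN.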