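Split a Scala Range into evenly sized contiguous sub-ranges

Date: 2017-01-17 21:52:59

Tags: scala range

If I have a range, how can I split it into a sequence of contiguous sub-ranges, where the number of sub-ranges (buckets) is specified? If there are not enough items, empty buckets should be omitted.

For example: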

splitRange(1 to 6, 3) == Seq(1 to 2, 3 to 4, 5 to 6)
splitRange(1 to 2, 3) == Seq(1 to 1, 2 to 2)

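Some additional constraints, which rule out some of the solutions I have seen:

  1. Roughly uniform bucket sizes: bucket sizes should differ by at most 1.
  2. The input range can sometimes be very long, so the range must not be materialized as a sequence (e.g. grouped cannot be used).
  3. This also means we cannot assign numbers to buckets round-robin, because the numbers in each bucket would not be contiguous and so would not form a range.
  4. Ideally, the sub-ranges are generated in order, i.e. (1,2)(3,4) rather than (3,4)(1,2).
  5. A colleague found a solution here: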

    def splitRange(r: Range, chunks: Int): Seq[Range] = {
      if (r.step != 1) 
          throw new IllegalArgumentException("Range must have step size equal to 1")
    
      val nchunks = scala.math.max(chunks, 1)
      val chunkSize = scala.math.max(r.length / nchunks, 1)
      val starts = r.by(chunkSize).take(nchunks)
      val ends = starts.map(_ - 1).drop(1) :+ r.end
      starts.zip(ends).map(x => x._1 to x._2)
    }
    

    However, this produces very uneven bucket sizes when N is small, for example:

    splitRange(1 to 14, 5)                          
    //> Vector(Range(1, 2), Range(3, 4), Range(5, 6),
    //|        Range(7, 8), Range(9, 10, 11, 12, 13, 14))
                                  ^^^^^^^^^^^^^^^^^^^^^
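
The constraints above can be written down as a small check, which is handy when comparing candidate solutions. This is only a sketch: the helper name isBalancedSplit is made up for illustration, and it assumes a non-empty input range and a step of 1 throughout.

```scala
// Hypothetical helper (not part of the original question).
// Returns true if the buckets are non-empty, contiguous, in order,
// cover exactly the input range, and differ in size by at most 1.
// Assumes r is non-empty and every range has step 1.
def isBalancedSplit(r: Range, parts: Seq[Range]): Boolean = {
  val sizes = parts.map(_.length)
  // Each bucket must start right after the previous one ends
  def contiguous = parts.zip(parts.drop(1)).forall { case (a, b) => a.last + 1 == b.start }
  parts.nonEmpty && sizes.forall(_ > 0) &&
    parts.head.start == r.start && parts.last.last == r.last &&
    contiguous &&
    sizes.max - sizes.min <= 1
}
```

Applied to the output shown above for splitRange(1 to 14, 5), the check fails because the bucket sizes are 2, 2, 2, 2 and 6.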
    

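1 answer:

Answer 0 (score: 5)

Floating-point approaches

One approach is to generate a fractional (floating-point) offset for each bucket boundary, then convert these into integer ranges by zipping adjacent offsets. Empty ranges also need to be filtered out, which is done here with collect.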

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val m = r.length.toDouble
  // Rounded boundary offsets from 0 to m, relative to r.start
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  val pairs = bins zip (bins.tail)
  // Offsets [a, b) cover elements r.start + a up to r.start + b - 1; empty buckets are dropped
  pairs.collect { case (a, b) if b > a => (r.start + a) to (r.start + (b - 1)) }
}

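(The first version of this solution had a rounding problem that meant it could not handle Int.MaxValue. This has now been fixed, following Rex Kerr's recursive floating-point solution.)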
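
To see what the rounding step does, here is the boundary calculation on its own, as a minimal sketch. The helper names boundaries and bucketSizes are introduced only for this illustration; the formula is the one used for bins above.

```scala
// Illustrative helpers (names are assumptions, not from the answer).
// Rounded boundary offsets for splitting m elements into `chunks` buckets.
def boundaries(m: Int, chunks: Int): Seq[Int] =
  (0 to chunks).map(x => math.round(x.toDouble * m / chunks).toInt)

// Bucket sizes are the differences between adjacent boundaries.
def bucketSizes(m: Int, chunks: Int): Seq[Int] = {
  val bins = boundaries(m, chunks)
  bins.zip(bins.tail).map { case (a, b) => b - a }
}

// boundaries(14, 5)  gives 0, 3, 6, 8, 11, 14
// bucketSizes(14, 5) gives 3, 3, 2, 3, 3
// bucketSizes(2, 3)  gives 1, 0, 1 (the empty bucket is what collect filters out)
```

Because each boundary is rounded independently from the exact value x * m / chunks, the error does not accumulate from one bucket to the next.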

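An alternative floating-point approach is to recurse down the range, splitting a chunk off the front each time, so that no elements can be missed. This version also handles Int.MaxValue correctly.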

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val chunkSize = r.length.toDouble / chunks

  def go(i: Int, r: Range, delta: Double, acc: List[Range]): List[Range] = {  
    if (i == chunks) r :: acc 
      // ensures the last chunk has all remaining values, even if error accumulates
    else {
      val s = delta + chunkSize
      val (chunk, rest) = r.splitAt(s.toInt)
      go(i + 1, rest, s - s.toInt, if (chunk.length > 0) chunk :: acc else acc)
    }
  }

  go(1, r, 0.0D, Nil).reverse
} 
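
The recursion above relies on splitAt returning two Ranges rather than materialized sequences, which is what keeps it usable for very long inputs. A small sketch of that behaviour (the value names big, front and rest are only for illustration):

```scala
// Splitting a very long Range: both halves are still Ranges, so no
// element-by-element copy is made.
val big: Range = 1 to Int.MaxValue
val (front, rest): (Range, Range) = big.splitAt(3)

// front covers 1 to 3; rest covers 4 to Int.MaxValue
```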

It is also possible to recurse to generate the (start, end) pairs, rather than zipping them. This is adapted from Rex Kerr's answer to a similar question:

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val m = r.length
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  def snip(r: Range, ns: Seq[Int], got: Vector[Range]): Vector[Range] = {
    if (ns.length < 2) got
    else {
      val (i, j) = (ns.head, ns.tail.head)
      snip(r.drop(j - i), ns.tail, got :+ r.take(j - i))
    }
  }
  snip(r, bins, Vector.empty).filter(_.length > 0)
}

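Integer approach

Finally, I realised that this can be done in purely integer arithmetic by adapting Bresenham's line-drawing algorithm, which solves an essentially equivalent problem: how to distribute x pixels evenly across y rows, using only integer operations!

I initially translated the pseudocode into an imperative solution using a var and an ArrayBuffer, then converted that into a tail-recursive solution: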

import scala.annotation.tailrec

def splitRange(r: Range, chunks: Int): List[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val dy = r.length
  val dx = chunks

  @tailrec
  def go(y0:Int, y:Int, d:Int, ch:Int, acc: List[Range]):List[Range] = {
    if (ch == 0) acc
    else {
      if (d > 0) go(y0, y-1, d-dx, ch, acc)
      else go(y-1, y, d+dy, ch-1, if (y > y0) acc 
                                  else (y to y0) :: acc)
    }
  }

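  // Use r.last rather than r.end so that `until` ranges work; an empty range yields no chunks
  if (r.isEmpty) Nil else go(r.last, r.last, dy - dx, chunks, Nil)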
}

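See the Wikipedia link for a full explanation, but essentially the algorithm zigzags along the slope of a line, alternately adding the y extent dy and subtracting the x extent dx. If these do not divide exactly, an error accumulates until it does divide exactly, which results in an extra pixel in some sub-ranges.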

splitRange(3 to 15, 5)                         
//> List(Range(3, 4), Range(5, 6, 7), Range(8, 9), 
//|      Range(10, 11, 12), Range(13, 14, 15))
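
The same error-accumulation idea can also be written as a forward loop with a var and an ArrayBuffer. The following is a sketch in that spirit, not the original imperative version mentioned above; the name splitRangeImperative and the exact bookkeeping are assumptions made for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical imperative sketch using an integer error accumulator.
// Every bucket gets length / chunks elements; the remainder is spread
// out by giving one extra element whenever the accumulated error
// reaches `chunks`.
def splitRangeImperative(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val base = r.length / chunks   // minimum size of every bucket
  val rem  = r.length % chunks   // number of buckets that get one extra element
  val out  = ArrayBuffer.empty[Range]
  var start = r.start
  var err = 0L                   // accumulated error, kept in [0, chunks)

  for (_ <- 0 until chunks) {
    err += rem
    val size = if (err >= chunks) { err -= chunks; base + 1 } else base
    if (size > 0) {              // omit empty buckets
      out += (start to (start + (size - 1)))
      start += size
    }
  }
  out.toSeq
}
```

For splitRange(3 to 15, 5) this sketch yields the same buckets as the output shown above. Other inputs may place the larger buckets differently from the tail-recursive version, but the sizes still differ by at most 1.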