Question

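If I have a range, how can I split it into a sequence of contiguous sub-ranges, where the number of sub-ranges (buckets) is specified? If there are not enough items, empty buckets should be omitted.

For example: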

splitRange(1 to 6, 3) == Seq(Range(1,2), Range(3,4), Range(5,6))
splitRange(1 to 2, 3) == Seq(Range(1), Range(2))

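Some additional constraints, which rule out some of the solutions I have seen:

Roughly even bucket sizes: bucket sizes should differ by at most 1.
The length of the input range may sometimes be very large, so the range should not be materialized as a sequence (e.g. grouped cannot be used).
This also means we cannot assign numbers to buckets round-robin, since the numbers in each bucket would not be contiguous and so would not form a range.
Ideally, the sub-ranges would be generated in order, i.e. (1,2) (3,4), not (3,4) (1,2).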
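The constraints above can be written down as a small check, which is handy for comparing candidate solutions. This is only a sketch: the name isValidSplit is hypothetical and not part of the original post, and it assumes a bucket count of min(chunks, r.length) is what "omit empty buckets" implies.

```scala
// Hypothetical helper (not from the original post): checks a candidate
// split of r into at most `chunks` buckets against the constraints above.
def isValidSplit(r: Range, chunks: Int, parts: Seq[Range]): Boolean = {
  val sizes = parts.map(_.length)
  // No empty buckets, and every bucket is a unit-step range
  def wellFormed = parts.forall(p => p.nonEmpty && p.step == 1)
  // Assumption: with empty buckets omitted we expect min(chunks, length) buckets
  def rightCount = parts.length == math.min(chunks, r.length)
  // Bucket sizes differ by at most 1
  def balanced = sizes.isEmpty || sizes.max - sizes.min <= 1
  // In order and contiguous: each bucket starts right after the previous one ends
  def contiguous = parts.zip(parts.drop(1)).forall { case (a, b) => b.start == a.last + 1 }
  // Together the buckets span exactly the input range
  def covers =
    if (r.isEmpty) parts.isEmpty
    else parts.nonEmpty && parts.head.start == r.start && parts.last.last == r.last
  wellFormed && rightCount && balanced && contiguous && covers
}
```

The checks are written as defs so that later ones (which call last) only run once the buckets are known to be non-empty.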

A colleague found a solution here:

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  if (r.step != 1) 
      throw new IllegalArgumentException("Range must have step size equal to 1")

  val nchunks = scala.math.max(chunks, 1)
  val chunkSize = scala.math.max(r.length / nchunks, 1)
  val starts = r.by(chunkSize).take(nchunks)
  val ends = starts.map(_ - 1).drop(1) :+ r.end
  starts.zip(ends).map(x => x._1 to x._2)
}

However, when N is small this produces very uneven bucket sizes, for example:

splitRange(1 to 14, 5)                          
//> Vector(Range(1, 2), Range(3, 4), Range(5, 6),
//|        Range(7, 8), Range(9, 10, 11, 12, 13, 14))
                              ^^^^^^^^^^^^^^^^^^^^^

Answer 1

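Floating-point approach

One approach is to generate a fractional (floating-point) offset for each bucket boundary, then convert these into integer ranges by zipping adjacent boundaries. Empty ranges also need to be filtered out, using collect.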

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val m = r.length.toDouble
  // Fractional bucket boundaries, rounded to integer offsets from the start of r
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  val pairs = bins zip (bins.tail)
  // Each pair (a, b) covers offsets a until b within r; skip empty buckets
  pairs.collect { case (a, b) if b > a => r.drop(a).take(b - a) }
}

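(The first version of this solution had a rounding problem that meant it could not handle Int.MaxValue. This has now been fixed, following Rex Kerr's recursive floating-point solution.)

Another floating-point approach is to recurse down the range, taking the head chunk off the range each time, so that no elements can be missed. This version handles Int.MaxValue correctly.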
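To see why the empty ranges have to be filtered, it may help to look at the bucket sizes that the rounded boundaries imply. The sketch below reuses the same bins expression; the helper name bucketSizes is hypothetical and only for illustration.

```scala
// Hypothetical helper: sizes of the buckets implied by the rounded
// boundaries, for a range of length m split into `chunks` buckets.
def bucketSizes(m: Int, chunks: Int): Seq[Int] = {
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  // The difference between adjacent boundaries is the bucket size
  bins.zip(bins.tail).map { case (a, b) => b - a }
}

bucketSizes(14, 5) // boundaries 0, 3, 6, 8, 11, 14 give sizes 3, 3, 2, 3, 3
bucketSizes(2, 3)  // boundaries 0, 1, 1, 2 give sizes 1, 0, 1
```

The second call shows a zero-sized bucket in the middle, which is the case the b > a guard removes.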

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val chunkSize = r.length.toDouble / chunks

  def go(i: Int, r: Range, delta: Double, acc: List[Range]): List[Range] = {  
    if (i == chunks) r :: acc 
      // ensures the last chunk has all remaining values, even if error accumulates
    else {
      val s = delta + chunkSize
      val (chunk, rest) = r.splitAt(s.toInt)
      go(i + 1, rest, s - s.toInt, if (chunk.length > 0) chunk :: acc else acc)
    }
  }

  go(1, r, 0.0D, Nil).reverse
}

It is also possible to recurse to generate the (start, end) pairs, rather than zipping them. This is adapted from Rex Kerr's answer to a similar question:

def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val m = r.length
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  def snip(r: Range, ns: Seq[Int], got: Vector[Range]): Vector[Range] = {
    if (ns.length < 2) got
    else {
      val (i, j) = (ns.head, ns.tail.head)
      snip(r.drop(j - i), ns.tail, got :+ r.take(j - i))
    }
  }
  snip(r, bins, Vector.empty).filter(_.length > 0)
}

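Integer approach

Finally, I realized that this can be done with purely integer arithmetic by adapting Bresenham's line-drawing algorithm, which solves an essentially equivalent problem: how to distribute x pixels evenly across y rows, using only integer operations!

I initially translated the pseudocode into an imperative solution using var and ArrayBuffer, then converted that into a tail-recursive solution: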

import scala.annotation.tailrec

def splitRange(r: Range, chunks: Int): List[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val dy = r.length
  val dx = chunks

  @tailrec
  def go(y0:Int, y:Int, d:Int, ch:Int, acc: List[Range]):List[Range] = {
    if (ch == 0) acc
    else {
      if (d > 0) go(y0, y-1, d-dx, ch, acc)
      else go(y-1, y, d+dy, ch-1, if (y > y0) acc 
                                  else (y to y0) :: acc)
    }
  }

  go(r.end, r.end, dy - dx, chunks, Nil)
}

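See the Wikipedia link for a full explanation, but essentially the algorithm zigzags along the slope of a line, alternately adding the y range dy and subtracting the x range dx. If these do not divide exactly, an error accumulates until it does divide exactly, which leads to an extra pixel in some sub-ranges.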

splitRange(3 to 15, 5)                         
//> List(Range(3, 4), Range(5, 6, 7), Range(8, 9), 
//|      Range(10, 11, 12), Range(13, 14, 15))
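For comparison, the same integer error-accumulation idea can be sketched imperatively with var and ArrayBuffer. This is not the original imperative version mentioned above (which is not shown in the post) and not a literal transcription of Bresenham's pseudocode: it walks forward through the range and keeps a running remainder instead. The name splitRangeImperative is hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical imperative sketch: a forward-walking integer error accumulator
// in the spirit of Bresenham's algorithm, not the author's original code.
def splitRangeImperative(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")

  val dy = r.length            // items to distribute
  val dx = chunks              // buckets to distribute them over
  val out = ArrayBuffer.empty[Range]
  var rest = r                 // part of the range not yet assigned
  var err = 0L                 // running remainder, kept in [0, dx); Long avoids overflow
  var i = 0
  while (i < dx) {
    err += dy
    val size = (err / dx).toInt // whole items owed to this bucket
    err %= dx                   // carry the fractional part forward
    if (size > 0) out += rest.take(size) // omit empty buckets
    rest = rest.drop(size)
    i += 1
  }
  out.toSeq
}
```

For splitRange(3 to 15, 5) this sketch gives bucket sizes 2, 3, 2, 3, 3, the same split as the tail-recursive version above.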
