Given a sequence with a single wide value, subdivide it into a list of contiguous sub-sequences

Time: 2019-01-24 04:01:58

Tags: scala

I have this class:

case class websiteVisitsWindow(start: Int, end: Int, visitors: Int) {
 def contains(other: websiteVisitsWindow): Boolean = other.start >= this.start && other.end <= this.end
}

For example, given a sequence of websiteVisitsWindow:

Seq(websiteVisitsWindow(start = 1, end = 3, visitors = 5))

I want to split this Seq into equal, unit-width sub-sequences, like this:

Seq(
websiteVisitsWindow(start = 1, end = 1, visitors = 5), 
websiteVisitsWindow(start = 2, end = 2, visitors = 5), 
websiteVisitsWindow(start = 3, end = 3, visitors = 5)
)

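The value of visitors does not matter at this stage. I have a basic solution that works when no websiteVisitsWindow is wide. Given sequences a and b, it produces 3 sets: set 1 is the intersection of a and b, set 2 is the left join of a and b (the elements of a not covered by b), and set 3 is the right join of b and a (the elements of b not covered by a).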

def splitIntoSets(as: Seq[websiteVisitsWindow], bs: 
Seq[websiteVisitsWindow]): (Seq[websiteVisitsWindow], 
Seq[websiteVisitsWindow], Seq[websiteVisitsWindow]) = {
(as, bs) match {
  case (Nil, Nil) => (Nil, Nil, Nil)
  case (_, Nil) => (Nil, as, Nil)
  case (Nil, _) => (Nil, bs, Nil)

  case _ =>
    if (bs.forall(currentItem => currentItem.start == currentItem.end))
      (
        as.filter(a => bs.exists(b => b.contains(a))),
        as.filter(a => !(bs.exists(b => b.contains(a)))),
        bs.filter(b => !(as.exists(a => a.contains(b))))
      )
    else
    {
      // For each wide websiteVisitsWindow in bs, break it down into a sub-sequence of unit-width websiteVisitsWindow,
      // i.e. websiteVisitsWindow(start = 1, end = 2, visitors = 10) => Seq(websiteVisitsWindow(start = 1, end = 1, visitors = 10), websiteVisitsWindow(start = 2, end = 2, visitors = 10))

      splitIntoSets(as, b +: bs)
    }

}
}

I'm not sure this is the right approach, but I'm currently thinking of something like this (pseudocode):

- Find the number of windows to generate (essentially grabbing the start and end values)
- Generate a websiteVisitsWindow for each item in the range from the start value to the end value, as above, such that start and end are both set to the current value in the range
- Append this websiteVisitsWindow to the list of bs
- Iterate over the wide interval, breaking it down until the end condition is met
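
The steps above can be sketched as a small helper. This is only a minimal sketch, not code from the question: `WindowSketch` and `expand` are hypothetical names, the helper assumes `start <= end`, and it copies `visitors` unchanged into every unit-width window.

```scala
object WindowSketch {
  // Same case class as in the question, repeated so the sketch is self-contained.
  case class websiteVisitsWindow(start: Int, end: Int, visitors: Int) {
    def contains(other: websiteVisitsWindow): Boolean =
      other.start >= this.start && other.end <= this.end
  }

  // Hypothetical helper: one unit-width window per value in [start, end].
  // Assumes start <= end; an inverted window produces an empty result
  // because the range (start to end) is empty.
  def expand(w: websiteVisitsWindow): Seq[websiteVisitsWindow] =
    (w.start to w.end).map(i => w.copy(start = i, end = i))

  // Expanding every window in a sequence is then a single flatMap.
  val wide = Seq(websiteVisitsWindow(start = 1, end = 3, visitors = 5))
  val units = wide.flatMap(expand)
}
```

Here `WindowSketch.units` holds the three windows (1, 1, 5), (2, 2, 5) and (3, 3, 5). A window that is already unit-width expands to itself, so the helper can be applied to a mixed sequence without checking each element first.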

1 answer:

Answer 0 (score: 0)

I found a solution using a recursive function. It took me a while, but I eventually got it working.

def splitIntoSets(visitors: Seq[websiteVisitsWindow], schedule: Seq[websiteVisitsWindow]): (Seq[websiteVisitsWindow], Seq[websiteVisitsWindow], Seq[websiteVisitsWindow]) = {
(visitors, schedule) match {
  case (Nil, Nil) => (Nil, Nil, Nil)
  case (_, Nil) => (Nil, visitors, Nil)
  case (Nil, _) => (Nil, schedule, Nil)

  case _ =>
    if (schedule.forall(currentItem => currentItem.start == currentItem.end)) {

      (
        visitors.filter(visitor => schedule.exists(scheduleWebsiteVisitsWindow => scheduleWebsiteVisitsWindow.contains(visitor))),
        visitors.filter(visitor => !(schedule.exists(scheduleWebsiteVisitsWindow => scheduleWebsiteVisitsWindow.contains(visitor)))),
        schedule.filter(scheduleWebsiteVisitsWindow => !(visitors.exists(visitor => visitor.contains(scheduleWebsiteVisitsWindow))))
      )
    }
    else {

      // Separate the unit-width windows from the ones that still need expanding.
      val (sameStartEndTimes, wideStartEndTimes) = schedule.partition(window => window.start == window.end)

      // Expand only the first wide window; the recursive call handles the remaining ones.
      // wideStartEndTimes is non-empty here because the forall check above failed.
      val firstWide = wideStartEndTimes.head
      val startTime = firstWide.start
      val endTime = firstWide.end

      val startEndTimeList = (startTime to endTime).toList

      // One unit-width window per value in the range; visitors is reset to 0.
      val expandedWideWebsiteVisitsWindow = startEndTimeList.map(
        currentItem => websiteVisitsWindow(start = currentItem, end = currentItem, visitors = 0)
      )

      val remainingWideWebsiteVisitsWindows = wideStartEndTimes.tail

      splitIntoSets(visitors, sameStartEndTimes ++ expandedWideWebsiteVisitsWindow ++ remainingWideWebsiteVisitsWindows)
    }

}
}
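
For comparison, the same expansion could probably be done without recursion by flat-mapping the schedule once before filtering. The following is a sketch under assumptions, not the answer's method: `SplitSketch`, `Windows`, `expand` and `splitExpanded` are hypothetical names. Unlike the code above, it keeps each window's original `visitors` value instead of resetting it to 0, it preserves the order of the schedule, and it has no special cases for empty inputs, so an empty `visitors` sequence yields the whole expanded schedule in the third set.

```scala
object SplitSketch {
  // Same case class as in the question, repeated so the sketch is self-contained.
  case class websiteVisitsWindow(start: Int, end: Int, visitors: Int) {
    def contains(other: websiteVisitsWindow): Boolean =
      other.start >= this.start && other.end <= this.end
  }

  type Windows = Seq[websiteVisitsWindow]

  // Hypothetical helper: one unit-width window per value in [start, end].
  def expand(w: websiteVisitsWindow): Windows =
    (w.start to w.end).map(i => w.copy(start = i, end = i))

  // Hypothetical non-recursive variant of splitIntoSets.
  def splitExpanded(visitors: Windows, schedule: Windows): (Windows, Windows, Windows) = {
    // Unit-width windows expand to themselves, so no partition step is needed.
    val units = schedule.flatMap(expand)
    (
      visitors.filter(v => units.exists(_.contains(v))),  // visitors covered by a schedule slot
      visitors.filter(v => !units.exists(_.contains(v))), // visitors with no matching slot
      units.filter(u => !visitors.exists(_.contains(u)))  // slots that no visitor covers
    )
  }

  // Usage example with made-up data.
  val visitors = Seq(websiteVisitsWindow(1, 1, 5), websiteVisitsWindow(4, 4, 2))
  val schedule = Seq(websiteVisitsWindow(1, 3, 0))

  val (matched, unmatchedVisitors, unmatchedSlots) = splitExpanded(visitors, schedule)
}
```

With this data, `matched` contains the visitor window at 1, `unmatchedVisitors` contains the one at 4, and `unmatchedSlots` contains the schedule slots at 2 and 3.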