A "full join" method for lists

Time: 2017-07-26 20:38:01

Tags: scala scala-collections

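Suppose we have two lists of some objects, Seq[A] and Seq[B], and want to join them on some condition (A, B) => Boolean. A single element of the first list may have several matching elements in the second. By "full join" we mean that we also want to know which elements of either list have no corresponding pair.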

So the signature would be:

def fullJoin[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): (Seq[A], Seq[B], Seq[(A, B)])

Or, if we use the Ior type from Cats:

def fullJoin[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): Seq[Ior[A, B]]

Example:

scala> fullJoin[Int, Int](List(1,2), List(3,4,4), {_ * 2 == _ })
res4: (Seq[Int], Seq[Int], Seq[(Int, Int)]) = (List(1),List(3),List((2,4), (2,4)))

The idea is exactly the same as joining tables in SQL.

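The question is whether the standard library has any similar utility method. If not, let's discuss an elegant solution. To begin with, performance is not a concern (quadratic complexity is fine, as with nested loops).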

2 answers:

Answer 0: (score: 3)

Here is a more concise solution that uses built-in Scala library features:

def fullJoin[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): (Seq[A], Seq[B], Seq[(A, B)]) = {
  val matched = for (a <- left; b <- right if joinCondition(a, b)) yield (a, b)
  val matchedLeft = matched.map(_._1).toSet
  val matchedRight = matched.map(_._2).toSet
  (left.filterNot(matchedLeft.contains), right.filterNot(matchedRight.contains), matched)
}
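The question also mentions an Ior-style signature. As a rough sketch only, the same nested-loop approach could be reshaped to return one tagged value per result row. To keep the example dependency-free, it uses a small local ADT (`Joined`, `LeftOnly`, `RightOnly`, `Both`) as a hypothetical stand-in for Cats' `Ior`; these names, the object name, and `fullJoinIor` are assumptions for illustration and are not part of the original answer or of Cats.

```scala
object FullJoinIorSketch {
  // Hypothetical stand-in for cats.data.Ior, defined locally so the sketch has no dependencies.
  sealed trait Joined[+A, +B]
  final case class LeftOnly[A](a: A) extends Joined[A, Nothing]
  final case class RightOnly[B](b: B) extends Joined[Nothing, B]
  final case class Both[A, B](a: A, b: B) extends Joined[A, B]

  def fullJoinIor[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): Seq[Joined[A, B]] = {
    // Every matching pair, produced by a quadratic nested-loop join.
    val matched = for (a <- left; b <- right if joinCondition(a, b)) yield (a, b)
    // Sets of the values that took part in at least one match (relies on == / hashCode).
    val matchedLeft = matched.map(_._1).toSet
    val matchedRight = matched.map(_._2).toSet
    val leftOnly: Seq[Joined[A, B]] = left.filterNot(matchedLeft.contains).map(LeftOnly(_))
    val rightOnly: Seq[Joined[A, B]] = right.filterNot(matchedRight.contains).map(RightOnly(_))
    val both: Seq[Joined[A, B]] = matched.map { case (a, b) => Both(a, b) }
    // Unmatched left rows first, then unmatched right rows, then the matched pairs.
    leftOnly ++ rightOnly ++ both
  }
}
```

With the inputs from the question, `FullJoinIorSketch.fullJoinIor[Int, Int](List(1, 2), List(3, 4, 4), _ * 2 == _)` should yield the unmatched 1, the unmatched 3, and two `Both(2, 4)` rows. The output ordering is an arbitrary choice of this sketch; with Cats on the classpath, the three cases would presumably map onto `Ior.Left`, `Ior.Right`, and `Ior.Both`.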

Answer 1: (score: 0)

I think the full join problem can be solved with left joins. It is not really optimized, but here is my solution:

  def fullJoin[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): (Seq[A], Seq[B], Seq[(A, B)]) = {
    val (notJoinedLeft, joined) = leftJoin(left, right, joinCondition)
    val (notJoinedRight, _)     = leftJoin(right, left, (b: B, a: A) => joinCondition(a, b))
    (notJoinedLeft, notJoinedRight, joined)
  }

  def leftJoin[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): (Seq[A], Seq[(A, B)]) = {
    val matchingResult: Seq[Either[A, Seq[(A, B)]]] = for {
      a <- left
    } yield {
      right.filter(joinCondition.curried(a)) match {
        case Seq()             => Left(a)
        case matchedBs         => Right(matchedBs.map((a, _)))
      }
    }
    val (notMatched: Seq[A], matched: Seq[Seq[(A, B)]]) = partition(matchingResult)
    (notMatched, matched.flatten)
  }

  // Splits a sequence of Eithers into its Left values and its Right values, preserving order.
  def partition[A, B](list: Seq[Either[A, B]]): (Seq[A], Seq[B]) =
    (list.collect { case Left(a) => a }, list.collect { case Right(b) => b })
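On Scala 2.13 or later, the hand-written partition helper may not be needed, because the standard collections provide `partitionMap`. The following is a minimal sketch of the same left join under that assumption; the object name `LeftJoinSketch` is hypothetical, and the behaviour is intended to match the `leftJoin` above rather than replace it.

```scala
object LeftJoinSketch {
  // Assumes Scala 2.13+, where partitionMap is available on standard collections.
  def leftJoin[A, B](left: Seq[A], right: Seq[B], joinCondition: (A, B) => Boolean): (Seq[A], Seq[(A, B)]) = {
    // Classify each left element: Left(a) if nothing matched, Right(pairs) otherwise.
    val (notMatched, matched) = left.partitionMap[A, Seq[(A, B)]] { a =>
      val bs = right.filter(b => joinCondition(a, b))
      if (bs.isEmpty) Left(a) else Right(bs.map(b => (a, b)))
    }
    // Flatten the per-element pair lists into a single sequence of pairs.
    (notMatched, matched.flatten)
  }
}
```

For example, `LeftJoinSketch.leftJoin[Int, Int](List(1, 2), List(3, 4, 4), _ * 2 == _)` should return 1 as the only unmatched left element, together with two (2, 4) pairs. A full join can then be assembled from two such calls, as in the answer above.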