Filter out the common entries from two sequences, returning two lists with the entries unique to each original

Time: 2017-12-06 18:19:57

Tags: scala sequences

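I want to filter out the elements that two lists have in common, so that I am left with two lists containing only the remaining elements of each.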

I have two sequences in Scala:

val firstSeq = Seq(1,2,3,4)
val secondSeq = Seq(1,3,5,7)

What I want to do is filter out all the common elements, which means I would end up with:

filteredFirstSeq = Seq(2,4)
filteredSecondSeq = Seq(5,7)

There is an easy way to do this in Scala:

val filteredFirstSeq = firstSeq.filterNot(firstEntry => secondSeq.contains(firstEntry))
val filteredSecondSeq = secondSeq.filterNot(secondEntry => firstSeq.contains(secondEntry))

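But! This means I have to loop through the entire first list, matching each entry against the entire second list, and then loop through the entire second list, matching each entry against the entire first list. That takes a long time when the lists are large and the entries are more complex than integers!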

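I would prefer to loop only once, but the only way I can think of to do that is to use mutable lists and remove a value from both whenever a match is found. That seems a bit nasty, though. I'm sure there must be a trivial answer that I am missing.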

Thanks for any suggestions!

1 answer:

Answer 0: (score: 2)

This example assumes that neither list contains duplicates. If that is not the case, the logic inside the fold will have to change slightly.

val firstSeq = Seq(1,2,3,4)
val secondSeq = Seq(1,3,5,7)

// Put everything into a list, keeping track of where things came from
val both: Seq[(Int, Int)] = firstSeq.map(x => (x, 1)) ++ secondSeq.map(x => (x, 2))

// Reduce the list into a single map, where the keys are the numbers and the
// value is the originating seq. Any time we try to insert a value that is
// already in the map, we remove the value instead, since that means the value
// was in both sequences.
val map: Map[Int, Int] = both.foldLeft(Map.empty[Int, Int]) { (map, tuple) =>
  val (value, seqNumber) = tuple
  if (map.contains(value)) {
    map - value
  } else {
    map + (value -> seqNumber)
  }
}

// Now partition the values back into their original lists
val (firstSeqFiltered, secondSeqFiltered) = map.partition(_._2 == 1)
println(firstSeqFiltered.keys)
println(secondSeqFiltered.keys)

Output:

Set(2, 4)
Set(5, 7)