Scala: how to efficiently intersect two collections while taking element frequency into account

Time: 2017-01-11 19:05:06

Tags: python scala

Consider the following steps in Python:

>>> from collections import Counter
>>> A = ["a", "b", "d", "d"]
>>> Counter(A)
Counter({'d': 2, 'a': 1, 'b': 1})
>>> B = ["a", "a", "b",  "d", "d"]
>>> Counter(B)
Counter({'a': 2, 'd': 2, 'b': 1})
>>> common = Counter(A) & Counter(B)
>>> common
Counter({'d': 2, 'a': 1, 'b': 1})
>>> sum(common.values())
4

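I would like to write the Scala equivalent of these steps. Since Scala has no built-in multiset, I may not want to follow exactly the same path (and I am not sure how slow a third-party multiset implementation would be). A Scala Set does not work because it does not preserve frequencies, but I can combine a Set with Maps to keep the counts.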

scala> val A = Seq("a", "b", "d", "d")
A: Seq[String] = List(a, b, d, d)

scala> val B = Seq("a", "a", "b",  "d", "d")
B: Seq[String] = List(a, a, b, d, d)

scala> val AWordFreqMap = A.groupBy(a => a).map{case (k, v) => (k -> v.length)}
AWordFreqMap: scala.collection.immutable.Map[String,Int] = Map(b -> 1, d -> 2, a -> 1)

scala> val BWordFreqMap = B.groupBy(a => a).map{case (k, v) => (k -> v.length)}
BWordFreqMap: scala.collection.immutable.Map[String,Int] = Map(b -> 1, d -> 2, a -> 2)

scala> A.toSet.intersect(B.toSet).toList.map(i => scala.math.min(AWordFreqMap(i), BWordFreqMap(i)) ).sum
res2: Int = 4

Can I do this more simply?
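
For reference, the steps above can be sketched as a small helper. This is only an illustrative sketch, not a benchmarked solution: the names `MultisetIntersection`, `counts`, and `commonCount` are made up for this example. It also shows the standard library's `Seq.intersect`, which the Scala collections documentation describes as a multiset intersection, so on sequences (unlike on a `Set`) duplicates are kept up to the smaller of the two counts.

```scala
object MultisetIntersection {
  // Hypothetical helper: count occurrences of each element
  // (the same idea as the groupBy step in the question).
  def counts[T](xs: Seq[T]): Map[T, Int] =
    xs.groupBy(identity).map { case (k, v) => k -> v.size }

  // Hypothetical helper: size of the multiset intersection, i.e. the sum of
  // min(count in a, count in b) over the elements of a.
  def commonCount[T](a: Seq[T], b: Seq[T]): Int = {
    val cb = counts(b)
    // Iterate over the entries so that equal minima are not collapsed.
    counts(a).iterator.map { case (k, n) => math.min(n, cb.getOrElse(k, 0)) }.sum
  }

  def main(args: Array[String]): Unit = {
    val A = Seq("a", "b", "d", "d")
    val B = Seq("a", "a", "b", "d", "d")
    println(commonCount(A, B))    // prints 4
    println(A.intersect(B).size)  // Seq.intersect keeps duplicates; prints 4
  }
}
```

Whether `A.intersect(B).size` is fast enough for large inputs is not something this sketch measures; the explicit count maps make the cost easier to reason about if that matters.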

0 answers:

No answers