I have a list of elements in Scala and I am looking for a way to split the list when a duplicate is found.
For example: List(x,y,z,e,r,y,g,a)
would be converted to List(List(x,y,z,e,r),List(y,g,a))
or List(x,y,z,x,y,z)
to List(x,y,z), List(x,y,z)
and List(x,y,z,y,g,x)
to List(x,y,z), List(y,g,x)
Is there a more efficient way than iterating and checking every element separately?
Answer 0 (score: 2)
Quick and dirty O(n) using O(n) additional memory:
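import scala.collection.mutable.HashSet
import scala.collection.mutable.ListBuffer

val list = List("x", "y", "z", "e", "r", "y", "g", "a", "x", "m", "z")
val result = new ListBuffer[ListBuffer[String]]()
var partition = new ListBuffer[String]()
// Elements of the current partition; a HashSet keeps each lookup O(1),
// whereas ListBuffer.contains would scan the whole partition.
val seen = new HashSet[String]()
list.foreach { i =>
  if (seen.contains(i)) {
    // Duplicate within the current partition: close it and start a new one.
    result += partition
    partition = new ListBuffer[String]()
    seen.clear()
  }
  partition += i
  seen += i
}
if (partition.nonEmpty) {
  result += partition
}
result
// ListBuffer(ListBuffer(x, y, z, e, r), ListBuffer(y, g, a, x, m, z))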
Answer 1 (score: 2)
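This solution comes with a few caveats:
It should do better than O(n^2), the latter being the brute-force approach.
It uses foldLeft, which is a natural way to express this.
There is an O(n) (accumulating) call that may not actually be needed (depending on what you do with the result). Here is the code: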
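import scala.collection.immutable.ListSet

def partition(ls: List[String]): List[ListSet[String]] = {
  ls.foldLeft(List(ListSet.empty[String]))((partitionedLists, elem: String) => {
    if (partitionedLists.head.contains(elem)) {
      // Duplicate in the current segment: start a new segment with elem.
      ListSet(elem) :: partitionedLists
    } else {
      // Otherwise add elem to the current (head) segment.
      (partitionedLists.head + elem) :: partitionedLists.tail
    }
  }).reverse // segments are prepended, so reverse to restore the original order
}
partition(List("x","y","z","e","r","y","g","a"))
// res0: List[scala.collection.immutable.ListSet[String]] = List(ListSet(r, e, z, y, x), ListSet(a, g, y))
// (Recorded on an older Scala version. Since Scala 2.12 a ListSet iterates in
// insertion order, so the elements print as x, y, z, e, r and y, g, a.)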
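I am using ListSet to get the benefits of both a Set and ordering, which fits your use case.
foldLeft is a function that takes an accumulator value (in this case List(ListSet.empty[String])) and modifies it as it moves through the elements of the collection. If we structure this accumulator as a list of segments, then by the time we are done it will contain all the ordered segments of the original list.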
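To make the accumulator idea concrete, here is a minimal sketch of the same kind of fold written generically. It is not the code from this answer: the name `splitOnDuplicates` and the choice to carry a plain `Set` of the current segment's elements next to the (reversed) segment list are assumptions made for illustration.

```scala
// Hypothetical helper, not part of the original answer.
def splitOnDuplicates[A](xs: List[A]): List[List[A]] = {
  // Accumulator: (segments, newest first and each stored in reverse;
  //               elements seen in the current segment)
  val start = (List(List.empty[A]), Set.empty[A])
  val (segments, _) = xs.foldLeft(start) {
    case ((current :: done, seen), elem) =>
      if (seen(elem)) (List(elem) :: current :: done, Set(elem)) // duplicate: open a new segment
      else ((elem :: current) :: done, seen + elem)              // extend the current segment
    case (acc, _) => acc // unreachable: the segment list is never empty
  }
  // Undo the prepend order: reverse the segment list and each segment,
  // and drop the empty seed segment that remains for an empty input.
  segments.reverse.map(_.reverse).filter(_.nonEmpty)
}

splitOnDuplicates(List("x", "y", "z", "e", "r", "y", "g", "a"))
// List(List(x, y, z, e, r), List(y, g, a))
```

Keeping the membership test in a separate `Set` means each check does not have to scan the current segment, at the cost of carrying a second value through the fold.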
Answer 2 (score: 1)
A single-statement tail-recursive version (but not efficient, because of contains on a list)
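val xs = List('x','y','z','e','r','y','g','a')

// splits holds the sublists built so far, newest first and each in reverse order.
def splitAtDuplicates[A](splits: List[List[A]], right: List[A]): List[List[A]] =
  if (right.isEmpty) // done
    splits.map(_.reverse).reverse
  else if (splits.head contains right.head) // need to split here
    splitAtDuplicates(List() :: splits, right)
  else // continue building current sublist
    splitAtDuplicates((right.head :: splits.head) :: splits.tail, right.tail)

// Seed the call with a single empty sublist.
splitAtDuplicates[Char](List(List()), xs)
// List(List(x, y, z, e, r), List(y, g, a))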
Speed it up by using a Set to keep track of what we have seen so far:
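def splitAtDuplicatesOptimised[A](seen: Set[A],
                                  splits: List[List[A]],
                                  right: List[A]): List[List[A]] =
  if (right.isEmpty)
    splits.map(_.reverse).reverse
  else if (seen(right.head)) // duplicate: start a new sublist with an empty Set
    splitAtDuplicatesOptimised(Set.empty[A], List() :: splits, right)
  else
    splitAtDuplicatesOptimised(seen + right.head,
                               (right.head :: splits.head) :: splits.tail,
                               right.tail)

// Seed the call with an empty Set and a single empty sublist.
splitAtDuplicatesOptimised[Char](Set.empty, List(List()), xs)
// List(List(x, y, z, e, r), List(y, g, a))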
Answer 3 (score: 0)
You will basically need to iterate with a look-up table. I can offer the following immutable, functional, tail-recursive implementation.
import scala.collection.immutable.HashSet
import scala.annotation.tailrec

val list = List("x","y","z","e","r","y","g","a", "x", "m", "z", "ll")

// Variant 1: the look-up table is never reset, so a split happens at any
// element that has already appeared anywhere earlier in the list.
def splitListOnDups[A](list: List[A]): List[List[A]] = {
  @tailrec
  def _split(list: List[A], cList: List[A], hashSet: HashSet[A], lists: List[List[A]]): List[List[A]] = {
    list match {
      case Nil => lists // only reached for an empty input list
      case a :: Nil if hashSet.contains(a) => List(a) +: (cList +: lists)
      case a :: Nil => (a +: cList) +: lists
      case a :: tail if hashSet.contains(a) => _split(tail, List(a), hashSet, cList +: lists)
      case a :: tail => _split(tail, a +: cList, hashSet + a, lists)
    }
  }
  _split(list, List[A](), HashSet[A](), List[List[A]]()).reverse.map(_.reverse)
}

// Variant 2: the look-up table is reset at every split, so only duplicates
// within the current sublist trigger a new split.
def splitListOnDups2[A](list: List[A]): List[List[A]] = {
  @tailrec
  def _split(list: List[A], cList: List[A], hashSet: HashSet[A], lists: List[List[A]]): List[List[A]] = {
    list match {
      case Nil => lists // only reached for an empty input list
      case a :: Nil if hashSet.contains(a) => List(a) +: (cList +: lists)
      case a :: Nil => (a +: cList) +: lists
      // The new table must already contain a, otherwise an immediate repeat of a is missed.
      case a :: tail if hashSet.contains(a) => _split(tail, List(a), HashSet[A](a), cList +: lists)
      case a :: tail => _split(tail, a +: cList, hashSet + a, lists)
    }
  }
  _split(list, List[A](), HashSet[A](), List[List[A]]()).reverse.map(_.reverse)
}

splitListOnDups(list)
// List[List[String]] = List(List(x, y, z, e, r), List(y, g, a), List(x, m), List(z, ll))
splitListOnDups2(list)
// List[List[String]] = List(List(x, y, z, e, r), List(y, g, a, x, m, z, ll))
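The two functions above differ only in whether the look-up table survives a split. As a rough sketch of that difference, the following folds both behaviours into one function and runs it on the third example from the question. The name `splitOn` and the `resetOnSplit` parameter are hypothetical and exist only for this illustration.

```scala
import scala.annotation.tailrec

// Hypothetical helper; `resetOnSplit` is an illustrative parameter, not from the answers.
def splitOn[A](xs: List[A], resetOnSplit: Boolean): List[List[A]] = {
  @tailrec
  def loop(rest: List[A], current: List[A], seen: Set[A], done: List[List[A]]): List[List[A]] =
    rest match {
      case Nil =>
        // Flush the last sublist (if any) and restore the original order.
        (if (current.isEmpty) done else current :: done).reverse.map(_.reverse)
      case a :: tail if seen(a) =>
        // Duplicate: close the current sublist and start a new one with `a`.
        val nextSeen = if (resetOnSplit) Set(a) else seen
        loop(tail, List(a), nextSeen, current :: done)
      case a :: tail =>
        loop(tail, a :: current, seen + a, done)
    }
  loop(xs, Nil, Set.empty, Nil)
}

val sample = List("x", "y", "z", "y", "g", "x")
splitOn(sample, resetOnSplit = true)  // List(List(x, y, z), List(y, g, x))
splitOn(sample, resetOnSplit = false) // List(List(x, y, z), List(y, g), List(x))
```

Only the resetting behaviour reproduces the result asked for in the question, List(x,y,z), List(y,g,x); keeping the table across splits also cuts at the final x because it appeared in the first sublist.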