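I'm trying to get from the "before" state to the "after" state below. Is there a convenient Haskell function that removes duplicate tuples from a list? Or is it likely to be more involved, for example iterating over the whole list?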
Before: the list of tuples, sorted by word, as in
[(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]
After: the list of sorted tuples with exact duplicates removed, as in
[(2,"a"), (1,"a"), (1,"b"), (1,"c"), (2,"dd")]
Answer 0 (score: 6)
Searching Hoogle for Eq a => [a] -> [a] returns the nub function:
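The nub function removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. (The name nub means "essence".)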
As the documentation notes, the more general case is nubBy, which takes a user-supplied equality predicate.
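As a quick check against the list from the question, here is a minimal sketch of nub applied to it. The names input, deduped, and dedupedByWord are made up for illustration, and the nubBy variation (comparing only the word) is an assumption about a possible use, not something the question asks for.

```haskell
import Data.Function (on)
import Data.List (nub, nubBy)

-- The "before" list from the question.
input :: [(Int, String)]
input = [(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]

-- Exact duplicates removed; the first occurrence of each tuple is kept.
deduped :: [(Int, String)]
deduped = nub input
-- [(2,"a"),(1,"a"),(1,"b"),(1,"c"),(2,"dd")]

-- Hypothetical variation: treat two tuples as equal when their words match.
dedupedByWord :: [(Int, String)]
dedupedByWord = nubBy ((==) `on` snd) input
-- [(2,"a"),(1,"b"),(1,"c"),(2,"dd")]
```

Because tuples of Eq components are themselves instances of Eq, nub needs no extra setup for this case.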
That said, nub is an O(n^2) algorithm and may not be efficient enough. If the values are instances of the Ord type class, you can use Data.Set.fromList, as follows:
import qualified Data.Set as Set
nub' :: Ord a => [a] -> [a]
nub' = Set.toList . Set.fromList
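Note, however, that this does not preserve the order of the original list: the result comes back in ascending order.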
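To make the reordering concrete, here is a small sketch that runs the set round trip on the list from the question. The names input and viaSet are illustrative only.

```haskell
import qualified Data.Set as Set

-- The "before" list from the question.
input :: [(Int, String)]
input = [(2,"a"), (1,"a"), (1,"b"), (1,"b"), (1,"c"), (2,"dd")]

-- Round trip through a Set: duplicates vanish, but the elements come back
-- in ascending order (tuples compare on the number first, then the word).
viaSet :: [(Int, String)]
viaSet = Set.toList (Set.fromList input)
-- [(1,"a"),(1,"b"),(1,"c"),(2,"a"),(2,"dd")]
```

Here (2,"a") moves from the front of the list to the fourth position, so this differs from the "after" list in the question.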
A simple set-based solution that does preserve the order of the original list could be:
import Data.Set (Set, member, insert, empty)
nub' :: Ord a => [a] -> [a]
nub' = reverse . fst . foldl loop ([], empty)
  where
    -- xs: elements kept so far, in reverse order; obs: elements already seen
    loop :: Ord a => ([a], Set a) -> a -> ([a], Set a)
    loop acc@(xs, obs) x
      | x `member` obs = acc
      | otherwise      = (x:xs, x `insert` obs)
Answer 1 (score: 4)
If you are going to define an Ord-based version of nub, I recommend using
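import Data.Set (empty, insert, member)
nub' :: Ord a => [a] -> [a]
nub' xs = foldr go (`seq` []) xs empty
  where
    -- r is the rest of the fold, waiting for the set of elements seen so far
    go x r obs
      | x `member` obs = r obs
      | otherwise      = obs' `seq` x : r obs'
      where obs' = x `insert` obs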
To understand what this does, you can get rid of the foldr:
nub' :: Ord a => [a] -> [a]
nub' xs = nub'' xs empty
  where
    nub'' [] obs = obs `seq` []
    nub'' (y : ys) obs
      | y `member` obs = nub'' ys obs
      | otherwise      = obs' `seq` y : nub'' ys obs'
      where obs' = y `insert` obs
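One key point about this implementation, as opposed to behzad.nouri's, is that it produces elements lazily, as they are consumed. This is generally much better for cache utilization and garbage collection, and it uses a constant factor less memory than the reversing algorithm.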
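One way to observe the laziness is to feed the function an infinite list and take only a prefix of the result. The sketch below repeats the foldr-based definition under the made-up name lazyNub so that it stands alone; firstThree is likewise just an illustrative name.

```haskell
import Data.Set (empty, insert, member)

-- Same shape as the foldr-based definition in this answer, renamed so the
-- example is self-contained.
lazyNub :: Ord a => [a] -> [a]
lazyNub xs = foldr go (`seq` []) xs empty
  where
    go x r obs
      | x `member` obs = r obs
      | otherwise      = obs' `seq` x : r obs'
      where obs' = x `insert` obs

-- cycle [1, 2, 3] is infinite, yet the first three distinct elements are
-- available because each one is emitted as soon as it is first seen.
firstThree :: [Int]
firstThree = take 3 (lazyNub (cycle [1, 2, 3]))
-- [1,2,3]
```

Asking for a fourth element here would never return, since no new distinct value ever appears. A foldl-based version that reverses at the end cannot produce even the first element on this input, because foldl has to reach the end of the list before anything is returned.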