使用自定义比较器函数压缩两个列表的标准方法

时间:2014-02-10 13:49:03

标签: haskell

如果经常需要压缩两个列表,则丢弃不匹配的元素(通过比较列表元素的部分来定义“匹配”)。例如:

let as = [(1,"a"), (2,"b"), (4,"c")]
let bs = [(2,"a"), (3,"b"), (4,"c"), (5, "d")]
zipWithAdjust (fst) (fst) as bs
-- ~> [((2,"b"),(2,"a")), ((4,"c"),(4,"c"))]

我按如下方式实施了zipWithAdjust

zipWithAdjust :: (Ord c, Show a, Show b, Show c) => (a -> c) -> (b -> c) -> [a] -> [b] -> [(a,b)]
zipWithAdjust cmpValA cmpValB (a:as) (b:bs)
  | cmpValA a == cmpValB b = (a,b) : zipWithAdjust cmpValA cmpValB as bs
  | cmpValA a > cmpValB b = zipWithAdjust cmpValA cmpValB (a:as) bs
  | cmpValA a < cmpValB b = zipWithAdjust cmpValA cmpValB as (b:bs)
zipWithAdjust _ _ _ _ = []

它工作正常,但我觉得有一种标准的方法来做这种拉链。我找到了Data.Alignthis SO Question,但无法弄清楚如何将它用于我的用例。

有没有标准的方法(使用库函数)?是Data.Align吗?如果是这样,我如何使用Data.Align实现上述功能?

修改:更改了<案例以与实例匹配实现。

4 个答案:

答案 0 :(得分:5)

据我所知,没有such function。但是,您可以使用(a -> b -> Ordering)而不是两个附加功能来使您的功能更加通用:

zipWithAdjust :: (a -> b -> Ordering) -> [a] -> [b] -> [(a,b)]
zipWithAdjust cmp (a:as) (b:bs)
  | ord == LT = zipWithAdjust cmp as (b:bs)
  | ord == GT = zipWithAdjust cmp (a:as) (bs)
  | ord == EQ = (a,b) : zipWithAdjust cmp as bs
  where ord = cmp a b

zipWithAdjust _ _ _ = []

result =  zipWithAdjust (\x y -> compare (fst x) (fst y)) [(1,"a"), (2,"b"), (4,"c")] [(2,"a"), (3,"b"), (4,"c"), (5, "d")]

但是,我不再称之为zip,而是compareMerge或类似名称。

答案 1 :(得分:5)

您可能会喜欢Data.Map's intersection capabilities。它在某些方面的能力稍差,而在其他方面则更有能力。例如:

> let as = fromList [(1,"a"), (2,"b"), (4,"c")]; bs = fromList [(2,"a"), (3,"b"), (4,"c"), (5, "d")]
> intersectionWith (,) as bs
fromList [(2,("b","a")),(4,("c","c"))]

答案 2 :(得分:4)

我想说这是更惯用的说法

zipWithAdjust cmpA cmpB (a:as) (b:bs) =
    case cmpA a `compare` cmpB b of
        EQ -> (a, b) : zipWithAdjust cmpA cmpB    as     bs
        GT ->          zipWithAdjust cmpA cmpB (a:as)    bs
        LT ->          zipWithAdjust cmpA cmpB    as  (b:bs)
zipWithAdjust _ _ _ _ = []

肯定会更快,因为它会减少您计算cmpA acmpB b的次数。这不是真正的拉链,因为您同时进行过滤,并且还会在GTLT个案例中进行抵消。我会说这个解决方案非常好,因为没有必要使用标准函数来实现它。

答案 3 :(得分:3)

编辑:使用Data.These中的These a b类型(由Data.Align使用),其中包含:

ordzipBy :: (Ord t) => (a -> t) -> (b -> t) -> [a] -> [b] -> [These a b]
ordzipBy f g a@(x:t) b@(y:r) = case compare (f x) (g y) of
    LT -> This  x   : ordzipBy f g t b
    GT -> That    y : ordzipBy f g a r
    EQ -> These x y : ordzipBy f g t r
ordzipBy _ _ a       []      = map This a
ordzipBy _ _ []      b       = map That b

我们可以表达三组操作:

diffBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [a]
meetBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [(a, b)]
joinBy :: (Ord t) => (a -> t) -> (a->a->a) -> [a] -> [a] -> [a]

diffBy f g xs ys = [x     | This  x   <- ordzipBy f g xs ys]
meetBy f g xs ys = [(x,y) | These x y <- ordzipBy f g xs ys]
joinBy f h xs ys = mergeThese h   `map`  ordzipBy f f xs ys

你描述的是meetBy,即设置交叉操作,两个有序列表被视为集合。

编译器有效编译这些定义的能力是另一个问题。沿着ordzipBy行手动编码的三组函数可能运行得更快。

ordzipBy f galign兼容,[]nil兼容,但实现此类活动所涉及的类型机制高于我的工资等级。 :)此外,我不清楚法律align (f <$> xs) (g <$> ys) = bimap f g <$> align xs ys是否有意义,因为映射函数fg可以很好地改变{{1}元素的相互排序}和xs

这两个问题(类型和法律)是相关的:选择器函数为了排序目的而恢复的数据部分用作位置,作为 shape ,是原始数据的一部分。 (参见instance Alternative ZipList in Haskell?)。


更新:查看以下内容是否符合预期。

ys

无限列表处理不好,而手工编码的功能显然可以很容易地用于handle such cases correctly

{-# LANGUAGE InstanceSigs, DatatypeContexts #-}
import Data.These
import Data.Align

newtype Ord a => ZL a b = ZL {unzl :: [(a,b)]}
      deriving (Eq, Show)

instance Ord a => Functor (ZL a) where
  fmap f (ZL xs) = ZL [(k, f v) | (k,v)<-xs]

instance Ord a => Align (ZL a) where 
  nil = ZL []
  align :: (ZL a b) -> (ZL a c) -> (ZL a (These b c))
  align (ZL a) (ZL b) = ZL (g a b) where
    g a@((k,x):t) b@((n,y):r) = case compare k n of
        LT -> (k, This  x  ) : g t b 
        GT -> (n, That    y) : g a r
        EQ -> (k, These x y) : g t r  
    g a               []              = [(k, This x) | (k,x) <- a]
    g []              b               = [(n, That y) | (n,y) <- b]

diffBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [a]
meetBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [(a, b)]
joinBy :: (Ord t) => (a -> t) -> (a->a->a) -> [a] -> [a] -> [a]

diffBy f g xs ys = catThis        . map snd . unzl 
                    $ align (ZL [(f x,x) | x<-xs]) (ZL [(g y,y) | y<-ys])
meetBy f g xs ys = catThese       . map snd . unzl 
                    $ align (ZL [(f x,x) | x<-xs]) (ZL [(g y,y) | y<-ys])
joinBy f h xs ys = map (mergeThese h . snd) . unzl 
                    $ align (ZL [(f x,x) | x<-xs]) (ZL [(f y,y) | y<-ys])