Question

如果经常需要压缩两个列表，则丢弃不匹配的元素（通过比较列表元素的部分来定义“匹配”）。例如：

let as = [(1,"a"), (2,"b"), (4,"c")]
let bs = [(2,"a"), (3,"b"), (4,"c"), (5, "d")]
zipWithAdjust (fst) (fst) as bs
-- ~> [((2,"b"),(2,"a")), ((4,"c"),(4,"c"))]

我按如下方式实施了zipWithAdjust：

zipWithAdjust :: (Ord c, Show a, Show b, Show c) => (a -> c) -> (b -> c) -> [a] -> [b] -> [(a,b)]
zipWithAdjust cmpValA cmpValB (a:as) (b:bs)
  | cmpValA a == cmpValB b = (a,b) : zipWithAdjust cmpValA cmpValB as bs
  | cmpValA a > cmpValB b = zipWithAdjust cmpValA cmpValB (a:as) bs
  | cmpValA a < cmpValB b = zipWithAdjust cmpValA cmpValB as (b:bs)
zipWithAdjust _ _ _ _ = []

它工作正常，但我觉得有一种标准的方法来做这种拉链。我找到了Data.Align和this SO Question，但无法弄清楚如何将它用于我的用例。

有没有标准的方法（使用库函数）？是Data.Align吗？如果是这样，我如何使用Data.Align实现上述功能？

修改：更改了<案例以与实例匹配实现。

Answer 1

据我所知，没有such function。但是，您可以使用(a -> b -> Ordering)而不是两个附加功能来使您的功能更加通用：

zipWithAdjust :: (a -> b -> Ordering) -> [a] -> [b] -> [(a,b)]
zipWithAdjust cmp (a:as) (b:bs)
  | ord == LT = zipWithAdjust cmp as (b:bs)
  | ord == GT = zipWithAdjust cmp (a:as) (bs)
  | ord == EQ = (a,b) : zipWithAdjust cmp as bs
  where ord = cmp a b

zipWithAdjust _ _ _ = []

result =  zipWithAdjust (\x y -> compare (fst x) (fst y)) [(1,"a"), (2,"b"), (4,"c")] [(2,"a"), (3,"b"), (4,"c"), (5, "d")]

但是，我不再称之为zip，而是compareMerge或类似名称。

Answer 2

您可能会喜欢Data.Map's intersection capabilities。它在某些方面的能力稍差，而在其他方面则更有能力。例如：

> let as = fromList [(1,"a"), (2,"b"), (4,"c")]; bs = fromList [(2,"a"), (3,"b"), (4,"c"), (5, "d")]
> intersectionWith (,) as bs
fromList [(2,("b","a")),(4,("c","c"))]

Answer 3

我想说这是更惯用的说法

zipWithAdjust cmpA cmpB (a:as) (b:bs) =
    case cmpA a `compare` cmpB b of
        EQ -> (a, b) : zipWithAdjust cmpA cmpB    as     bs
        GT ->          zipWithAdjust cmpA cmpB (a:as)    bs
        LT ->          zipWithAdjust cmpA cmpB    as  (b:bs)
zipWithAdjust _ _ _ _ = []

肯定会更快，因为它会减少您计算cmpA a和cmpB b的次数。这不是真正的拉链，因为您同时进行过滤，并且还会在GT和LT个案例中进行抵消。我会说这个解决方案非常好，因为没有必要使用标准函数来实现它。

Answer 4

编辑：使用Data.These中的These a b类型（由Data.Align使用），其中包含：

ordzipBy :: (Ord t) => (a -> t) -> (b -> t) -> [a] -> [b] -> [These a b]
ordzipBy f g a@(x:t) b@(y:r) = case compare (f x) (g y) of
    LT -> This  x   : ordzipBy f g t b
    GT -> That    y : ordzipBy f g a r
    EQ -> These x y : ordzipBy f g t r
ordzipBy _ _ a       []      = map This a
ordzipBy _ _ []      b       = map That b

我们可以表达三组操作：

diffBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [a]
meetBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [(a, b)]
joinBy :: (Ord t) => (a -> t) -> (a->a->a) -> [a] -> [a] -> [a]

diffBy f g xs ys = [x     | This  x   <- ordzipBy f g xs ys]
meetBy f g xs ys = [(x,y) | These x y <- ordzipBy f g xs ys]
joinBy f h xs ys = mergeThese h   `map`  ordzipBy f f xs ys

你描述的是meetBy，即设置交叉操作，两个有序列表被视为集合。

编译器有效编译这些定义的能力是另一个问题。沿着ordzipBy行手动编码的三组函数可能运行得更快。

ordzipBy f g与align兼容，[]与nil兼容，但实现此类活动所涉及的类型机制高于我的工资等级。 :)此外，我不清楚法律align (f <$> xs) (g <$> ys) = bimap f g <$> align xs ys是否有意义，因为映射函数f和g可以很好地改变{{1}元素的相互排序}和xs。

这两个问题（类型和法律）是相关的：选择器函数为了排序目的而恢复的数据部分用作位置，作为 shape ，是原始数据的一部分。（参见instance Alternative ZipList in Haskell?）。

更新：查看以下内容是否符合预期。

ys

无限列表处理不好，而手工编码的功能显然可以很容易地用于handle such cases correctly：

{-# LANGUAGE InstanceSigs, DatatypeContexts #-}
import Data.These
import Data.Align

newtype Ord a => ZL a b = ZL {unzl :: [(a,b)]}
      deriving (Eq, Show)

instance Ord a => Functor (ZL a) where
  fmap f (ZL xs) = ZL [(k, f v) | (k,v)<-xs]

instance Ord a => Align (ZL a) where 
  nil = ZL []
  align :: (ZL a b) -> (ZL a c) -> (ZL a (These b c))
  align (ZL a) (ZL b) = ZL (g a b) where
    g a@((k,x):t) b@((n,y):r) = case compare k n of
        LT -> (k, This  x  ) : g t b 
        GT -> (n, That    y) : g a r
        EQ -> (k, These x y) : g t r  
    g a               []              = [(k, This x) | (k,x) <- a]
    g []              b               = [(n, That y) | (n,y) <- b]

diffBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [a]
meetBy :: (Ord t) => (a -> t) -> (b -> t)  -> [a] -> [b] -> [(a, b)]
joinBy :: (Ord t) => (a -> t) -> (a->a->a) -> [a] -> [a] -> [a]

diffBy f g xs ys = catThis        . map snd . unzl 
                    $ align (ZL [(f x,x) | x<-xs]) (ZL [(g y,y) | y<-ys])
meetBy f g xs ys = catThese       . map snd . unzl 
                    $ align (ZL [(f x,x) | x<-xs]) (ZL [(g y,y) | y<-ys])
joinBy f h xs ys = map (mergeThese h . snd) . unzl 
                    $ align (ZL [(f x,x) | x<-xs]) (ZL [(f y,y) | y<-ys])

使用自定义比较器函数压缩两个列表的标准方法

4 个答案: