Question

I ran into the following behavior:

ghci> :m +Data.List

ghci> groupBy (\x y -> succ x == y) [1..6]
[[1,2],[3,4],[5,6]]
ghci> groupBy (\x y -> succ x /= y) [1..6]
[[1],[2],[3],[4],[5],[6]]

ghci> :m +Data.Char   -- just a test to verify that nothing is broken with my ghc

ghci> groupBy (const isAlphaNum) "split this"
["split"," this"]

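This surprised me. Based on the sketch below, I thought that groupBy splits a list whenever the predicate evaluates to True for two successive elements supplied to it. But in my second example above it splits the list at every element, even though the predicate should evaluate to False. Here is my assumption of how it works, written in Haskell, so that everybody understands how I believed it to work: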

groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy p l@(x:y:ys)
  | p x y     = (x:y:(head l)) ++ (tail l)       -- assuming l has a tail, unwilling
  | otherwise = [x] ++ (y:(head l)) ++ (tail l)  -- to debug this right now, I guess you
groupBy _ [x] = [[x]]                            -- already got the hang of it ;)

This led me to the conclusion that it works somewhat differently. So my question is: how does this function actually work?

Answer 1

"But in my second example above it splits the list at every element, even though the predicate should evaluate to False."

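In every example, including that one, groupBy applies the predicate to pairs of elements, but not necessarily to adjacent ones. Take the const isAlphaNum example, where the predicate has the type: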

const isAlphaNum :: b -> Char -> Bool

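So the function is called with the first element of the current group and the element under consideration, but only the second argument is taken into account.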
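As a small illustration of that last point, the following sketch applies const isAlphaNum to a few hand-picked argument pairs. The name ignoresFirst is made up for this example; const and isAlphaNum are the standard functions from the Prelude and Data.Char.

```haskell
import Data.Char (isAlphaNum)

-- const isAlphaNum discards its first argument and applies
-- isAlphaNum to the second one.
ignoresFirst :: [Bool]
ignoresFirst =
  [ const isAlphaNum 's' 'p'   -- True:  'p' is alphanumeric
  , const isAlphaNum ' ' 'p'   -- True:  the first argument is irrelevant
  , const isAlphaNum 's' ' '   -- False: ' ' is not alphanumeric
  ]
```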

So if we call it as groupBy (const isAlphaNum) "split this", it will evaluate:

pair    2nd    const isAlphaNum
-------------------------------
"sp"    'p'    True
"sl"    'l'    True
"si"    'i'    True
"st"    't'    True
"s "    ' '    False
" t"    't'    True
" h"    'h'    True
" i"    'i'    True
" s"    's'    True

Each time const isAlphaNum is True, the character is appended to the current group. When we evaluate the pair "s ", const isAlphaNum is False, so groupBy starts a new group.

We therefore end up with two groups here, since there is only one False.
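The table can also be reproduced programmatically. The sketch below assumes that groupBy compares every element with the first element of the current group, as described above. The helper comparisons and the value splitThis are hypothetical names, not part of Data.List; the helper only records which pairs the predicate would be applied to.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical helper: lists every (group head, element, result) triple
-- that the predicate is applied to while grouping.
comparisons :: (a -> a -> Bool) -> [a] -> [(a, a, Bool)]
comparisons _  []     = []
comparisons eq (x:xs) = tested ++ comparisons eq rest
  where
    (accepted, rest) = span (eq x) xs
    -- span tests every accepted element plus the first rejected one, if any
    tested = [ (x, y, eq x y) | y <- accepted ++ take 1 rest ]

-- The nine rows of the table above.
splitThis :: [(Char, Char, Bool)]
splitThis = comparisons (const isAlphaNum) "split this"
```

Applied to the first example, comparisons (\x y -> succ x == y) [1..6 :: Int] should give [(1,2,True),(1,3,False),(3,4,True),(3,5,False),(5,6,True)], which shows that 3 is compared with 1 and not with 2.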

We can also obtain this result by analyzing the source code of the function:

groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []     =  []
groupBy eq (x:xs) =  (x:ys) : groupBy eq zs
    where (ys,zs) = span (eq x) xs

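So groupBy returns the empty list if the given list is empty. If the list is non-empty, (x:xs), we construct a new group. The group starts with x and also contains all the elements of ys, where ys is the first component of the 2-tuple constructed by span.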

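span :: (a -> Bool) -> [a] -> ([a],[a]) constructs a 2-tuple whose first component is the longest prefix of the list in which every element satisfies the predicate. Here the predicate is eq x, so we keep adding elements y to the group as long as eq x y holds.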
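To make the behavior of span concrete, here is a minimal sketch with two hand-picked inputs. The names spanExample and spanSucc are invented for this illustration; the second one plays the role of eq x for x = 1 and eq = \a b -> succ a == b from the question.

```haskell
-- span splits a list at the first element that fails the predicate.
spanExample :: (String, String)
spanExample = span (/= ' ') "split this"
-- ("split"," this")

-- The predicate eq x with x = 1 and eq = \a b -> succ a == b:
-- only 2 is accepted, so the group headed by 1 becomes [1,2].
spanSucc :: ([Int], [Int])
spanSucc = span (\y -> succ 1 == y) [2 .. 6]
-- ([2],[3,4,5,6])
```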

The remainder of the list, zs, is then used to construct the next group in the same way, until the input is completely exhausted.
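For contrast, the behavior the question expected, comparing each element with its immediate predecessor, could be sketched as follows. groupAdjacentBy is a hypothetical function written for this answer and is not part of Data.List; standard and adjacent are likewise illustrative names.

```haskell
import Data.List (groupBy)

-- Hypothetical variant (not in Data.List): compares each element with
-- its immediate predecessor instead of with the head of the group.
groupAdjacentBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupAdjacentBy _ []     = []
groupAdjacentBy p (x:xs) = go x [x] xs
  where
    -- prev is the most recent element; acc is the current group, reversed
    go _    acc []     = [reverse acc]
    go prev acc (y:ys)
      | p prev y  = go y (y : acc) ys
      | otherwise = reverse acc : go y [y] ys

-- Data.List.groupBy compares against the group head: [[1,2],[3,4],[5,6]]
standard :: [[Int]]
standard = groupBy (\x y -> succ x == y) [1 .. 6]

-- The adjacent variant keeps the whole run together: [[1,2,3,4,5,6]]
adjacent :: [[Int]]
adjacent = groupAdjacentBy (\x y -> succ x == y) [1 .. 6]
```

The two functions agree whenever the predicate is an equivalence relation, such as (==), which is the use case groupBy is designed for.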
