Question

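Given a list, I want to split it into clusters using a "boundary function". Such a function takes two consecutive elements of the list and decides whether they should belong to the same cluster.

Essentially, I want something like this: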

clusterBy :: (a -> a -> Bool) -> [a] -> [[a]]

ghci> farAway x y = abs (x - y) > 10
ghci> clusterBy farAway [1, 4, 18, 23, 1, 17, 21, 12, 30, 39, 48]
[[1, 4], [18, 23], [1], [17, 21, 12], [30, 39, 48]]

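Searching Hoogle for this type signature only turns up groupBy, which is not what I need (it doesn't take the order of the elements into account).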
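To make the mismatch concrete, here is a minimal sketch of how `Data.List.groupBy` behaves on the example list. `groupBy` tests each element against the first element of the current group rather than against its immediate predecessor, so a run that drifts gradually gets cut short. The names `closeEnough` and `groupedByFirst` are made up for this illustration.

```haskell
import Data.List (groupBy)

-- Hypothetical helper: the negation of farAway from the example above.
closeEnough :: Int -> Int -> Bool
closeEnough x y = abs (x - y) <= 10

-- groupBy tests every element against the FIRST element of the current
-- group, not against the element immediately before it.
groupedByFirst :: [[Int]]
groupedByFirst = groupBy closeEnough [1, 4, 18, 23, 1, 17, 21, 12, 30, 39, 48]
-- 48 is compared with 30 (distance 18), not with 39 (distance 9),
-- so it is split off into its own group:
-- [[1,4],[18,23],[1],[17,21,12],[30,39],[48]]
```

The desired result keeps 30, 39 and 48 together, because each one is within 10 of its neighbour.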

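Is there any library function that does something similar? Or, alternatively: how can it be implemented cleanly, i.e. without using a tail-recursive loop?

Edit: for reference, the implementation I managed to come up with is as follows: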

clusterBy :: (a -> a -> Bool) -> [a] -> [[a]]
clusterBy isBoundary list =
    loop list [] []
    where
        -- Both the current cluster and the result are accumulated in reverse.
        loop (first:rest) [] result = loop rest [first] result
        loop list@(next:rest) cluster result =
            if isBoundary (head cluster) next then
                loop list [] ((reverse cluster):result)
            else
                loop rest (next:cluster) result
        loop [] cluster@(_:_) result = reverse ((reverse cluster):result)
        loop _ _ result = reverse result

Answer 1

Is this what you want?

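clusterBy :: (a -> a -> Bool) -> [a] -> [[a]]
clusterBy isDifferentCluster = foldr f []
  where
    -- Prepend x to the leading cluster unless a boundary separates x from its head.
    f x (ys@(y : _) : yss) | not (isDifferentCluster x y) = (x : ys) : yss
    -- Otherwise, or when no cluster exists yet, x starts a new cluster.
    f x yss                                               = [x] : yss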

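The sense of the function argument should arguably be inverted (i.e., it should return true when its arguments belong in the same cluster and false otherwise), but I've given it in the form you asked for.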
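As a rough sketch of that inverted form, the same fold can take a "same cluster" predicate, and the boundary-style function can then be recovered by negating the predicate. The name `groupAdjacentBy` is hypothetical and chosen only for this illustration; it is not a library function.

```haskell
-- Hypothetical name: takes a "same cluster" predicate instead of a
-- boundary predicate.
groupAdjacentBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupAdjacentBy sameCluster = foldr step []
  where
    -- Prepend x to the leading cluster when x belongs with that
    -- cluster's first element (its right-hand neighbour in the input).
    step x (ys@(y : _) : yss) | sameCluster x y = (x : ys) : yss
    -- Otherwise, or when there are no clusters yet, start a new cluster.
    step x yss = [x] : yss

-- The boundary-style function from the question, obtained by negating
-- the predicate.
clusterBy :: (a -> a -> Bool) -> [a] -> [[a]]
clusterBy isBoundary = groupAdjacentBy (\x y -> not (isBoundary x y))
```

With these definitions, `clusterBy (\x y -> abs (x - y) > 10) [1, 4, 18, 23, 1, 17, 21, 12, 30, 39, 48]` evaluates to `[[1,4],[18,23],[1],[17,21,12],[30,39,48]]`, and `groupAdjacentBy (<=) [1, 2, 3, 2, 5]` splits the list into its ascending runs, `[[1,2,3],[2,5]]`. Because the fold pattern-matches on the clusters built from the rest of the list, this version is meant for finite lists.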

Clustering a list using a boundary function
