Question

Here is a quicksort implementation that is often described as "not really quick":

qSort :: Ord a => [a] -> [a]
qSort [] = []
qSort (p:xs) = (qSort $ filter (< p) xs) ++ (qSort $ filter (== p) xs) ++ [p] ++ (qSort $ filter (> p) xs)

main = print $ qSort [15, 11, 9, 25, -3]

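Still, is it possible to count the number of comparisons it needs to finish? I tried counting the sizes of filter (< p) xs and filter (> p) xs, but that turned out not to be what I need.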

Update:

The question is not about time complexity but about counting the number of comparisons.

Answer 1

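As others have said, the direct translation is to modify your algorithm to use a monad that counts the comparisons. Instead of State I would rather use Writer, because it describes more naturally what is going on: each result is augmented with the (additive) number of comparisons it required: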

import Control.Applicative
import Control.Monad
import Control.Monad.Writer.Strict
import Data.Monoid

type CountM a = Writer (Sum Int) a

Then let's define a function that wraps a pure value into a monadic value that increments the counter:

count :: Int -> a -> CountM a
count c = (<$ tell (Sum c))

Now we can define

qSortM :: Ord a => [a] -> CountM [a]
qSortM [] = return []
qSortM (p:xs) =
   concatM [ qSortM =<< filtM (<  p)
           , filtM (== p)
           , return [p]
           , qSortM =<< filtM (>  p)
           ]
  where
    filtM p = filterM (count 1 . p) xs
    concatM :: (Monad m) => [m [a]] -> m [a]
    concatM = liftM concat . sequence

It is not as nice as the original, but it is still usable.
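To read the count back out, the Writer computation has to be run. The following is a minimal, self-contained sketch of one way to do that. It repeats the definitions above with explicit imports (using `concat <$> sequence` in place of `concatM`), and `sortWithCount` is a hypothetical helper name that does not appear in the answer.

```haskell
import Control.Monad (filterM)
import Control.Monad.Writer.Strict (Writer, runWriter, tell)
import Data.Monoid (Sum (..))

type CountM a = Writer (Sum Int) a

-- Wrap a pure value, adding c to the comparison counter.
count :: Int -> a -> CountM a
count c x = x <$ tell (Sum c)

-- Three-filter quicksort as above; each filter pass logs one
-- comparison per element of xs.
qSortM :: Ord a => [a] -> CountM [a]
qSortM [] = return []
qSortM (p:xs) =
    concat <$> sequence
        [ qSortM =<< filtM (< p)
        , filtM (== p)
        , return [p]
        , qSortM =<< filtM (> p)
        ]
  where
    filtM q = filterM (count 1 . q) xs

-- Hypothetical helper (not from the answer): run the Writer and
-- unwrap the Sum to get (sorted list, number of comparisons).
sortWithCount :: Ord a => [a] -> ([a], Int)
sortWithCount xs = (sorted, n)
  where
    (sorted, Sum n) = runWriter (qSortM xs)
```

Under these definitions, `sortWithCount [15, 11, 9, 25, -3]` evaluates to `([-3,9,11,15,25],21)`: 12 comparisons at the top level, then 6 and 3 in the recursive calls on the smaller elements.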

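Note that you compare each element of the list three times, whereas once would be enough. This has another unfortunate consequence: the original list must be kept in memory until all three filters have finished. So let's define instead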

-- We don't care that this reverses the order of elements in the buckets;
-- they get sorted later anyway.
split :: (Ord a) => a -> [a] -> ([a], [a], [a], Int)
split p = foldl f ([], [], [], 0)
  where
    f (ls, es, gs, c) x = case compare x p of
        LT  -> (x : ls, es, gs, c')
        EQ  -> (ls, x : es, gs, c')
        GT  -> (ls, es, x : gs, c')
      where
        c' = c `seq` c + 1

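This splits the list into the three buckets and computes its length at the same time, so we can update the counter in a single step. The list is consumed right away and can be discarded by the garbage collector.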

Now our quicksort becomes more compact

qSortM :: Ord a => [a] -> CountM [a]
qSortM [] = return []
qSortM (p:xs) = count c =<<
    concatM [ qSortM ls
            , return (p : es)
            , qSortM gs
            ]
  where
    (ls, es, gs, c) = split p xs
    concatM = liftM concat . sequence

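We could achieve the same result without Writer, simply by having qSortM return (Int, [a]) explicitly. But then we would have to process the results of the recursive qSortM calls by hand, which would be messier. Moreover, the monadic approach lets us add more information later, such as the maximum depth, without breaking the core part in any way.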
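For comparison, here is a rough sketch of what the explicit (Int, [a]) variant might look like. The names `partition3` and `qSortCount` are made up for this illustration, and the partition does not carry the counter itself; the count is taken from `length xs` instead, which is the same number.

```haskell
import Data.List (foldl')

-- One-pass three-way partition around the pivot p.
-- Bucket order is reversed, which is harmless: ls and gs are sorted
-- afterwards, and es holds only elements equal to p.
partition3 :: Ord a => a -> [a] -> ([a], [a], [a])
partition3 p = foldl' step ([], [], [])
  where
    step (ls, es, gs) x = case compare x p of
        LT -> (x : ls, es, gs)
        EQ -> (ls, x : es, gs)
        GT -> (ls, es, x : gs)

-- Quicksort returning (number of comparisons, sorted list).
-- The counts of the two recursive calls are added by hand, which is
-- the bookkeeping that Writer otherwise hides.
qSortCount :: Ord a => [a] -> (Int, [a])
qSortCount [] = (0, [])
qSortCount (p:xs) = (length xs + cl + cg, sortedL ++ p : es ++ sortedG)
  where
    (ls, es, gs)  = partition3 p xs
    (cl, sortedL) = qSortCount ls
    (cg, sortedG) = qSortCount gs
```

With this sketch, `qSortCount [15, 11, 9, 25, -3]` gives `(7,[-3,9,11,15,25])`, one third of what the three-filter version needs, because each element is compared with the pivot only once.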

Answer 2

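Thinking about the problem theoretically, as Monad Newb suggests in his answer, is probably the best way to understand what the algorithm actually does.

Of course, the dumb way is to force the comparisons into the State monad. This simply brute-force counts every call to the comparison function. Note the use of count, which turns a predicate into an action that records each invocation of that predicate and then applies it to its argument.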

{-# LANGUAGE ScopedTypeVariables #-}
import Control.Monad.State

qSortM :: Ord a => [a] -> State Int [a]
qSortM [] = return []
qSortM (p:xs) = do h <- (qSortM =<< filterM (count (< p)) xs)
                   e <- (qSortM =<< filterM (count (== p)) xs)
                   t <- (qSortM =<< filterM (count (> p)) xs)
                   return $ h ++ e ++ [p] ++ t
               where count :: (a -> Bool) -> (a -> State Int Bool)
                     count p a = modify (+1) >> return (p a)

qSort :: Ord a => [a] -> ([a],Int)
qSort l = runState (qSortM l) 0

main :: IO ()
main = print $ (qSort [15, 11, 9, 25, -3])

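This is actually horrible Haskell; the same thing can be expressed without the State monad, using nothing but a recursive function. Writing it that way would be a good exercise. Admittedly, the State monad version will feel more intuitive to someone coming from an imperative background.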

> qSort [10,9..1]
([1,2,3,4,5,6,7,8,9,10],135)

> qSort [15,11,9,25,-3]
([-3,9,11,15,25],21)

Answer 3

Let's evaluate the expression qSort [15, 11, 9, 25, -3]:

qSort [15, 11, 9, 25, -3]
    = qSort (15:[11, 9, 25, -3])
    = (qSort $ filter (< 15) [11, 9, 25, -3])
      ++ (qSort $ filter (== 15) [11, 9, 25, -3])
      ++ [15]
      ++ (qSort $ filter (> 15) [11, 9, 25, -3])
    = (qSort [11, 9, -3]) ++ (qSort []) ++ [15] ++ (qSort [25])

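To get this far we performed 12 comparisons: each of the three filters compares all four remaining elements with the pivot. You can evaluate the remaining applications of qSort in the same way.

Note that this substitution is valid because we are doing so-called pure functional programming here. There are no side effects, so any expression can be replaced by an equivalent one.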
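The hand evaluation above can also be turned into a small function that mirrors the structure of qSort and only adds up the counts. This is just a sketch; the name `comparisons` is chosen here for illustration.

```haskell
-- Number of element comparisons the three-filter qSort performs.
-- Each of the three filters compares every element of xs with p once,
-- and qSort then recurses into the smaller, equal and larger elements.
comparisons :: Ord a => [a] -> Int
comparisons [] = 0
comparisons (p:xs) =
    3 * length xs
        + comparisons (filter (< p) xs)
        + comparisons (filter (== p) xs)
        + comparisons (filter (> p) xs)
```

For the list above, `comparisons [15, 11, 9, 25, -3]` evaluates to 21: the 12 comparisons of the first step, plus 6 for qSort [11, 9, -3] and 3 for qSort [9, -3].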

Answer 4

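We can write a function to compute this number. Each filter p xs performs length xs comparisons, and that is all there is to it.

As written, your code makes 3 passes to do the partitioning; you could rewrite it to do the three-way partitioning in a single pass. We can make the number of passes a parameter.

Another thing is the unnecessary sorting of the middle part, which is known to contain only equal elements. We can make that a parameter as well, indicating whether this unnecessary sorting is performed.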

{-# LANGUAGE BangPatterns #-}

_NoSort = False
_DoSort = True

qsCount xs n midSortP = qs 0 xs id     -- return number of comparisons that 
 where                                 -- n-pass `qSort xs` would perform
   qs !i []     k = k i
   qs !i (p:xs) k = 
     let (a,b,c) = (filter (< p) xs, filter (== p) xs, filter (> p) xs)
     in qs (i+n*length xs) a (\i-> g i b (\i-> qs i c k))
   g i b k | midSortP  = qs i b k 
           | otherwise = k i

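As can be seen, 3 passes take 3 times as many comparisons as 1 pass, and sorting the middle part can only make a difference when the list contains more than two equal elements: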

*Main> qsCount ( concat $ replicate 4 [10,9..1]) 3 _NoSort
630
*Main> qsCount ( concat $ replicate 4 [10,9..1]) 3 _DoSort
720
*Main> qsCount ( concat $ replicate 4 [10,9..1]) 1 _NoSort
210
*Main> qsCount ( concat $ replicate 4 [10,9..1]) 1 _DoSort
240
*Main> qsCount [5,3,8,4,10,1,6,2,7,9] 1 _NoSort
19
*Main> qsCount [5,3,8,4,10,1,6,2,7,9] 1 _DoSort
19
*Main> qsCount (replicate 10 1) 1 _NoSort
9
*Main> qsCount (replicate 10 1) 1 _DoSort
45
*Main> qsCount [15, 11, 9, 25, -3] 3 _DoSort
21
