Why is the minimalist, example Haskell quicksort not a "true" quicksort?

Haskell's website introduces a very attractive 5-line quicksort function, as seen below.

quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
        lesser = filter (< p) xs
        greater = filter (>= p) xs

They also include a "True quicksort in C".

// To sort array a[] of size n: qsort(a,0,n-1)

void qsort(int a[], int lo, int hi) 
  int h, l, p, t;

  if (lo < hi) {
    l = lo;
    h = hi;
    p = a[hi];

    do {
      while ((l < h) && (a[l] <= p)) 
          l = l+1;
      while ((h > l) && (a[h] >= p))
          h = h-1;
      if (l < h) {
          t = a[l];
          a[l] = a[h];
          a[h] = t;
    } while (l < h);

    a[hi] = a[l];
    a[l] = p;

    qsort( a, lo, l-1 );
    qsort( a, l+1, hi );

A link below the C version directs to a page that states 'The quicksort quoted in Introduction isn't the "real" quicksort and doesn't scale for longer lists like the c code does.'

Why is the above Haskell function not a true quicksort? How does it fail to scale for longer lists?


The true quicksort has two beautiful aspects:

  1. Divide and conquer: break the problem into two smaller problems.
  2. Partition the elements in-place.

The short Haskell example demonstrates (1), but not (2). How (2) is done may not be obvious if you don't already know the technique!

True inplace quicksort in Haskell:

import qualified Data.Vector.Generic as V 
import qualified Data.Vector.Generic.Mutable as M 

qsort :: (V.Vector v a, Ord a) => v a -> v a
qsort = V.modify go where
    go xs | M.length xs < 2 = return ()
          | otherwise = do
            p <- M.read xs (M.length xs `div` 2)
            j <- M.unstablePartition (< p) xs
            let (l, pr) = M.splitAt j xs 
            k <- M.unstablePartition (== p) pr
            go l; go $ M.drop k pr

Here is a transliteration of the "true" quicksort C code into Haskell. Brace yourself.

import Control.Monad
import Data.Array.IO
import Data.IORef

qsort :: IOUArray Int Int -> Int -> Int -> IO ()
qsort a lo hi = do
  (h,l,p,t) <- liftM4 (,,,) z z z z

  when (lo < hi) $ do
    l .= lo
    h .= hi
    p .=. (a!hi)

    doWhile (get l .< get h) $ do
      while ((get l .< get h) .&& ((a.!l) .<= get p)) $ do
        modifyIORef l succ
      while ((get h .> get l) .&& ((a.!h) .>= get p)) $ do
        modifyIORef h pred
      b <- get l .< get h
      when b $ do
        t .=. (a.!l)
        lVal <- get l
        hVal <- get h
        writeArray a lVal =<< a!hVal
        writeArray a hVal =<< get t

    lVal <- get l
    writeArray a hi =<< a!lVal
    writeArray a lVal =<< get p

    hi' <- fmap pred (get l)
    qsort a lo hi'
    lo' <- fmap succ (get l)
    qsort a lo' hi

That was fun, wasn't it? I actually cut out this large let at the beginning, as well as the where at the end of the function, defining all of the helpers to make the preceding code somewhat pretty.

  let z :: IO (IORef Int)
      z = newIORef 0
      (.=) = writeIORef
      ref .=. action = do v <- action; ref .= v
      (!) = readArray
      (.!) a ref = readArray a =<< get ref
      get = readIORef
      (.<) = liftM2 (<)
      (.>) = liftM2 (>)
      (.<=) = liftM2 (<=)
      (.>=) = liftM2 (>=)
      (.&&) = liftM2 (&&)
  -- ...
  where doWhile cond foo = do
          b <- cond
          when b $ doWhile cond foo
        while cond foo = do
          b <- cond
          when b $ foo >> while cond foo

And here, a dumb test to see if it works.

main = do
    a <- (newListArray (0,9) [10,9..1]) :: IO (IOUArray Int Int)
    printArr a
    putStrLn "Sorting..."
    qsort a 0 9
    putStrLn "Sorted."
    printArr a
  where printArr a = mapM_ (\x -> print =<< readArray a x) [0..9]

I don't write imperative code very often in Haskell, so I'm sure there are plenty of ways to clean this code up.

So what?

You will notice that the above code is very, very long. The heart of it is about as long as the C code, though each line is often a bit more verbose. This is because C secretly does a lot of nasty things that you might take for granted. For example, a[l] = a[h];. This accesses the mutable variables l and h, and then accesses the mutable array a, and then mutates the mutable array a. Holy mutation, batman! In Haskell, mutation and accessing mutable variables is explicit. The "fake" qsort is attractive for various reasons, but chief among them is it does not use mutation; this self-imposed restriction makes it much easier to understand at a glance.

In my opinion, saying that it's "not a true quicksort" overstates the case. I think it's a valid implementation of the Quicksort algorithm, just not a particularly efficient one.

I think the case this argument tries to make is that the reason why quicksort is commonly used is that it's in-place and fairly cache-friendly as a result. Since you don't have those benefits with Haskell lists, its main raison d'ĂȘtre is gone, and you might as well use merge sort, which guarantees O(n log n), whereas with quicksort you either have to use randomization or complicated partitioning schemes to avoid O(n2) run time in the worst case.

Thanks to lazy evaluation, a Haskell program doesn't (almost can't) do what it looks like it does.

Consider this program:

main = putStrLn (show (quicksort [8, 6, 7, 5, 3, 0, 9]))

In an eager language, first quicksort would run, then show, then putStrLn. A function's arguments are computed before that function starts running.

In Haskell, it's the opposite. The function starts running first. The arguments are only computed when the function actually uses them. And a compound argument, like a list, is computed one piece at a time, as each piece of it is used.

So the first thing that happens in this program is that putStrLn starts running.

GHC's implementation of putStrLn works by copying the characters of the argument String into to an output buffer. But when it enters this loop, show has not run yet. Therefore, when it goes to copy the first character from the string, Haskell evaluates the fraction of the show and quicksort calls needed to compute that character. Then putStrLn moves on to the next character. So the execution of all three functions—putStrLn, show, and quicksort— is interleaved. quicksort executes incrementally, leaving a graph of unevaluated thunks as it goes to remember where it left off.

Now this is wildly different from what you might expect if you're familiar with, you know, any other programming language ever. It's not easy to visualize how quicksort actually behaves in Haskell in terms of memory accesses or even the order of comparisons. If you could only observe the behavior, and not the source code, you would not recognize what it's doing as a quicksort.

For example, the C version of quicksort partitions all the data before the first recursive call. In the Haskell version, the first element of the result will be computed (and could even appear on your screen) before the first partition is finished running—indeed before any work at all is done on greater.

P.S. The Haskell code would be more quicksort-like if it did the same number of comparisons as quicksort; the code as written does twice as many comparisons because lesser and greater are specified to be computed independently, doing two linear scans through the list. Of course it's possible in principle for the compiler to be smart enough to eliminate the extra comparisons; or the code could be changed to use Data.List.partition.

P.P.S. The classic example of Haskell algorithms turning out not to behave how you expected is the sieve of Eratosthenes for computing primes.

I believe that the reason most people say that the pretty Haskell Quicksort isn't a "true" Quicksort is the fact that it isn't in-place - clearly, it can't be when using immutable datatypes. But there is also the objection that it isn't "quick": partly because of the expensive ++, and also because there is a space leak - you hang on to the input list while doing the recursive call on the lesser elements, and in some cases - eg when the list is decreasing - this results in quadratic space usage. (You might say that making it run in linear space is the closest you can get to "in-place" using immutable data.) There are neat solutions to both problems, using accumulating parameters, tupling, and fusion; see S7.6.1 of Richard Bird's Introduction to Functional Programming Using Haskell.

There is no clear definition of what is and what isn't a true quicksort.

They are calling it not a true quicksort, because it doesn't sort in-place:

True quicksort in C sorts in-place

It isn't the idea of mutating elements in-place in purely functional settings. The alternative methods in this thread with mutable arrays lost the spirit of purity.

There are at least two steps to optimize the basic version (which is the most expressive version) of quick-sort.

  1. Optimize the concatenation (++), which is a linear operation, by accumulators:

    qsort xs = qsort' xs []
    qsort' [] r = r
    qsort' [x] r = x:r
    qsort' (x:xs) r = qpart xs [] [] r where
        qpart [] as bs r = qsort' as (x:qsort' bs r)
        qpart (x':xs') as bs r | x' <= x = qpart xs' (x':as) bs r
                               | x' >  x = qpart xs' as (x':bs) r
  2. Optimize to ternary quick sort (3-way partition, mentioned by Bentley and Sedgewick), to handle duplicated elements:

    tsort :: (Ord a) => [a] -> [a]
    tsort [] = []
    tsort (x:xs) = tsort [a | a<-xs, a<x] ++ x:[b | b<-xs, b==x] ++ tsort [c | c<-xs, c>x]
  3. Combine 2 and 3, refer to Richard Bird's book:

    psort xs = concat $ pass xs []
    pass [] xss = xss
    pass (x:xs) xss = step xs [] [x] [] xss where
        step [] as bs cs xss = pass as (bs:pass cs xss)
        step (x':xs') as bs cs xss | x' <  x = step xs' (x':as) bs cs xss
                                   | x' == x = step xs' as (x':bs) cs xss
                                   | x' >  x = step xs' as bs (x':cs) xss

Or alternatively if the duplicated elements are not the majority:

    tqsort xs = tqsort' xs []

    tqsort' []     r = r
    tqsort' (x:xs) r = qpart xs [] [x] [] r where
        qpart [] as bs cs r = tqsort' as (bs ++ tqsort' cs r)
        qpart (x':xs') as bs cs r | x' <  x = qpart xs' (x':as) bs cs r
                                  | x' == x = qpart xs' as (x':bs) cs r
                                  | x' >  x = qpart xs' as bs (x':cs) r

Unfortunately, median-of-three can't be implemented with the same effect, for example:

    qsort [] = []
    qsort [x] = [x]
    qsort [x, y] = [min x y, max x y]
    qsort (x:y:z:rest) = qsort (filter (< m) (s:rest)) ++ [m] ++ qsort (filter (>= m) (l:rest)) where
        xs = [x, y, z]
        [s, m, l] = [minimum xs, median xs, maximum xs] 

because it still performs poorly for the following 4 cases:

  1. [1, 2, 3, 4, ...., n]

  2. [n, n-1, n-2, ..., 1]

  3. [m-1, m-2, ...3, 2, 1, m+1, m+2, ..., n]

  4. [n, 1, n-1, 2, ... ]

All these 4 cases are well handled by imperative median-of-three approach.

Actually, the most suitable sort algorithm for a purely functional setting is still merge-sort, but not quick-sort.

For detail, please visit my ongoing writing at: https://sites.google.com/site/algoxy/dcsort

Ask anybody to write quicksort in Haskell, and you will get essentially the same program--it is obviously quicksort. Here are some advantages and disadvantages:

Pro: It improves on "true" quicksort by being stable, i.e. it preserves sequence order among equal elements.

Pro: It is trivial to generalize to a three-way split (< = >), which avoids quadratic behavior due to some value occurring O(n) times.

Pro: It's easier to read--even if one had to include the definition of filter.

Con: It uses more memory.

Con: It is costly to generalize the pivot choice by further sampling, which could avoid quadratic behavior on certain low-entropy orderings.

Because taking the first element from the list results in very bad runtime. Use median of 3: first, middle, last.

Need Your Help