Project Euler #14 revisited

Recently I came a cross a link from... aargh, was it Reddit or Hacker News? I should have saved the link. In any case, it was from someone who had decided to try to optimize a solution to Project Euler #14 in Haskell, and it induced me to revisit the program I'd written for it. Euler #14 asks the musical question, which number less than a million takes the longest for its Collatz sequence to get to 1?

{-# LANGUAGE BangPatterns #-}

import Data.List
import Data.Ord

collatzChainLen :: Int -> Int
collatzChainLen n = collatzChainLen' n 1
    where collatzChainLen' n !l
            | n == 1    = l
            | otherwise = collatzChainLen' (collatz n) (l + 1)
          collatz n = if even n then n `div` 2 else 3 * n + 1

pairMap :: (a -> b) -> [a] -> [(a, b)]
pairMap f xs = [(x, f x) | x <- xs]

main :: IO ()
main = print $ fst (maximumBy (comparing snd) (pairMap collatzChainLen [1..999999]))

Compiled with ghc -O2, about 3.5 seconds on my 2.8 GHz Athlon 64.
Of course, the thing to do is memoize collatzChainLen; if you have a sequence {x1, x2, ..., xn = 1, ...} you know the values for all the xi, not just x1.

But the first thing I did was say to myself, "Self, that collatzChainLen function is yet another example of grinding out yet another boilerplate recursive function rather than using the Prelude and getting in the groove of using what the language gives you, combining functisns as Mr. Hughes and Mr. Backus recommend. Surely we can be more Haskelly than that." (I swear I hadn't been watching Looney Tunes.)

The result:

collatzChainLen :: Int -> Int
collatzChainLen n = 1 + (length . takeWhile (/= 1) . collatzChain) n
    where collatzChain = iterate collatz
          collatz n = if even n then n `div` 2 else 3 * n + 1

We've made the acquaintance of iterate before; it generates as much of the endless chain as we wish to grab, which we wish to do until we find a 1 with takewhile (/= 1). length counts how many that is, and while it doesn't affect the comparison, we add 1 rather than fib about what we're calculating. One add shouldn't affect much (or should it? If we're not memoizing, we're doing it rather a lot).

Does that affect the time? It sure does, but not as we'd like. With that version of collatzChainLen, the program takes about 9.7 seconds to run, almost three times as long! What's the difference? "It's profilin' time", as Ben Grimm would say if he learned Haskell. The evidence is pretty clear:

first version:   total alloc = 192,049,472 bytes
second version: total alloc = 21,277,532,952 bytes

Over a hundred times as much RAM, and who knows how much GC? I could believe that's the issue.

The lower-level version of collatzChainLen generates and counts values one at a time... but then, Haskell is lazy, so the higher-level one should, too--but perhaps something about creating a list of those values, or rather a couple, since takeWhile can't destructively chop off the list, but has to copy, is increasing the overhead. I recall that there's supposed to be something called "stream fusion" that gets rid of that overhead; let's look up how that's done.

Oops. seems to be down, but apt-get install is my friend. Looks like there's a Data.Stream library, and you create streams instead of lists. Here goes:

import qualified Data.Stream as S

collatzChainLen :: Int -> Int
collatzChainLen n = 1 + (length . S.takeWhile (/= 1) . collatzChain) n
    where collatzChain = S.iterate collatz
          collatz n = if even n then n `div` 2 else 3 * n + 1

Time? Drat; still around 9.7 seconds. Doesn't seem very fused to me with that time, and a blog post about this very problem had example code that had

import qualified Data.List.Stream as S

and used S.length instead of just length. OK, rummage through synaptic for the libghc packages (sorry, but I've been burned by cabal), and...??? Darned if I can find it, and the blog post dates from August 2013, so it's not all that old.

While we're trying to figure out how to get to Data.List.Stream in Ubuntu, we can do one other thing:

collatz n = case n `divMod` 2 of
                (n', 0) -> n'
                _       -> 3 * n + 1

turns out to make a very respectable difference, cutting runtime down by 1.2 seconds! More news as it happens.

UPDATE: clearly

colllatzChainLen (2 * n) == 1 + collatzChainLen n

so, given a number < 500000, you could double it until it was in [500000, 999999] and have a number with a longer chain, so we can start with 500000.

Results: 4.6 seconds for the "Haskelly" version, 1.2 seconds for the explicitly written out collatzChainLen.

UPDATE: -fllvm is your friend; it took the runtimes down to 3.5 seconds for the higher-order function flavor... and 0.29 seconds for the version with the written-out collatzChainLen.


Popular posts from this blog

a longest path problem

No tutorial, I swear...

Bikeshedding leap years