Saturday, June 22, 2013

In the great tradition of nibbling at the edges...

...and because I am still puzzling over why pairsum is grabbing so much RAM, let's contemplate floorSqrt. We're just calling it 2,000 times, and yet it's collectively taking up over six percent of the CPU time and five percent of allocation?!

Let's remind ourselves of floorSqrt (tweaked because I forgot that I'd memoized powers of two)

floorSqrt :: Integer -> Integer

floorSqrt 0 = 0
floorSqrt n = floorSqrt' (twoToThe ((1 + bitsIn n) `div` 2))
    where floorSqrt' x =
              let y = (x + n `div` x) `div` 2
              in if y >= x then x else floorSqrt' y

This is a special case of the Newton-Raphson method that was known to the ancient Greeks. We treat zero specially to avoid one of the gotchas of Newton-Raphson, but aside from that, this particular example is well-behaved, and has what they call quadratic convergence: once you're sufficiently close, each successive guess is good to twice as many digits as the one before. Our first guess is good to one bit, so the next is good to two bits, then four, then eight, etc. We're dealing with Xs in ranges up to a googol, or around 2330, which would me no more than nine iterations per value. Checking the profiling output, we see a total of 7,175 + 7,472 = 14,647 calls to floorSqrt', which is within our constraint, but darned close to it, so the inputs are tending to the high end of the range.

I kind of hate to cheat, but let's see what happens with

floorSqrt n = floorSqrt' $ (floor. sqrt . fromIntegral) n

[pause to compile, run, check profile output, and check... uh-oh.]

What happens is that it doesn't give the correct results; the output is not correct, which is weird. You'd think that would give you 52 good bits, and thus just need three iterations, but then, look at that termination condition. There has to be something about it that needs that first trial value to make it work, so you can't just plug in another, even better, guess. Taking the first value from the large input, the floorSqrt we've been using returns


while using the supposedly better first guess gives us


We could change the function to take ceil log2 (# bits in n / 52) iterations, but I think I should get back to dealing with memory usage and pairsum. (Even with that change you probably would end up flipping a coin for whether you got the floor or the ceiling of the square root of n.)

Thursday, June 20, 2013

Now that we have better data...

...let's see what's happening.

First, the single sample point faked us out, making us think we were using less RAM than we really were. We are chewing up pretty nearly ten megabytes--still less than that C program, but a bit disappointing.

Or maybe more disappointing than we thought. Here's the output describing heap usage from a run with the options +RTS -sstderr:

     202,392,800 bytes allocated in the heap
      84,601,736 bytes copied during GC
      12,550,176 bytes maximum residency (8 sample(s))
         220,824 bytes maximum slop
              30 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       387 colls,     0 par    0.10s    0.10s     0.0003s    0.0009s
  Gen  1         8 colls,     0 par    0.06s    0.06s     0.0080s    0.0230s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.19s  (  0.20s elapsed)
  GC      time    0.16s  (  0.16s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.36s  (  0.36s elapsed)

  %GC     time      45.2%  (45.3% elapsed)

  Alloc rate    1,028,021,469 bytes per MUT second

  Productivity  54.8% of total user, 55.0% of total elapsed

Whoa. Nearly half our time spent garbage collecting? There's got to be a way to improve that. (And on the graph, nearly two megabytes taken by pairsum?)

Learning to avoid cabal--or not

Well... last night I thought I'd install EclipseFP, a package for Eclipse to support Haskell development. When you fire it up, it goes looking for packages it wants, and apparently uses cabal to install them. It did; I watched it do so for some time.

It turned out to be a waste of time; when I fired up Eclipse (which I'm rather new to) and clicked on the little lambda over to the left, a window opened up that looked half-drawn and very broken. I suspect that was issues with Eclipse--perhaps I should wipe the latest version that I grabbed and installed, and settle for the ancient version that, for some reason, is what Ubuntu has in its repositories.

OK, so I'll pass on an IDE for Haskell for now, or start up with leksah.

This morning, I had one of those sudden realizations that you get that make you laugh at yourself. Why do those memory usage graphs look like pyramids? Because the default sample interval is 0.1 seconds, and I have the run time down around 0.2 seconds, sort of like taking a sample every couple of minutes and expecting to get an accurate playback of a song.

A peek at Real World Haskell and I see the option to change the interval, so I recompile and run... and it claims that I should link with an option to turn on the RTS capabilities.


Another compile or two to make sure I did indeed specify the right options... and then some Googling, because the same thing still happened.

From the Google results, I suspect that all those cabal installs pulled in versions of libraries that don't support profiling, and that ghc is pulling them in. Following people's advice, I rm -rf ~/.ghc. No luck, still can't profile.

I have learned one thing: I am not going to let cabal touch my computer ever again. I would have sworn that I specified that packages should just be installed for me, so the rm should have done the trick. I guess I can look for Haskell libraries dated yesterday and delete every single one of them.

UPDATE: BZZT! Turns out the issue comes from trying to use that -i option to override the heap measurement interval. That's what's giving me the problem. I was wrong... and maybe I'll consider using cabal... sometime when I know Haskell much better than I do now.

UPDATE: Found it... since ghc 6.x, you have to compile with -rtsopts=all to be able to use some (OK, most) RTS options, lest the logging they permit be used to breach security. I will have to try to figure out how -i would give you the opportunity to do so over and above the output that -h*, which doesn't require -rtsopts=all, allows. (OK, maybe it would be possible to allocate a lot of RAM or not every [interval], a lot of RAM means a 1, not much means 0.)


Back in the early days of Unix, when the PDP--11/70's main advantage over the later 8/16-bit 6809 was having separate I/D (instruction and data) space, so that you could have 64K of code with access to 64K of data, the virtue of the Unix Way of small programs that did one thing and did it well was a necessity. One of the things it gave rise to was a separate program, "lint", to check C source code for constructs that might be evidence of a coding error, so that the compiler could concentrate on simply generating code.

Nowadays, C compilers often do some of the checking that was once delegated to lint (though separate lint programs still exist, and are very good--e.g. splint, or Gimpel Software's excellent products).

What about Haskell? Well, for Haskell there's hlint. It will give you advice on the Haskell source that you feed it.

jejones@eeyore:~/src/haskell_play$ hlint ultimatepalindrome14.hs
ultimatepalindrome14.hs:236:13: Warning: Use zipWith
  map showsResult $ zip [1 ..] (map process (tail $ B.lines s))
Why not:
  zipWith (curry showsResult) [1 ..] (map process (tail $ B.lines s))

ultimatepalindrome14.hs:237:5: Error: Use .
  mapM putStr $ map ($ "\n") r
Why not:
  mapM (putStr . ($ "\n")) r

2 suggestions

So there are (at least!) a couple of places I could have written arguably better, easier to read, more idiomatic Haskell. Pretty cool.

What's with that "."? In that context, "." is the function composition operator; given two functions, one of which returns values of the type the other takes as input, composing them gives you a function that applies first one function, then the other. Huh? Better to write it in Haskell:

. :: (a -> b) -> (c -> a) -> (c -> b)

f . g  x = f (g x)

or, equivalently,

f . g = \x -> f (g x)

\ isn't a character escape; it's the closest the creators of Haskell could come to  the Greek letter lambda, as in Alonzo Church's "lambda calculus". A lambda expression is an "anonymous function"; the way to read the right hand side of that last line is "the function that, given an argument x, returns f (g x)".

(Ironically, the lambda was in turn a sort of best approximation; it was inspired by the use of the circumflex in the notation of Russell and Whitehead's Principia Mathematica. Details here.)

Anyway... there's also a Haskell "style scanner" that one can either just get suggestions from or use in the fashion of indent. I will have to check it out.

Wednesday, June 19, 2013

It's not just an idiom...

In this program I've taken a list of values and fed it to

zip [1..]

You'll recall that zip takes two lists and returns a new list as long as the shortest of the lists handed to it. Each element of the new list is a pair of values at corresponding positions in the lists, so that, for example,

zip [1..] "hiya" == [(1, 'h'), (2, 'i'), (3, 'y'), (4, 'a')]

Remind you of anything you learned to do as a kid? Yes, it's constructing an bijection from a set (here represented as a list) to the first however many counting numbers, aka counting. Little did I know back then that I was preparing for Haskell.

Tuesday, June 18, 2013

Further restructuring

It occurred to me that I could push the code even further, making the connection between counting and generation more apparent. Recall the counting code:

numNDigitYs 1 = 3
numNDigitYs n = numTwoTwos n + numOneTwos n + numNoTwos n
    where numTwoTwos n = if even n then 1 else 2
          numOneTwos n = if even n then 0 else n `div` 2
          numNoTwos  n = if even n then s else 2 * s
                         where h = n `div` 2 - 1
                               s = sum [h `choose` i | i <- [0..min h 3]]

Here's the new generation (which sounds like some cheesy 60s thing--sorry!); we've gotten rid of justOnes.

nDigitYs 1 = [1,2,3]
nDigitYs n = sort (noTwos n ++ oneTwos n ++ twoTwos n)
    where halfN  = n `div` 2
          pair i = tenToThe i + tenToThe (n - (i + 1))
          twoTwos n
              | even n    = [twoTwosNoOnes]
              | otherwise = [twoTwosNoOnes, twoTwosNoOnes + tenToThe halfN]
              where twoTwosNoOnes = 2 * tenToThe (n - 1) + 2
          oneTwos n
              | even n    = []
              | otherwise =  map (+ common)
                                 (0 : [pair i | i <- [1..halfN - 1]])
              where common = pair 0 + 2 * tenToThe halfN
          noTwos n
              | even n    = base
              | otherwise = concat [[p, p + tenToThe halfN] | p <- base]
              where pairsum xs = foldl' (+) (pair 0) (map pair xs)
                    base       = map pairsum (noTwosChoices !! (halfN - 1))

noTwosChoices = [concat [n `choices` k | k <- [0..min 3 n]] | n <- [0..]]

We explicitly generate the oneTwos values in a way that makes clear that there are n `div` 2 of them, and it's similarly clear that numNoTwos is correct. Life is beautiful, right?

Well, not quite. We're now eating up ten megabytes of memory instead of six and change; total allocation is up from 127 MB to 140 MB, and total time, according to the profiling output, is up by .03 seconds. Saving lists of lists of Ints will definitely add to the allocation, so we would expect that. We're really beating on pair and pairsum, so that's probably where to look to get resource usage pared (no word play intended) back down.

UPDATE: We didn't memoize pair; doing that took us back down to 0.2 seconds, cutting the difference back down to .01 seconds. Total allocation is back down to 130 MB, but memory is up somewhere between 10 and 11 MB.

UPDATE: I wonder whether I forgot something, perhaps -fllvm. Recompiling and rerunning shows 0.19 seconds run time in profiler output, and just about 8 MB of memory usage (as opposed to total allocation, still around 130 MB).

I hate to do something as crass as packing three values in an Int or maybe an Int32, but it's grating to keep a list of lists of Ints hanging around, too. I'll give it a try. (The results were slower and used more RAM. So much for that.)

Monday, June 17, 2013

Maybe we're saving the wrong thing

Maybe we should keep the lists of combinations around instead of the base values; that would get us out of the divMod business, though OTOH it means hanging on to those lists; as it stands, I bet that they're being garbage collected.

The problem is that the bitmap moves us back to O(n), where n is the number of digits, rather than a fixed amount of work (because there's a fixed upper bound on the number of nonzero digits). We could roll our own bit fields, and stuff up to three bitsIn (n `div` 2) values into an Int or an Integer (actually, Haskell has Int8, Int16, Int32, and Int64 types to choose from; 21 bits a shot would allow for up to two million digits, times two for both halves of the palindrome--that's a lot bigger than the second official data set needs), but that seems kind of cheesy.

I'll have to think about that some more. In the meantime, I restructured the code a little bit to make the code that generates Ys more clearly parallel to the code that just counts them. The counting code:

numNDigitYs :: Int -> Int

numNDigitYs 1 = 3
numNDigitYs n = numTwoTwos n + numOneTwos n + numNoTwos n
    where numTwoTwos n = if even n then 1 else 2
          numOneTwos n = if even n then 0 else n `div` 2
          numNoTwos  n = if even n then s else 2 * s
                         where h = n `div` 2 - 1
                               s = sum [h `choose` i | i <- [0..min h 3]]

The generating code:

nDigitYs :: Int -> [Integer]

nDigitYs 1 = [1,2,3]
nDigitYs n = sort (noTwos n ++ oneTwos n ++ twoTwos n)
    where twoTwos n
              | even n    = [twoTwosNoOnes]
              | otherwise = [twoTwosNoOnes,
                             twoTwosNoOnes + tenToThe (n `div` 2)]
              where twoTwosNoOnes = 2 * tenToThe (n - 1) + 2
          oneTwos n
              | even n    = []
              | otherwise =  map (+ 2 * tenToThe halfN)
                                 (justOnes n (min 1 (halfN - 1)))

              where halfN = n `div` 2
          noTwos n
              | even n    = base
              | otherwise = concat [[p, p + tenToThe halfN] | p <- base]
              where halfN = n `div` 2
                    base = justOnes n (min 3 (halfN - 1))

Yeah, I guess that shows I'm seriously at the point of diminishing returns.

A flashback and analogy

You've probably heard about how the notion of sum types (e.g. Algol 68 union s, Rust enum s, Haskell type s) and product types (e.g. tup...