Tuesday, June 18, 2013

Further restructuring

It occurred to me that I could push the code even further, making the connection between counting and generation more apparent. Recall the counting code:

numNDigitYs 1 = 3
numNDigitYs n = numTwoTwos n + numOneTwos n + numNoTwos n
    where numTwoTwos n = if even n then 1 else 2
          numOneTwos n = if even n then 0 else n `div` 2
          numNoTwos  n = if even n then s else 2 * s
                         where h = n `div` 2 - 1
                               s = sum [h `choose` i | i <- [0..min h 3]]


Here's the new generation (which sounds like some cheesy 60s thing--sorry!); we've gotten rid of justOnes.

nDigitYs 1 = [1,2,3]
nDigitYs n = sort (noTwos n ++ oneTwos n ++ twoTwos n)
    where halfN  = n `div` 2
          pair i = tenToThe i + tenToThe (n - (i + 1))
          twoTwos n
              | even n    = [twoTwosNoOnes]
              | otherwise = [twoTwosNoOnes, twoTwosNoOnes + tenToThe halfN]
              where twoTwosNoOnes = 2 * tenToThe (n - 1) + 2
          oneTwos n
              | even n    = []
              | otherwise =  map (+ common)
                                 (0 : [pair i | i <- [1..halfN - 1]])
              where common = pair 0 + 2 * tenToThe halfN
          noTwos n
              | even n    = base
              | otherwise = concat [[p, p + tenToThe halfN] | p <- base]
              where pairsum xs = foldl' (+) (pair 0) (map pair xs)
                    base       = map pairsum (noTwosChoices !! (halfN - 1))


noTwosChoices = [concat [n `choices` k | k <- [0..min 3 n]] | n <- [0..]]

We explicitly generate the oneTwos values in a way that makes clear that there are n `div` 2 of them, and it's similarly clear that numNoTwos is correct. Life is beautiful, right?

Well, not quite. We're now eating up ten megabytes of memory instead of six and change; total allocation is up from 127 MB to 140 MB, and total time, according to the profiling output, is up by .03 seconds. Saving lists of lists of Ints will definitely add to the allocation, so we would expect that. We're really beating on pair and pairsum, so that's probably where to look to get resource usage pared (no word play intended) back down.

UPDATE: We didn't memoize pair; doing that took us back down to 0.2 seconds, cutting the difference back down to .01 seconds. Total allocation is back down to 130 MB, but memory is up somewhere between 10 and 11 MB.

UPDATE: I wonder whether I forgot something, perhaps -fllvm. Recompiling and rerunning shows 0.19 seconds run time in profiler output, and just about 8 MB of memory usage (as opposed to total allocation, still around 130 MB).

I hate to do something as crass as packing three values in an Int or maybe an Int32, but it's grating to keep a list of lists of Ints hanging around, too. I'll give it a try. (The results were slower and used more RAM. So much for that.)

No comments:

Riddler Classic, May 23, 2020—Holy Mackerel!

Another one using Peter Norvig's word list . It turns out that the word "mackerel" has a curious property: there is exactly ...