Saturday, June 15, 2013

Nomenclature, and data structures

First, I should correct my terminology. All computers are binary these days; Setun was a one-shot, and the days of BCD-based computers are long gone.

This is the case of noTwos, and we've already dealt with the one case where a 3 can appear, so all the digits are either zero or one. So there's an obvious correspondence between these decimal palindromes and binary palindromes, pairing palindromes whose printable representations in their respective bases are equal.

Of course, n-digit binary numbers map exactly to the elements of the power set of a set of n elements, so we can represent our generated combinations (vide the function choices) as binary numbers. This is pretty tempting as an alternative to a [Int], but I don't want to iterate over all those bits to see which are set. After all, if we're just doing half of the palindrome, and we know the most significant digit has to be 1, that leaves at most three bits on.
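For what it's worth, the power-set correspondence is easy to state in code. A minimal sketch (subsetFromBits is my name for it, not something from this series' code):

```haskell
import Data.Bits (testBit)

-- The i-th element is in the subset exactly when bit i is set, so an
-- n-bit number picks out one element of the power set of an n-element list.
subsetFromBits :: Int -> [a] -> [a]
subsetFromBits bits xs = [x | (i, x) <- zip [0 ..] xs, testBit bits i]
```

So subsetFromBits 5 "abc" is "ac": bits 0 and 2 of 5 (binary 101) are set.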

Maybe we can use a trick: consider a number n of a type in the Bits class. What can we say about

n .&. (n - 1)?

(The Haskell bitwise "and" and "or" are a little funky-looking.)

Well, if n == 0, the result is zero; otherwise, n has some bits set, one of which is the least significant. In n - 1, that bit becomes zero, the more significant bits are unchanged, and the less significant bits (all zero in n) become ones. The "and" clears those trailing ones, so n .&. (n - 1) is just n with its least significant set bit turned off.

So, you can write an lsb function:

import Data.Bits

lsb :: (Bits a, Num a) => a -> a

lsb n = n `xor` (n .&. (n - 1))

and then peel the bits off one at a time. To do that, we shouldn't just give back the LSB, because we're generating the next value in the sequence while we're at it, so

bitStrip :: (Bits a, Num a) => a -> (a, a)

bitStrip n = (lsb, leftovers)
    where  leftovers = n .&. (n - 1)
           lsb       = n `xor` leftovers

and we can pull them off until we end up with zero...but we still have to map over to base 10; that lsb is 2^i for some i. I'm not sure whether the LSB trick will help any.
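A self-contained sketch of that peeling, using unfoldr (restating bitStrip with the Num and Eq constraints recent versions of Data.Bits require; setBits is my name for the peeling function):

```haskell
import Data.Bits (Bits, (.&.), xor)
import Data.List (unfoldr)

-- Split off the least significant set bit, keeping the leftovers.
bitStrip :: (Bits a, Num a) => a -> (a, a)
bitStrip n = (n `xor` leftovers, leftovers)
  where leftovers = n .&. (n - 1)

-- Peel the set bits off one at a time, least significant first;
-- each element of the result is a power of two.
setBits :: (Bits a, Num a, Eq a) => a -> [a]
setBits = unfoldr step
  where step 0 = Nothing
        step n = Just (bitStrip n)
```

So setBits 22 (binary 10110) comes out as [2, 4, 16].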

Thursday, June 13, 2013

Of course there's a pattern

I can't believe I didn't see it before. Here's the deal:
  • You can get from the base values for 2 * k digits to those for 2 * k + 1 digits as we described earlier.
  • You can get from the base values for 2 * k digits to some of those for 2 * k + 2 digits.
If you take all the base values for 2 * k digits and open a two-digit gap between their halves, you get all the base values for 2 * k + 2 digits that have no ones in the gap. The ones that do have ones in the gap are, outside the gap, the halves of 2 * k digit palindromes with one less one per half than the base values for 2 * k digits, because only that way are there ones left over to fill the gap.
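A sketch of the gap manipulation (insertGap is a hypothetical helper, not a name from this series' code; powers of ten are spelled out directly):

```haskell
-- Turn a 2k-digit palindrome into a (2k+2)-digit one by opening a
-- two-digit gap between the halves and putting the digit d in both
-- gap positions (they must match to stay palindromic).
insertGap :: Int -> Integer -> Integer -> Integer
insertGap k d x = upper * 10 ^ (k + 2) + d * 11 * 10 ^ k + lower
  where (upper, lower) = x `divMod` (10 ^ k)
```

With k = 2, insertGap 2 0 1221 gives 120021 (empty gap), and insertGap 2 1 1221 gives 121121 (ones in the gap).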

The 65536-dollar question, then, is whether it's worthwhile to generate them this way. (And of course, you can generate the ones with one less one per half for 2 * k digits from those with two less ones per half for 2 * k - 2 digits. That doesn't go on forever, to be sure; since the added ones per half is no bigger than three, you're down to the base case of 100...001 pretty quickly. Again, though, is it worth it?)

One thing to consider that might help out: doing the justOnes and noTwos sorts of things purely in binary, and moving back to decimal only at the very end. If we take this route, it would also be worth dealing purely with the upper halves of the no-two Ys until it's time to convert. Consider: dealing with the Ys takes you down from at most 100 decimal digits to at most 50 decimal digits. Dealing with just the upper half takes you down to at most 25 decimal digits. Doing the stuff where the decimal digits are restricted to ones and zeros means you can operate on values with at most 25 bits until the very end. Save the arbitrary precision for where you need it, right? Even if, like me, you kind of hate setting arbitrary limits, in practice it means that, for purposes of this problem, your Integers will be small enough that operations on them are the fastest kind: if they're implemented as a list or array of fixed-size hunks, there's only one hunk to mess with.

(OK, I admit I've limited some things in my code to Int rather than Integer. OTOH, I think the (29-bit) Int values suffice: there are only something like 51,000 of these palindromes up to a googol, for heaven's sake, and I suspect, though I haven't calculated it, that you don't hit 2^29 of them--around 512 million--until a pretty darned big upper bound. OK, maybe the limit is 256 million--signed, right?--but that's still on up there.)

I'll have to think about this some more.

UPDATE: I pulled the counting routines, tweaked them to take and return Integer, and then did

last $ takeWhile (\(x, y) -> y <= 2^29) (zip [1..] nDigitYsPartialSums)

Turns out you have to get past 10^515 before the number of Ys risks going past 2^29 (and hence needing an Integer to count them), and thus the corresponding Xs would be up around 10^1030.

Not sure whether it helped

With that change, we now have the following for noTwos and associated functions. (Oh, yeah... we made the one-digit case special as we did for counting the d-digit Ys, for the same reason, i.e. putting the special case in one place.)

evenNoTwos = [justOnes n (min 3 (n `div` 2 - 1)) | n <- [2,4..]]

noTwos :: Int -> [Integer]

noTwos n
    | even n     = base
    | otherwise  = concat [[p, p + tenToThe halfN] | p <- map spread base]
    where halfN    = n `div` 2
          base     = evenNoTwos !! (halfN - 1)
          spread x = let (a, b) = x `divMod` (tenToThe halfN)
                     in a * tenToThe (halfN + 1) + b

Did it make a difference? Yes, but... it was a bit slower. By profiling output, 193 ticks instead of 187; time output looked worse, with the best apparently

real    0m0.232s
user    0m0.200s
sys     0m0.028s

Recall that for the previous version, the best was

real    0m0.224s
user    0m0.176s
sys     0m0.044s

On the other hand, looking deeper into the profiler output: before, noTwos and the things it called were eating 19.3% of the CPU time, while now that's 7.8%. To be fair, we should include evenNoTwos, which snarfs 7.3%--but that's still a total of 15.1%, which should be a win. Despite total allocation being down, actual memory usage is up, from maybe 6.25 MB to close to 7 MB; we are, after all, keeping more stuff around.

Wednesday, June 12, 2013

Meanwhile, I'm still thinking...

I took a quick look at choices, and I don't see an obvious way to generate them in what would correspond to ascending order. (Though that doesn't mean there isn't one.)

One thing we can do, though... take a look at noTwos, the source of by far most of the d-digit Ys for all but the smallest values of d.

noTwos n
    | n == 1     = [1,3]
    | even n     = base
    | otherwise  = concat [[p, p + tenToThe halfN] | p <- base]
    where halfN = n `div` 2
          base = justOnes n (min 3 (halfN - 1))

In particular, look at how base is calculated. Suppose n == 2 * k for some k. Then for n and n + 1, halfN - 1 has the same value and thus min 3 (halfN - 1) has the same value. Sure, n is different, but that just means for n + 1 there's a one digit gap left to either put a one in or leave at zero. Check it out.

*Main> justOnes 2 0
*Main> justOnes 3 0
*Main> justOnes 4 1
*Main> justOnes 5 1
*Main> justOnes 6 2
*Main> justOnes 7 2

So, corresponding to a base value x for noTwos (2 * k), the corresponding value for noTwos (2 * k + 1) is

let (a, b) = x `divMod` (tenToThe k)
in a * tenToThe (k + 1) + b

Less work than going through justOnes, I'd say... and it applies whether we figure out a clever way to generate the values in order or not. Note also that this mapping preserves numerical order, so if we do find a way to cleverly generate these values in order, taking advantage of this mapping won't mess that up.
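In code, assuming tenToThe is just (10 ^) specialized to Integer, the mapping looks like this (spread is my name for it here):

```haskell
tenToThe :: Int -> Integer
tenToThe = (10 ^)

-- Map a base value for noTwos (2 * k) to the corresponding value for
-- noTwos (2 * k + 1) by opening a one-digit (zero) gap in the middle.
spread :: Int -> Integer -> Integer
spread k x = let (a, b) = x `divMod` tenToThe k
             in a * tenToThe (k + 1) + b
```

For k = 3, spread 3 101101 is 1010101; adding tenToThe 3 gives 1011101, the variant with a one in the middle. Both are monotonic in x, which is the order-preservation point.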

We can save the base values for even numbers, though it would be nice if we could somehow notice when we do their successors and render them garbage collectable; otherwise that's 15,275 Integers hanging around in lists. Haskell has some memoization types and functions around that go beyond the simple lists as arrays we've been doing, but I'm not sure whether they support something like that.

"You can see a lot by just looking" --Yogi Berra

There's a reason Code Jam urges one to look at examples--in this problem, it points you towards the constraint on the Ys that I didn't see because I only considered the most/least significant digit. And now, we'll see whether it can point us at how we might generate noTwos in ascending numerical order.

noTwos, like oneTwos, is based on the result of justOnes. For now, at least, let's ignore the case of an odd number of digits; for them, justOnes simply returns the same values as for the next lower number of digits with a zero in the middle. Now, let's take a look at

putStr $ unlines $ map show (sort (justOnes (2 * n) (min (n - 1) 3)))

for some values of n.  Hey, wait; these are palindromes, so let's just look at the top half. After all, the rest is redundant, and it's the top half that determines the ordering.

[listing of the sorted upper halves for successive even digit counts omitted]
Oops. 10 breaks the pattern, because you can't have five ones. 12 shows a skip in the middle where there would otherwise be five ones, and as n increases, we'll presumably see more of those.



So, if we pretend these are binary numbers, up to eight digits (total), they go from 10...0 to 11...1. Past eight digits, they go from 10...0 to 11110...0, skipping any value that has more than four ones.

So there is a pattern. How to generate it efficiently?

UPDATE: I'm not sure there is a way; it's tempting to take the interval including all the numbers and then filter (\x -> popCount x <= o + 1), but that's a lot of numbers to generate and then mostly throw away. Up at the high end for numbers up to a googol, that'd be around 28 million values you'd grind out only to throw away all but a few thousand... and then you'd still have to create the palindrome from the top half and switch back to base 10 on top of that. I don't think that's going to cut it... so, back to getting the list of lists of ones to come out in what corresponds to ascending order for the values justOnes generates.
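That said, generate-and-filter isn't the only option: you can build exactly the w-bit values with a leading one and a bounded number of additional ones, already in ascending order, by recursing on bit positions. A sketch under those assumptions (limitedOnes is a hypothetical name, not from this series' code):

```haskell
-- All w-bit values (leading bit one) with at most `extra` additional
-- ones set, in ascending order, without generating the whole interval.
limitedOnes :: Int -> Int -> [Integer]
limitedOnes w extra = map (2 ^ (w - 1) +) (go (w - 1) extra)
  where
    -- go p e: all p-bit tails with at most e ones, ascending.
    go :: Int -> Int -> [Integer]
    go 0 _ = [0]
    go p e = [ fromIntegral b * 2 ^ (p - 1) + rest
             | b <- if e > 0 then [0, 1] else [0 :: Int]
             , rest <- go (p - 1) (e - b) ]
```

For example, limitedOnes 4 2 is [8 .. 14]: every four-bit value except 1111, which would need three extra ones.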

Tuesday, June 11, 2013

Down with Gray Text!

It wasn't that long ago that having white text on a black background was le dernier cri. (Maybe it still is in some circles; a web page on the preannounced new Mac Pro was done up that way. After five seconds trying to read it on my phone, I gave it up as a lost cause.)

It seems to have been replaced lately by gray--in some cases pretty darned light gray--text on a white background.

Do web designers not care about whether anyone can read the pages they create?

Must-read about three monads

...and nary a burrito in sight. "Three Useful Monads"

Sunday, June 09, 2013

Overkill (not really...)

So I thought that

ysInDigitRange d1 d2 = sum [numNDigitYs d | d <- [d1..d2]]

did some needless recalculation, and tried

memoNumNDigitYs = map numNDigitYs [1..]
nDigitYsPartialSums = scanl (+) 0 memoNumNDigitYs

ysInDigitRange d1 d2 = nDigitYsPartialSums !! d2 - nDigitYsPartialSums !! (d1 - 1)

scanl? Ah, that's like foldl, but instead of giving you the final result of the fold, it gives you a list: at the head is the "initial" value, followed by the results of the operation at each stage. nDigitYsPartialSums !! i is thus equal to

sum (map numNDigitYs [1..i])

so that the two formulations of ysInDigitRange give the same result. Trying it, though, showed the scanl version taking a little more time, a little less total allocation--but actually allocating about nine megabytes! I didn't expect that. Just goes to show that not everything you think will be an optimization actually is.
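To make the scanl behavior concrete, here's a tiny self-contained illustration (partialSums and rangeSum are illustrative names, not the functions above):

```haskell
-- scanl keeps every intermediate result of the fold, seed first:
-- scanl (+) 0 [3, 1, 4] is [0, 3, 4, 8].
partialSums :: [Integer] -> [Integer]
partialSums = scanl (+) 0

-- The sum over 1-based positions d1..d2 is then a difference of two
-- partial sums, which is the shape ysInDigitRange uses.
rangeSum :: [Integer] -> Int -> Int -> Integer
rangeSum xs d1 d2 = sums !! d2 - sums !! (d1 - 1)
  where sums = partialSums xs
```

Note that partialSums works fine on an infinite list, which is what makes the memoizing version above tick.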

UPDATE: OK, now I'm puzzled. The profiler output shows ysInDigitRange being invoked 325 times. Ditto for numNDigitYs, but... but... the only calls to numNDigitYs should be where values in memoNumNDigitYs are calculated, and we shouldn't ask for more than 50 of those, so what's with the other 275 invocations?

UPDATE: Comparing profiler output showed what I was overlooking. Before this change, there were a lot more calls to numNDigitYs. We're talking 2,732 + 325 instead of just 50 + 325, down by a factor of more than eight. (OK, that's where the 50 went, but where do the 325 come from?)

Moreover, the time percentage of the top cost centers falls off a lot quicker with the change. In the past, that's tended to happen along with a decrease in execution time, but here it's gone up a bit... and why on earth does keeping around what should be just an extra 101 Integers cause such a jump in memory usage?! Time to generate some of those other memory graphs to see whether we can figure it out.

D'Oh! I forgot... I'd switched to -fllvm. LLVM is the compiler infrastructure whose intermediate representation clang compiles to, and if you have it installed, you can ask ghc to generate code via it. That, or rather not doing that, made the difference in memory usage and time. We're back to a bit over 6 MB, and with profile output claiming .19 seconds. Best time output:

real    0m0.227s
user    0m0.184s
sys     0m0.040s

Worst time output:

real    0m0.233s
user    0m0.220s
sys     0m0.008s

I'm amazed that llvm makes that much difference. (Compilation takes a bit longer, but darn, it's worth it.)

A flashback and analogy

You've probably heard about how the notion of sum types (e.g. Algol 68 union s, Rust enum s, Haskell type s) and product types (e.g. tup...