### Fun with palindromes, and the joys of Haskell

An item in

Commenters on the HN item pointed out

Anyway, I thought I'd add my voice to the chorus of "it's the algorithm, not the language" (or more accurately, if the algorithm isn't efficient, compiled vs. interpreted or high versus low level language doesn't matter) by writing in Haskell. The goal: writing code as clear as I can manage while still solving the problem quickly enough.

(I hasten to add that I am no Haskell expert, and I hope to see someone put Haskell to better, more idiomatic use while maintaining or improving clarity.)

Strictly speaking, I'm writing code that solves the main part of the problem:

When you look at the problem, some possible optimizations over the brute force method come to mind:

Fortunately, it's possible to do

Just constraining the digits to 0, 1, and 2 did speed things up somewhat; the resulting program managed to get up into the 48-digit Xs in about three minutes, but that's nowhere near fast enough to handle the largest test input, which has values up to 10

Here's the code I ended up with. I split out the one- and two-digit cases so that the noTwos, oneTwo, and twoTwos functions don't have to bother with them.

I should point out that

Now, the times mentioned above are for runs without main, under ghci, an interpreter, using a version that sorted the results of possibles to make the output easier to check against the values at "Palindromic Square Numbers". I should also mention that under ghci it was able (again, with the sort) to generate the list of "fair and square" numbers between 1 and 10

Profiling meant using the compiler, and what a difference that made! The first version of main didn't bother to read the command line arguments, and instead wired in 10

With the sort pulled, the time output was

The sort also turned out to be a major source of memory usage. With sort, it was using nearly four megabytes of RAM. Without, the profile showed it using more like 40K.

Now, you couldn't use the interpreted version in the straightforward way to successfully pass the Code Jam large and larger input tests; you'd want to take advantage of their giving the range of values, generate the list and then convert it to a structure that lends itself to quickly determining how many values are in a given range. (Sorted list, perhaps a search tree...)

The compiled version, on the other hand... the smaller large input file has no more than 10000 lines, confined to [1, 10

If every input line covered the whole range, that would be thirty seconds, plus the time for I/O--well within the eight minute range.

The larger large input file has no more than 1000 lines, confined to [1, 10

So, have we shown what we set out to show? Clearly compiled Haskell is far faster than interpreted Haskell. (To give a better comparison, since I ran the interpreted cases on my Eee 900A--the compiled version ran in about 2.5 seconds on the larger range on the Eee.) But that's not the point; the optimized algorithm could trivially run the first large data set even interpreted, and with the precalculation could probably do the larger large data set interpreted.

All that said, the revelation for me was this: it was incredibly easy to write the code, straightforward to test and debug in the interpreter, and I had only to add import lines and the simple main to be able to compile it. Total: sixty-three (nonblank) lines of code. It's helped along by Haskell having arbitrary-precision integers and list comprehensions, which I used liberally.

One could argue that this problem is one particularly well-suited to Haskell, but for me this was a sufficiently enjoyable task that I will do more with Haskell, and I urge you to give it a try too.

UPDATE: Looks like Ruby has Bignum, and various people have posted fairly concise-looking constructs to get the effect of list comprehensions. It might be fun to rewrite the above in Ruby.

UPDATE: Here are the solutions in Haskell from the Code Jam web site. Check out jadg's code; it's very impressively short. I'll be studying that one.

*Hacker News*caught my eye recently. (Ouch!) It linked to a post, "Ruby is Too Slow for Programming Competitions", by a gentleman named Clif Reeder. A summary: Mr. Reeder chose Ruby to try on a Google Code Jam qualification round problem, and found that his solution ran too slowly to meet the Code Jam time constraints (on the official runs, you have eight minutes from downloading the input to upload your output). A version written in the (compiled) Go language ran much faster.Commenters on the HN item pointed out

- Mr. Reeder didn't think through the problem fully; there were more possible optimizations. In fact, there's a list of 387 people who used Ruby to successfully handle either the small sample input (which has a four-minute time limit) or the large or really large inputs (with the eight-minute time limit).
- Like C, the size of int in Go may vary, so the Go version might or might not be able to handle the larger inputs correctly.

Anyway, I thought I'd add my voice to the chorus of "it's the algorithm, not the language" (or more accurately, if the algorithm isn't efficient, compiled vs. interpreted or high versus low level language doesn't matter) by writing in Haskell. The goal: writing code as clear as I can manage while still solving the problem quickly enough.

(I hasten to add that I am no Haskell expert, and I hope to see someone put Haskell to better, more idiomatic use while maintaining or improving clarity.)

Strictly speaking, I'm writing code that solves the main part of the problem:

More about the full problem later. For now, let's refer to an example of the kind of number we want as X, where X is a palindrome and also equal to the square of a palindrome Y, as the Code Jam analysis of the problem does.Given natural numbers M and N, how many numbers between M and N inclusive are both (base ten) palindromes and squares of numbers which are (base ten) palindromes?

When you look at the problem, some possible optimizations over the brute force method come to mind:

- Rather than look in [M, N] for Xs, look over values in [⌈sqrt(M)⌉, ⌊sqrt(N)⌋] for Ys. Mr. Reeder took advantage of this in his program; it helps a great deal.
- Rather than looking at each number in the range for palindromes, better to generate the palindromes in the range. A palindrome is determined by half its digits (unless it has an odd number of digits, in which case there's another digit in the middle that's not constrained to match any other digit), so that gives another reduction on the same order as (1).
- If Y squared has to be a palindrome, Y is still further constrained. Suppose Y = 4.....4. Then Y squared has the form n....6, where 16 ≤ n < 25, so it can't be a palindrome. A similar argument rules out five through nine, cutting the work down by two thirds.

^{14}in about a second on my Eee 900A netbook. That range includes all the values in the smaller large input file.Fortunately, it's possible to do

*much*better than that! Google's full analysis of the problem can be found on the Code Jam site. To summarize it:- Given the constraint on Y's leading and trailing digits, there can't be a carry out of the most significant digit when you add up the partial products of Y * Y to get X, so if Y has n digits, X has 2n - 1 digits, an odd number.
- In fact, there's no carry out of
*any*position in the sum of the partial products of Y and Y; that's both a necessary and sufficient condition for X = Y * Y to be a palindrome if Y is a palindrome. - The middle digit of X is thus (since Y is a palindrome) the sum of the squares of the digits of Y, and must be no bigger than 9.

- No 2s at all. They can have up to nine 1s if they have an odd number of digits, eight if they have an even number of digits.
- One 2. The 2 must be the middle digit, requiring an odd number of digits, and the rest of Y can have no more than four ones (if there were five, Y couldn't be a palindrome). Aside from 2, there must be at least two ones, the most and least significant digits.
- Two 2s. This allows at most one 1, and that only in the middle, which can only happen if there are an odd number of digits.

Just constraining the digits to 0, 1, and 2 did speed things up somewhat; the resulting program managed to get up into the 48-digit Xs in about three minutes, but that's nowhere near fast enough to handle the largest test input, which has values up to 10

^{100}, in the eight minutes Code Jam limits you to. To do it right, you have to seriously skip values that don't meet the sum of squares constraint.Here's the code I ended up with. I split out the one- and two-digit cases so that the noTwos, oneTwo, and twoTwos functions don't have to bother with them.

I should point out that

- I did this without the time constraints of Code Jam, over about eight days after the
*Hacker News*post - Of course, you're not seeing the mistakes I made over that eight days!
- The integer square root function came from an answer to a Stack Overflow question, which in turn was taken from Crandall and Pomerance, "Prime Numbers: a Computational Perspective"
- The trivial Main is adapted from Chapter 25 of
*Real World Haskell*having to do with profiling (about which more later)

**import Prelude**

import System.Environment

import Text.Printfimport System.Environment

import Text.Printf

**backwards :: Integer -> Integer**

backwards n = backwards' n 0

where backwards' n accum = if n < 10 then n + accum

else backwards' (n `div` 10) (10 * (n `mod` 10 + accum))

numPalindrome :: Integer -> Bool

numPalindrome n = (n == backwards n)backwards n = backwards' n 0

where backwards' n accum = if n < 10 then n + accum

else backwards' (n `div` 10) (10 * (n `mod` 10 + accum))

numPalindrome :: Integer -> Bool

numPalindrome n = (n == backwards n)

**twoTwos :: Integer -> [Integer]**

twoTwos nDigits =

let justTwos = 2 * 10 ^ (nDigits - 1) + 2

in

if even nDigits then [justTwos]

else [justTwos, justTwos + 10 ^ (nDigits `div` 2)]twoTwos nDigits =

let justTwos = 2 * 10 ^ (nDigits - 1) + 2

in

if even nDigits then [justTwos]

else [justTwos, justTwos + 10 ^ (nDigits `div` 2)]

**choose :: Integer -> Integer-> [Integer]**

nDigits `choose` nOnes =

if nOnes == 0 then [0]

else if nDigits == 1 then [1]

else if nDigits == nOnes then [sum [10 ^ n | n <- [0..nDigits - 1]]]

else concat [[0 + 10 * x | x <- (nDigits - 1) `choose` nOnes],

[1 + 10 * x | x <- (nDigits - 1) `choose` (nOnes - 1)]]

chooseRange :: Integer -> [Integer] -> [Integer]

chooseRange nDigits onesRange = concat [nDigits `choose` nOnes | nOnes <- onesRange, nOnes <= nDigits]nDigits `choose` nOnes =

if nOnes == 0 then [0]

else if nDigits == 1 then [1]

else if nDigits == nOnes then [sum [10 ^ n | n <- [0..nDigits - 1]]]

else concat [[0 + 10 * x | x <- (nDigits - 1) `choose` nOnes],

[1 + 10 * x | x <- (nDigits - 1) `choose` (nOnes - 1)]]

chooseRange :: Integer -> [Integer] -> [Integer]

chooseRange nDigits onesRange = concat [nDigits `choose` nOnes | nOnes <- onesRange, nOnes <= nDigits]

**oneTwo nDigits =**

if even nDigits then []

else

let msd = 10 ^ (nDigits `div` 2 - 1)

in [((msd + x) * 10 + 2) * 10 ^ (nDigits `div` 2) + backwards (msd + x) |

x <- chooseRange (nDigits `div` 2 - 1) [0..1]]

noTwos :: Integer -> [Integer]

noTwos nDigits =

let halfN = nDigits `div` 2

msd = 10 ^ (halfN - 1)

choices = chooseRange (halfN - 1) [0..3]

in

if even nDigits then [ (msd + x) * 10 ^ halfN + backwards (msd + x) | x <- choices]

else concat [[((msd + x) * 10 + 1) * 10 ^ halfN + backwards (msd + x),

(msd + x) * 10 * 10 ^ halfN + backwards (msd + x)] | x <- choices]if even nDigits then []

else

let msd = 10 ^ (nDigits `div` 2 - 1)

in [((msd + x) * 10 + 2) * 10 ^ (nDigits `div` 2) + backwards (msd + x) |

x <- chooseRange (nDigits `div` 2 - 1) [0..1]]

noTwos :: Integer -> [Integer]

noTwos nDigits =

let halfN = nDigits `div` 2

msd = 10 ^ (halfN - 1)

choices = chooseRange (halfN - 1) [0..3]

in

if even nDigits then [ (msd + x) * 10 ^ halfN + backwards (msd + x) | x <- choices]

else concat [[((msd + x) * 10 + 1) * 10 ^ halfN + backwards (msd + x),

(msd + x) * 10 * 10 ^ halfN + backwards (msd + x)] | x <- choices]

**possibles :: Integer -> [Integer]**

**possibles nDigits =**

if nDigits == 1 then [0..3]

else if nDigits == 2 then [11, 22]

else concat [noTwos nDigits, oneTwo nDigits, twoTwos nDigits]if nDigits == 1 then [0..3]

else if nDigits == 2 then [11, 22]

else concat [noTwos nDigits, oneTwo nDigits, twoTwos nDigits]

**digitsIn :: Integer -> Integer -> Integer**

digitsIn base n = digitsIn' n 1

where digitsIn' n count = if n < base then count else digitsIn' (n `div` base) (count + 1)digitsIn base n = digitsIn' n 1

where digitsIn' n count = if n < base then count else digitsIn' (n `div` base) (count + 1)

**candidates :: Integer -> Integer -> [Integer]**

**candidates m n = [x | x <- concat (map possibles [digitsIn 10 m..digitsIn 10 n]), x >= m, x <= n]**

**floorSqrt :: Integer -> Integer**

floorSqrt n = if n == 0 then 0 else floorSqrt' (2 ^ ((1 + digitsIn 2 n) `div` 2))

where floorSqrt' x =

let y = (x + n `div` x) `div` 2

in if y >= x then x else floorSqrt' y

ceilSqrt :: Integer -> Integer

ceilSqrt n =

let y = floorSqrt n

in if y * y == n then y else y + 1floorSqrt n = if n == 0 then 0 else floorSqrt' (2 ^ ((1 + digitsIn 2 n) `div` 2))

where floorSqrt' x =

let y = (x + n `div` x) `div` 2

in if y >= x then x else floorSqrt' y

ceilSqrt :: Integer -> Integer

ceilSqrt n =

let y = floorSqrt n

in if y * y == n then y else y + 1

**palSquares :: Integer -> Integer -> [Integer]**

palSquares m n = [x * x | x <- candidates (ceilSqrt m) (floorSqrt n)]palSquares m n = [x * x | x <- candidates (ceilSqrt m) (floorSqrt n)]

**main = do**

[d] <- map read `fmap` getArgs

printf "%d\n" (length (palSquares 1 d))[d] <- map read `fmap` getArgs

printf "%d\n" (length (palSquares 1 d))

Now, the times mentioned above are for runs without main, under ghci, an interpreter, using a version that sorted the results of possibles to make the output easier to check against the values at "Palindromic Square Numbers". I should also mention that under ghci it was able (again, with the sort) to generate the list of "fair and square" numbers between 1 and 10

^{100}in about 28 seconds on my 2.8 GHz Athlon II X4 630 processor. Some of that time was I/O.Profiling meant using the compiler, and what a difference that made! The first version of main didn't bother to read the command line arguments, and instead wired in 10

^{100}as the upper bound. Compiling with -O2, it ran fast enough that I made the change to read the command line arguments lest it have optimized away all the calculations. That done, time output for a run, again using 10^{100}as the upper bound, showed**real 0m0.495s****user 0m0.480s****sys 0m0.012s**With the sort pulled, the time output was

**real 0m0.389s****user 0m0.380s****sys 0m0.004s**The sort also turned out to be a major source of memory usage. With sort, it was using nearly four megabytes of RAM. Without, the profile showed it using more like 40K.

Now, you couldn't use the interpreted version in the straightforward way to successfully pass the Code Jam large and larger input tests; you'd want to take advantage of their giving the range of values, generate the list and then convert it to a structure that lends itself to quickly determining how many values are in a given range. (Sorted list, perhaps a search tree...)

The compiled version, on the other hand... the smaller large input file has no more than 10000 lines, confined to [1, 10

^{14}]. Results of time on that range:**real 0m0.003s**

user 0m0.004s

sys 0m0.000suser 0m0.004s

sys 0m0.000s

If every input line covered the whole range, that would be thirty seconds, plus the time for I/O--well within the eight minute range.

The larger large input file has no more than 1000 lines, confined to [1, 10

^{100}]. If every input line covered the whole ranges, that would be 389 seconds plus I/O time. That's cutting it rather close, but it's still possible.So, have we shown what we set out to show? Clearly compiled Haskell is far faster than interpreted Haskell. (To give a better comparison, since I ran the interpreted cases on my Eee 900A--the compiled version ran in about 2.5 seconds on the larger range on the Eee.) But that's not the point; the optimized algorithm could trivially run the first large data set even interpreted, and with the precalculation could probably do the larger large data set interpreted.

All that said, the revelation for me was this: it was incredibly easy to write the code, straightforward to test and debug in the interpreter, and I had only to add import lines and the simple main to be able to compile it. Total: sixty-three (nonblank) lines of code. It's helped along by Haskell having arbitrary-precision integers and list comprehensions, which I used liberally.

One could argue that this problem is one particularly well-suited to Haskell, but for me this was a sufficiently enjoyable task that I will do more with Haskell, and I urge you to give it a try too.

UPDATE: Looks like Ruby has Bignum, and various people have posted fairly concise-looking constructs to get the effect of list comprehensions. It might be fun to rewrite the above in Ruby.

UPDATE: Here are the solutions in Haskell from the Code Jam web site. Check out jadg's code; it's very impressively short. I'll be studying that one.

## Comments