**pairsum**is grabbing so much RAM, let's contemplate

**floorSqrt**. We're just calling it 2,000 times, and yet it's collectively taking up over six percent of the CPU time and five percent of allocation?!

Let's remind ourselves of

**floorSqrt**(tweaked because I forgot that I'd memoized powers of two)

**floorSqrt :: Integer -> Integer**

floorSqrt 0 = 0

floorSqrt n = floorSqrt' (twoToThe ((1 + bitsIn n) `div` 2))

where floorSqrt' x =

let y = (x + n `div` x) `div` 2

in if y >= x then x else floorSqrt' y

floorSqrt 0 = 0

floorSqrt n = floorSqrt' (twoToThe ((1 + bitsIn n) `div` 2))

where floorSqrt' x =

let y = (x + n `div` x) `div` 2

in if y >= x then x else floorSqrt' y

This is a special case of the Newton-Raphson method that was known to the ancient Greeks. We treat zero specially to avoid one of the gotchas of Newton-Raphson, but aside from that, this particular example is well-behaved, and has what they call

*quadratic convergence*: once you're sufficiently close, each successive guess is good to twice as many digits as the one before. Our first guess is good to one bit, so the next is good to two bits, then four, then eight, etc. We're dealing with Xs in ranges up to a googol, or around 2

^{330}, which would me no more than nine iterations per value. Checking the profiling output, we see a total of 7,175 + 7,472 = 14,647 calls to

**floorSqrt'**, which is within our constraint, but darned close to it, so the inputs are tending to the high end of the range.

I kind of hate to cheat, but let's see what happens with

**floorSqrt n = floorSqrt' $ (floor. sqrt . fromIntegral) n**

[pause to compile, run, check profile output, and check... uh-oh.]

What happens is that it doesn't give the correct results; the output is not correct, which is weird. You'd think that would give you 52 good bits, and thus just need three iterations, but then, look at that termination condition. There has to be something about it that needs that first trial value to make it work, so you can't just plug in another, even better, guess. Taking the first value from the large input, the

**floorSqrt**we've been using returns

**10000010000100100100100001000000**

while using the supposedly better first guess gives us

**10000010000100100017503961350144**

We could change the function to take ceil log2 (# bits in n / 52) iterations, but I think I should get back to dealing with memory usage and

**pairsum**. (Even with that change you probably would end up flipping a coin for whether you got the floor or the ceiling of the square root of n.)