Showing posts from June 2, 2013

"This time, fer shur!" --Bullwinkle Moose

OK. I really will go on to doing something else in Haskell; I'm thinking of something that will actually get me into the M-word.

(Though in the back of my mind I will still wonder whether there's a way to make justOnes generate values in ascending order other than just sorting the darn things...)

I want to give a little more data.

I went back and compiled and ran the C solution to the "Fair and Square" problem I'd grabbed, to get time output so I could remember something other than just the vague "about 0.4 seconds". The variance seems to be less than it is for "my" Haskell program (I have to use the scare quotes, given that I borrowed the integer square root and semilattice search tree code), so I'll just give one output from the middle of the pack, as usual from the data set with values up to a googol:

real    0m0.455s
user    0m0.432s
sys     0m0.020s

In comparison, the best value from a Haskell run:

real    0m0.224s
user    0m0.176s
sys     0m0.044s

If I had to guess…

Skating around the edges

Still thinking a bit about how to churn out those palindromes in order, or at least take advantage of what order there is. If you take a look at justOnes, it may be that each of the k values gives a hunk in ascending order, so you may be able to foldl' merge [] them and get them in order for cheaper... (UPDATE: no such luck) though that may not actually be cheaper than a single O(n log n) sort.

But now we take another peek at the output. The slowness of piecing together Strings to generate output has been noticed... oh, yes, it has been noticed... and people have come up with a way to cut down on it: using shows or something like it.

shows returns a function that takes a String as parameter and returns a String--the returned String is a printable representation of the thing (which must be in the Show type class) with the String parameter appended at the end; such String -> String functions have the type synonym ShowS. Instead of generating the String, you create a function that will generate it when called.
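As a minimal sketch of the idea (formatCase and its Code Jam-style output line are my own illustration, not the program's actual format function):

```haskell
-- Building output with ShowS (String -> String) instead of (++).
-- formatCase is a made-up example: it renders one "Case #k: n" line.
formatCase :: (Int, Int) -> ShowS
formatCase (k, n) =
    showString "Case #" . shows k . showString ": " . shows n . showChar '\n'

-- Composing ShowS values is cheap function composition; the String is
-- only materialized once, when the final function is applied to "".
formatAll :: [(Int, Int)] -> String
formatAll cases = foldr (\c acc -> formatCase c . acc) id cases ""

main :: IO ()
main = putStr (formatAll [(1, 0), (2, 2)])
```

This avoids the left-nested (++) chains that make naive String concatenation quadratic.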

So... remember

format :: (Int, Int)…

"Blessed rage for order, pale Ramon..."

So, the big cost center is nDigitYs. All it does is concatenate our three flavors of Y (the ones with no 2s, the ones with one 2, and the ones with two 2s) and sort them. Of the d-digit Ys, there are at most two with two 2s, and there are d `div` 2 with one 2. By far the majority are those with no 2s; for d == 49, of the 4,122 Ys, 4,122 - 2 - 24 = 4,096 have no 2s.

We know that the two-2 Ys are greater than all the others, so we could go with

nDigitYs n = sort (noTwos n ++ oneTwos n) ++ twoTwos n

...for what little good that does.

If you take a look at the result of oneTwos, you'll find that they're always in ascending order; we grind them out so that the optional 1 is first not there, and then in ascending order in the most significant half, which suffices to order them. noTwos, on the other hand, isn't, for d > 7. If there's some way to generate them in ascending order, we could go with

nDigitYs n = merge (noTwos n) (oneTwos n) ++ twoTwos n

(be sure to import Data.…
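For completeness: Data.List itself doesn't export a merge, so one would either pull it from a package or hand-roll the classic two-way merge. A sketch, assuming both inputs are already ascending:

```haskell
-- Classic two-way merge; assumes both input lists are already in
-- ascending order. (The data-ordlist package's Data.List.Ordered
-- provides a ready-made one, but this is Prelude-only.)
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge xxs@(x:xs) yys@(y:ys)
    | x <= y    = x : merge xs yys
    | otherwise = y : merge xxs ys
```

So merge [1,3,5] [2,4,6] yields [1,2,3,4,5,6], and it's lazy: only as much of each input is forced as the consumer demands.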

OK, I have to take one more look...

What can I say? I can sympathize with Lot's wife.

I have yet to show you the spiffy graphic you can get showing memory usage.

What does this tell us?

First, we can say that it looks like we're down about a megabyte from earlier. (Alas, they don't give the scale all the way up the vertical axis; if only they'd move the word "bytes" out of the way!) The other thing is, it looks like it's accumulating thunks until about halfway through when something finally demands evaluation and the RAM soufflé collapses.

(What's a thunk? The name goes back to the days when the world was new and people were figuring out how to generate code for Algol 60 programs. When the defining committee set up the rules for "call by name" they perhaps didn't realize the full consequences of their actions (vide Jensen's Device). Implementing it required generating a little function for each call by name parameter that returned a reference to it, and those little fun…
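The buildup-then-collapse shape in the graph is what unforced lazy accumulation looks like. A toy illustration (mine, not the post's code), contrasting lazy and strict left folds:

```haskell
import Data.List (foldl')

-- Lazy foldl builds a chain of (+) thunks a million deep before
-- anything demands the result; foldl' forces the accumulator at each
-- step, so the RAM soufflé never rises in the first place.
lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0   -- accumulates thunks until demanded
strictSum = foldl' (+) 0   -- evaluates as it goes

main :: IO ()
main = print (strictSum [1 .. 1000000])
```

Both compute the same value; the difference shows up only in the heap profile.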

After the smoke clears...

OK. I think I've taken this problem as far as I can go... though someday I will set it up to read and write named files so I can run it in ghci. After all, this all started, after I read Clif Reeder's "Ruby is Too Slow for Programming Competitions", because I wanted to demonstrate that it's the algorithm that matters, not the language or whether it's interpreted or compiled. On the other hand, I now have a Haskell program that's within spitting distance of running twice as fast as a C++ solution for the problem, taking a bit over 0.2 seconds to run the hairier official input file. Eight minutes, the time Code Jam gives you to run and send back results when it counts, is 480 seconds; I tend to doubt that interpreter overhead would come near a factor of 2,400, in which case the point is made.

This started with the less-than-noble motivation of "Well, I'll show him...", but quickly changed, for a couple of reasons.
The unalloyed pleasure of problem solving al…

O(n) good, O(log n) better

What we need is a list/array of b, b^2, b^4, b^8, b^16, ... to count base-b digits. The initial loop unrolling should have given us the idea already.

powersOfTwo :: [Int]
powersOfTwo = map (2^) [0..]

bigTwos, bigTens :: [Integer]
bigTwos = map (2^) powersOfTwo
bigTens = map (10^) powersOfTwo

bitsIn, digitsIn :: Integer -> Int
bitsIn n   = bDigits n bigTwos
digitsIn n = bDigits n bigTens

bDigits :: Integer -> [Integer] -> Int

bDigits n xs = bDigits' n 1
    where bDigits' n s = let upToN = takeWhile (<= n) xs
                         in case upToN of
                            [] -> s
                            _  -> bDigits' (n `div` last upToN)
                                           (s + powersOfTwo !! (length upToN - 1))

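A quick sanity check of my own (not from the post), with the definitions repeated so it stands alone: digitsIn should agree with the naive length . show count, and a googol should come out as 101 digits.

```haskell
-- Repeats the post's definitions (with type signatures added) so this
-- check is self-contained.
powersOfTwo :: [Int]
powersOfTwo = map (2 ^) [0 ..]

bigTens :: [Integer]
bigTens = map (10 ^) powersOfTwo

digitsIn :: Integer -> Int
digitsIn n = bDigits n bigTens

-- Gallop: divide out the largest 10^(2^k) <= n, adding 2^k digits
-- to the count each time, until only one digit remains.
bDigits :: Integer -> [Integer] -> Int
bDigits n xs = bDigits' n 1
    where bDigits' m s = case takeWhile (<= m) xs of
                           []    -> s
                           upToM -> bDigits' (m `div` last upToM)
                                             (s + powersOfTwo !! (length upToM - 1))

main :: IO ()
main = do
    print (digitsIn (10 ^ 100))  -- 101: a googol has 101 digits
    print (all (\n -> digitsIn n == length (show n)) [1 .. 100000])  -- True
```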
Whew! OK, so does it help? Best observed time output:

real    0m0.227s
user    0m0.208s
sys     0m0.016s

Worst observed time output:

real    0m0.238s
user    0m0.212s
sys     0m0.024s

Profiling output claims 0.20 seconds runtime (still; the difference is a few "ticks" that aren't noticeable in the calculated time). It's…

A little more stuff

So, we know that things get slow when we actually have to generate some of the Ys for a range check. The range checking is now O(log n) where n is the number of Ys we have to check. twoTwos is trivial, so that leaves oneTwos and noTwos, and they mostly call justOnes. Of course there's also the sorting, the labeling with ordinal position, and the tree building. sort doesn't even appear in the .prof file, so it must be down in the noise (or perhaps considered a primitive; after all, we don't see the additions split out). What we do see in the "cost center" list are
nDigitYs
fromList
digitsIn.digitsIn'
justOnes
justOnes.pair
floorSqrt.floorSqrt'.y

One small change to justOnes turned out to be surprisingly effective: the value returned was calculated as

[shell + sum (map pair c) | c <- innards]

It turns out to chop an easy 10% off the execution time, according to profiling output, to recast that as

[foldl' (+) shell (map pair c) | c <- innards]
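For a sense of why that helps: in the GHC of that era, Prelude's sum was a lazy left fold, so each comprehension element built a chain of (+) thunks before being demanded, while foldl' keeps its accumulator evaluated as it goes. A toy version with hypothetical stand-ins for shell, pair, and innards (not the post's actual definitions):

```haskell
import Data.List (foldl')

-- shell, pair, and innards here are made-up stand-ins, just to show
-- that the two formulations compute the same values; the foldl'
-- version avoids building a thunk per list element.
shell :: Integer
shell = 1000000

pair :: Integer -> Integer
pair d = d * d

innards :: [[Integer]]
innards = [[1 .. 100], [2, 4 .. 200]]

lazyWay, strictWay :: [Integer]
lazyWay   = [shell + sum (map pair c)      | c <- innards]
strictWay = [foldl' (+) shell (map pair c) | c <- innards]

main :: IO ()
main = print (lazyWay == strictWay)  -- True: same values, different strictness
```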