Posts

Showing posts from May 26, 2013

So that's why they are taking time...

We have yet to peek further into the profiling output; it's overdue.

The "cost center" tells you the total time used in particular functions, and that helps find slow functions--but while that's a sufficient condition for being a cost center, it's not a necessary condition. It may just be a function that someone's calling a lot.

Perhaps that's the case with tenToThe. The call tree shows calls in context, and when we look at it we see the following (edited slightly due to line length):

     noTwos                       Main 184          43    0.8    2.0    31.9   28.6
      tenToThe                    Main 229          21    0.0    0.0     0.0    0.0
      noTwos.base                 Main 186          42    0.0    0.0    31.2   26.6
       justOnes                   Main 187          42    8.8   13.3    31.2   26.6
        justOnes.pair             Main 195       85582    7.3    6.4    17.3    6.4
         tenToThe                 Main 196      171164   10.0    …

Up to date... or not

Well... I have ghc 7.4.1. It may be that more recent versions of ghc do a better job of optimization; they're up to 7.6.something.

Bodhi Linux is based on LTS releases of Ubuntu, and it doesn't look like the 12.04 repositories have been keeping up. I'd hate to have to switch distributions to get the latest ghc, but it may come to that.

UPDATE: BWAHAHAHA! I thought that the more recent version was better because it was allocating half as much memory. It was, instead, that I had a 32-bit version in one case, 64-bit in the other. Now I get to decide whether to plow onward with Ubuntu 13.04, which I installed to get ghc 7.6.something but which is flaky in some respects, or drop back to Bodhi Linux.

String: elegant but deadly

In Haskell, String is a synonym for [Char], i.e. a list of Char. This is a very elegant way to deal with strings--but it's a killer for performance.

The recommended way around this is to use the ByteString type. It's not a drop-in replacement for String, sad to say; in particular replacing read and show are problematic. I lucked out on reading Integers, though the returned type is different, and I wimped out on generating the output. You also get to add an import line, in this case

import Data.ByteString.Char8 as B

and qualify some function names.Here's the revised main and supporting functions:

main = do
    s <- B.getContents
    let r = map format $ zip [1..] (map process (tail $ B.lines s))
    putStr (unlines r)

process :: B.ByteString -> Int

process line = case map B.readInteger (B.split ' ' line) of
               [Just (m, _), Just (n, _)] -> numXs m n
               _                          -> -1

format :: (Int, Int) -> String

format (lineNum, result…

Things that make you go "Hmmm..."

OK. I rolled my own main, with serious help from "Haskell I/O for Imperative Programmers"--and I hasten to add that if you, like me, are learning Haskell, run, do not walk, to this page and read it.

Doing I/O, at least of the sort that programs like this do, in Haskell requires an attitude adjustment if you want to work with Haskell rather than against it. If you're a shell scripter, you already have the idea: you've worked with pipes. Pipes are like function composition; see
"Unix Pipes and Pointless Functional Programming""Unix Pipe as Functional Language" So, here's my main. I toyed with using interact, but having to skip the first line (the one that tells you how many lines are left) made it problematic, as well as the issue of adding a "case number" for getting the output in the form that the problem statement requires.

-- A new main, heavily inspired by "Haskell I/O for Imperative Programmers"
-- http://www.haskell.org/h…

Talk about TMTOWTDI...

At least from my newbie POV, the variety of ways people set up the top levels of their "Fair and Square" programs is pretty amazing.

Some use the standard issue recursive versions of top-level loops you see in texts; some use replicateM. Some ignore the first line, some don't.

I have some reading to do.

The remaining time sinks

Counting Digits We unrolled digitsIn' before; perhaps we could use another shot. Add another couple of lines for base^8 and the cost centers change:

COST CENTRE            MODULE                %time %alloc

solve                  Main                   39.5   56.0
nDigitYs               Main                   11.2    8.6
fromList.(...)         SemilatticeSearchTree   8.2    9.1
tenToThe               Main                    7.9    0.0
justOnes.pair          Main                    4.2    3.1
justOnes               Main                    3.7    6.4
digitsIn.digitsIn'     Main                    3.0    4.1
floorSqrt.floorSqrt'.y Main                    2.8    1.9
meet                   SemilatticeSearchTree   2.3    0.0
satisfy                SemilatticeSearchTree   2.1    0.0
fromList               SemilatticeSearchTree   1.9    0.2
digitsIn               Main                    1.6    0.1
findFirst              SemilatticeSearchTree   1.2    0.0
choices                Main             …

Qapla'!

I finally got things in sync with van Laarhoven's module, and the output is correct. (Just to make sure, I went back to the Code Jam site and submitted the output. I think that being up until nearly 3:00 a.m. caused my initial reaction to
Judge's response for last submission: Correct. to be "Oh, [expletive]! The judge says I need to correct my last submission!" Then I realized it was an adjective, not a command, and relaxed.)

I'm a little bummed at using someone else's code for a nontrivial portion of the problem, but OTOH, I did learn more Haskell, which is one of my real goals.

But back to the results: the best time output is

real    0m0.361s
user    0m0.336s
sys     0m0.024s

and the worst is

real    0m0.377s
user    0m0.364s
sys     0m0.008s

Both are less than the 0.4 seconds I saw for the C++ solution I tried.

[We pause here to do a victory dance.]

Where are our cost centers, and how about that memory usage?

COST CENTRE            MODULE                %time %al…

Almost there...

I have something that compiles... but doesn't give the correct result. I suspect I've blown the interface to van Laarhoven's SemilatticeSearchTree module in a way that, while it passes syntactic muster, is keeping the search from working right. The presence of a bunch of zeroes in the output might indicate that I'm consistently getting back Nothing.

What does the time output (0.3240.347s real time) tell us about what we might see when it's correct? Well, it's certainly a lower bound--but how much longer will the correct version run? The buggy one has presumably built the trees, and is doing all the work, but if the maxima are taking it immediately to Nothing, we're only skipping a couple of O(log(n)) operations times a thousand input lines, where n is no bigger than a bit over 4,000 (the number of 49-digit Ys). I'm hoping the corrected version won't take all that much longer... but we can't know until we have it right. More news as it happens.