Monday, May 27, 2013

The remaining time sinks

Counting Digits

We unrolled digitsIn' before; perhaps we could use another shot. Add another couple of lines for base^8 and the cost centers change:

COST CENTRE            MODULE                %time %alloc

solve                  Main                   39.5   56.0
nDigitYs               Main                   11.2    8.6
fromList.(...)         SemilatticeSearchTree   8.2    9.1
tenToThe               Main                    7.9    0.0
justOnes.pair          Main                    4.2    3.1
justOnes               Main                    3.7    6.4
digitsIn.digitsIn'     Main                    3.0    4.1
floorSqrt.floorSqrt'.y Main                    2.8    1.9
meet                   SemilatticeSearchTree   2.3    0.0
satisfy                SemilatticeSearchTree   2.1    0.0
fromList               SemilatticeSearchTree   1.9    0.2
digitsIn               Main                    1.6    0.1
findFirst              SemilatticeSearchTree   1.2    0.0
choices                Main                    0.7    2.9
mkBranch               SemilatticeSearchTree   0.7    1.1
yTree                  Main                    0.5    1.4
meet                   SemilatticeSearchTree   0.5    1.2


Not shabby. It's down from 11.5% of the time to 3%, and 7.6% of allocations to 4.1%. If we were hard core, we'd go ahead and do specific base 10 and base 2 versions and avoid recalculating the powers of the base, but there are bigger fish to fry, especially the elephant in the room (metaphor blender set to purée!), solve.

I/O Bound?

So here's that outer layer, courtesy of jagd:

solve i num = do
    [a, b] <- getLine >>= return . map read . words
    putStrLn $ "Case #" ++ show i ++ ": " ++ (show $ numXs a b)
    if i < num
       then solve (i+1) num
       else return ()


main = do
   num <- getLine >>= return.read
   solve 1 num


The input is reminiscent of the old days of FORTRAN 66 where, rather than trying to detect end of file, you'd start your input card deck with a card that told your code how many more cards there were to read.

We've moved from generating and testing a huge list of palindromes to the point that the above code takes two fifths of the CPU time and does nearly three fifths of the trips to the heap. What is this doing to take so much time with an input of just 1,001 lines comprising a little over 160K of data?

Once I thought it was the repeated walks to concatenate up a String to feed to putStrLn, so I broke the one putStrLn down into three putStr calls and a putStrLn, but it made no difference that I could see, so I suspect it's the input.

I've heard that one issue is the use of String, a synonym for [Char], and that there's a ByteString type that carries less overhead... but that doesn't suggest anything, except maybe trying to ask for the flavor of getLine that hands back a ByteString. I'll investigate that, and also take a look at how some of the other posted solutions read and parse the input, to see what I can learn. It bugs the heck out of me to think that I'm spending that much time and taking so many trips to the well to read such a dinky file.

UPDATE: Out of idle curiosity, I ran the first large input file, 10,000 ranges bounded above by 1014. time output:

real    0m0.300s
user    0m0.288s
sys     0m0.008s


The profiler output shows solve taking not quite three fourths of the time and doing 90% of the allocation! There's got to be a better way.

No comments:

Riddler Classic, May 23, 2020—Holy Mackerel!

Another one using Peter Norvig's word list . It turns out that the word "mackerel" has a curious property: there is exactly ...