Saturday, June 01, 2013

So that's why they are taking time...

We have yet to peek further into the profiling output; it's overdue.

The "cost center" tells you the total time used in particular functions, and that helps find slow functions--but while that's a sufficient condition for being a cost center, it's not a necessary condition. It may just be a function that someone's calling a lot.

Perhaps that's the case with tenToThe. The call tree shows calls in context, and when we look at it we see the following (edited slightly due to line length):

     noTwos                       Main 184          43    0.8    2.0    31.9   28.6
      tenToThe                    Main 229          21    0.0    0.0     0.0    0.0
      noTwos.base                 Main 186          42    0.0    0.0    31.2   26.6
       justOnes                   Main 187          42    8.8   13.3    31.2   26.6
        justOnes.pair             Main 195       85582    7.3    6.4    17.3    6.4
         tenToThe                 Main 196      171164   10.0    0.0    10.0    0.0
        justOnes.shell            Main 191          42    0.0    0.0     0.0    0.0
         tenToThe                 Main 192          42    0.0    0.0     0.0    0.0
        justOnes.halfN            Main 190          42    0.0    0.0     0.0    0.0
        justOnes.innards          Main 188          42    0.4    1.0     5.0    7.0
         choices                  Main 189       60804    4.6    5.9     4.6    5.9
      noTwos.halfN                Main 185          42    0.0    0.0     0.0    0.0


pair is the hot spot for tenToThe calls. Let's see what we can do about pair.

Originally:

justOnes :: Int -> Int -> [Integer]

justOnes n o =
    let halfN = n `div` 2
        shell = tenToThe (n - 1) + 1
        innards = concat [(halfN - 1) `choices` k | k <- [0..o]]
        pair i = tenToThe i + tenToThe (n - (i + 1))
    in [shell + sum (map pair c) | c <- innards]


First time I ever used "innards" in a program! Let's memoize pair (are you starting to notice a theme here?), giving

        ascending  = take (halfN - 1) (drop 1 powersOfTen)
        descending = take (halfN - 1) (reverse (take (n - 1) powersOfTen))
        memoPair = zipWith (+) ascending descending
        pair i = memoPair !! (i - 1)


zipWith generalizes zip; in fact, you could write zip as

zipWith (\x y -> (x, y))

Results? Best observed time output:

real    0m0.234s
user    0m0.216s
sys     0m0.016s


Worst observed time output:

real    0m0.242s
user    0m0.228s
sys     0m0.012s


And the cost centers:

COST CENTRE              MODULE                %time %alloc

nDigitYs                 Main                   21.8   19.1
fromList.(...)           SemilatticeSearchTree  11.4   20.2
justOnes                 Main                   10.0   14.2
digitsIn.digitsIn'       Main                    9.5    8.9
fromList                 SemilatticeSearchTree   4.7    0.5
justOnes.pair            Main                    4.3    0.0
choices                  Main                    4.3    6.4
process                  Main                    3.8    8.3
floorSqrt.floorSqrt'.y   Main                    3.8    4.2
digitsIn                 Main                    2.4    0.2
satisfy                  SemilatticeSearchTree   2.4    0.6
mkBranch                 SemilatticeSearchTree   2.4    2.4
choose                   Main                    1.9    2.2
meet                     SemilatticeSearchTree   1.9    0.0
satisfy                  SemilatticeSearchTree   1.9    0.0
findFirst                SemilatticeSearchTree   1.9    0.0
numNDigitYs.numNoTwos.n' Main                    1.4    0.9
noTwos                   Main                    0.9    2.1
meet                     SemilatticeSearchTree   0.9    2.6
yTree                    Main                    0.5    3.1
justOnes.innards         Main                    0.0    1.1


tenToThe is out of the running, and other things are moving up in percentage of time. Even with the new list, the total allocation is down from about 165 MB to 155 MB.

It's important to be guided by data. Looking at another part of the code,

noTwos :: Int -> [Integer]

noTwos n
    | n == 1     = [1,3]
    | even n     = base
    | otherwise  = concat [[p, p + tenToThe halfN] | p <- base]
    where halfN = n `div` 2
          base = justOnes n (min 3 (halfN - 1))


The otherwise (actually odd n) case seems like it's doing a bunch of work, creating a bunch of two-item lists and concatenating them... but changing it to

    | otherwise  = base ++ [p + tenToThe halfN | p <- base]

turrned out to be slower.

I would say that this is the first optimization I've done that is arguably less than clear. (OK, maybe the switch from String to ByteString, but I would disagree.)

No comments:

Riddler Classic, May 23, 2020—Holy Mackerel!

Another one using Peter Norvig's word list . It turns out that the word "mackerel" has a curious property: there is exactly ...