Sunday, June 23, 2013

So, how am I being wasteful?

Let me count the ways:
  • We're keeping the combinations for noTwos (2 * k) and noTwos (2 * k + 1)around forever. How can we let go of them when both those have been calculated? The simplest way would be to generate nDigitYs (2 * k) and nDigitYs (2 * k + 1) at the same time, but that has the potential to make us do extra work. This is Haskell, after all; laziness is a virtue. There has to be a better way. 
  • What is up with pairSum? (I renamed it to fit convention.) It's using up 11% of the time and nearly 12% of storage allocations; the graph shows it accumulating two megabytes of heap. Are we piling up thunks?
Alas, this isn't as simple as the Real World Haskell Chapter 25 example. There you know that accumulating and counting items on a list, which is all the example program does, ought to be doable in a constant amount of memory. We, on the other hand, are accumulating things as we go. Ideally, all we accumulate would be
  • The trees for the various numbers of digits that we actually have to search in
  • The memoized partial sums of numNDigitYs
Of course, we are also keeping around powers of ten and two, and, right now, the lists of lists of Ints for noTwos.

The immediate temptation is to make numXs take three parameters and return two values, with the added input and output being the extranea that we want to keep around for a while (initially empty, of course, and the added output of one call being passed in as the added input for the next). That seems ugly, though; the exact opposite of information hiding. I'm sure someone's thought of this sort of situation and dealt with it; I just have to learn about it. In the meantime, there's still pairSum to optimize.

No comments:

Riddler Classic, May 23, 2020—Holy Mackerel!

Another one using Peter Norvig's word list . It turns out that the word "mackerel" has a curious property: there is exactly ...