Still thinking about how to keep those combinations (or palindromes, now that we see we can avoid some sorting if we go ahead and generate them) around only as long as necessary. One thing that's kind of tempting is to take advantage of what we know about when we can reuse these values--we can generate them for one of 2 * k and 2 * k + 1 and then reuse them (with some modification) for the other. So, why not generate the trees two at a time?
"Well, you don't know that you'll need both of them," you object. "The Code Jam people could make you waste your time by generating input that only asks about ranges corresponding to Ys with even numbers of digits (or odd, take your pick)." And you're right, they could. Our attempts to speed things up can be subverted... but I think this one is worth a try.
At least initially, I'd be inclined to generate a list of pairs of trees, with the first being for 2 * k digits, the second being for 2 * k + 1. (Yes, one-digit Ys are a special case again.) More news as it happens.
UPDATE: new best time output:
real 0m0.192s
user 0m0.160s
sys 0m0.028s
Total allocation is down to not quite 117 MB, compared with 124 MB before, and "maximum residency" is around 9.6 MB. GC time is down to 41.2% of total execution time. I guess that's better than almost half, but that still seems high. One thing that is gratifying: the first time I used -sstderr to save GC info, it listed almost 160 MB copied during GC and over 16 MB maximum residency. Now the maximum copying during GC is just 63 MB, less than half what it was before.
And that's using spread, which is expensive because it does Integer divides. We'll do that next, and then I really should try to ditch the gratuitous meets of values in the trees that we never search for. That should be pure gravy.
UPDATE: spread is worth it to avoid having to sort again. (Before we weren't taking advantage of that, so spread was overhead.)
random notes and thoughts, mostly about Haskell these days, of a rather past middle-aged programmer
Saturday, June 29, 2013
Improvements
Chaddaï Fouché kindly responded to a query I put out on the haskell-beginners mailing list, suggesting:
iterate :: (a -> a) -> a -> [a]
Give it a function f and a starting value s and it will hand you back
[s, f s, f . f s, f . f . f s, ...]
So, for example, rather than
powersOfTen = map (10 ^) [0..]
we can write
powersOfTen = iterate (10 *) 1
and instead of
bigTwos = map (2 ^) powersOfTwo
we can write
square n = n * n
bigTwos = iterate square 2
Not shabby, eh?
About those Vectors: they're a data structure that makes for O(1) (i.e. constant time) indexing, as opposed to the O(n) time for lists. There are two flavors: Data.Vector and Data.Vector.Unboxed. The unboxed version has lower overhead, but can't be used on all types.
So I added
import qualified Data.Vector as V
and changed a little bit of code:
memoPair = V.generate halfN (\i -> tenToThe i + tenToThe (n - 1 - i))
pair i = memoPair V.! i
pairSum v = V.foldl' (+) shell (V.map pair v)
choices :: Int -> Int-> [V.Vector Int]
m `choices` n
| n == 0 = [V.empty]
| m == n = [V.enumFromStepN m (-1) m]
| otherwise = [m `V.cons` c | c <- (m - 1) `choices` (n - 1)]
++ ((m - 1) `choices` n)
and indeed it helped. I hope it will help more soon; I can't have an unboxed vector of Integers (or is that a vector of unboxed Integers?), so if I want the choices Ints unboxed, I can't use the unboxed vector map; I'll have to roll my own function for that.
A bigger payoff, for now, had to do with sorting, or the minimizing thereof. It's easy to make oneTwos come out in order; noTwos is the gotcha. We did the folliowing:
nDigitYs n = (merge (oneTwos n) (noTwos n)) ++ twoTwos n
where twoTwos n
| even n = [twoShell]
| otherwise = [twoShell, twoShell + tenToThe halfN]
where twoShell = 2 * shell
oneTwos n
| even n = []
| otherwise = map (+ (shell + 2 * tenToThe halfN))
(0 : map pair [halfN - 1,halfN - 2..1])
noTwos n
| even n = base
| otherwise = merge base [p + tenToThe halfN | p <- base]
where base = sort $ map pairSum (noTwosChoices !! (halfN - 1))
and poof! The profiler output shows total time as 174 ticks, 0.17 seconds. The best time output is now
real 0m0.214s
user 0m0.176s
sys 0m0.028s
No reduction in the percentage of time used for garbage collection, though.
You know... now it may be worth keeping palindromes rather than combinations, because we would only have to sort them once instead of twice--and that might cut the memory usage as well.
- using iterate
- using Data.Vector or MemoTrie instead of lists where possible
iterate :: (a -> a) -> a -> [a]
Give it a function f and a starting value s and it will hand you back
[s, f s, f . f s, f . f . f s, ...]
So, for example, rather than
powersOfTen = map (10 ^) [0..]
we can write
powersOfTen = iterate (10 *) 1
and instead of
bigTwos = map (2 ^) powersOfTwo
we can write
square n = n * n
bigTwos = iterate square 2
Not shabby, eh?
About those Vectors: they're a data structure that makes for O(1) (i.e. constant time) indexing, as opposed to the O(n) time for lists. There are two flavors: Data.Vector and Data.Vector.Unboxed. The unboxed version has lower overhead, but can't be used on all types.
So I added
import qualified Data.Vector as V
and changed a little bit of code:
memoPair = V.generate halfN (\i -> tenToThe i + tenToThe (n - 1 - i))
pair i = memoPair V.! i
pairSum v = V.foldl' (+) shell (V.map pair v)
choices :: Int -> Int-> [V.Vector Int]
m `choices` n
| n == 0 = [V.empty]
| m == n = [V.enumFromStepN m (-1) m]
| otherwise = [m `V.cons` c | c <- (m - 1) `choices` (n - 1)]
++ ((m - 1) `choices` n)
and indeed it helped. I hope it will help more soon; I can't have an unboxed vector of Integers (or is that a vector of unboxed Integers?), so if I want the choices Ints unboxed, I can't use the unboxed vector map; I'll have to roll my own function for that.
A bigger payoff, for now, had to do with sorting, or the minimizing thereof. It's easy to make oneTwos come out in order; noTwos is the gotcha. We did the folliowing:
nDigitYs n = (merge (oneTwos n) (noTwos n)) ++ twoTwos n
where twoTwos n
| even n = [twoShell]
| otherwise = [twoShell, twoShell + tenToThe halfN]
where twoShell = 2 * shell
oneTwos n
| even n = []
| otherwise = map (+ (shell + 2 * tenToThe halfN))
(0 : map pair [halfN - 1,halfN - 2..1])
noTwos n
| even n = base
| otherwise = merge base [p + tenToThe halfN | p <- base]
where base = sort $ map pairSum (noTwosChoices !! (halfN - 1))
and poof! The profiler output shows total time as 174 ticks, 0.17 seconds. The best time output is now
real 0m0.214s
user 0m0.176s
sys 0m0.028s
No reduction in the percentage of time used for garbage collection, though.
You know... now it may be worth keeping palindromes rather than combinations, because we would only have to sort them once instead of twice--and that might cut the memory usage as well.
Sunday, June 23, 2013
So, how am I being wasteful?
Let me count the ways:
The immediate temptation is to make numXs take three parameters and return two values, with the added input and output being the extranea that we want to keep around for a while (initially empty, of course, and the added output of one call being passed in as the added input for the next). That seems ugly, though; the exact opposite of information hiding. I'm sure someone's thought of this sort of situation and dealt with it; I just have to learn about it. In the meantime, there's still pairSum to optimize.
- We're keeping the combinations for noTwos (2 * k) and noTwos (2 * k + 1)around forever. How can we let go of them when both those have been calculated? The simplest way would be to generate nDigitYs (2 * k) and nDigitYs (2 * k + 1) at the same time, but that has the potential to make us do extra work. This is Haskell, after all; laziness is a virtue. There has to be a better way.
- What is up with pairSum? (I renamed it to fit convention.) It's using up 11% of the time and nearly 12% of storage allocations; the graph shows it accumulating two megabytes of heap. Are we piling up thunks?
- The trees for the various numbers of digits that we actually have to search in
- The memoized partial sums of numNDigitYs
The immediate temptation is to make numXs take three parameters and return two values, with the added input and output being the extranea that we want to keep around for a while (initially empty, of course, and the added output of one call being passed in as the added input for the next). That seems ugly, though; the exact opposite of information hiding. I'm sure someone's thought of this sort of situation and dealt with it; I just have to learn about it. In the meantime, there's still pairSum to optimize.
Subscribe to:
Posts (Atom)
Riddler Classic, May 23, 2020—Holy Mackerel!
Another one using Peter Norvig's word list . It turns out that the word "mackerel" has a curious property: there is exactly ...
-
Back in the Cretaceous era I worked at the University of Oklahoma as a student assistant at Remote 1. OU was a [shudder] IBM big iron shop a...
-
You've probably heard about how the notion of sum types (e.g. Algol 68 union s, Rust enum s, Haskell type s) and product types (e.g. tup...
-
Verbal Abuse as Entertainment When I grew up, my parents always told me that there was a sort of person who needed to tear down others t...