Thursday, July 04, 2013

"Evelyn was a modified dog..."

So, how can we modify the tree to do what we'd like?

As the code stands, it's set up to say that if a is in the type class Semilattice, you can create a SearchTree of as that you can efficiently search. What we want to say is that if you have some type a that has a function key :: a -> b where b is in the type class Semilattice, you can create a SearchTree of as that you can efficiently search based on the result of key. Then, if we have a list of (palindrome, position) pairs, our key function is just fst (the function that gives you the first item in a pair). Now, how to express that in Haskell?

UPDATE: come to think of it, wouldn't that subsume the existing cases? For them,

let key = id
UPDATE: OK, I should have remembered. You can create relationships between type classes. (Take a look at this diagram of relationships among types and type classes in the Standard Prelude.) So, it looks like I can say something like this:

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

class Semilattice a where
    meet :: a -> a -> a

class Semilattice k => Keyed r k | r -> k where
   key :: r -> k

The first type class definition is straight from Mr. van Laarhoven's code; a type is a Semilattice type if it has a meet function which satisfies the semilattice requirements for meet.

Keyed is a "multiparameter type class"; it sets up a relationship between a type r (intended to suggest "record") and a type k (intended to suggest "key"). "r -> k" is a "functional dependency". Those two features aren't in the Haskell 98 standard, but GHC supports them (if they're enabled in a pragma) and I suspect they're in Haskell'.

(Historical aside: you might not realize that the preceding paragraphs use terms that originate with Algol 68: "standard prelude" and "pragma". Algol 68 is a language that suffered from a lot of undeserved bad press; you really should read C.H. Lindsey's excellent paper on its history, and check out Algol 68 Genie. For those of you interested in functional programming, I should note that partial parametrization was up for addition to the standard and is implemented in Algol 68 Genie.)

So, I will proceed on this basis. It should be a simple rewrite of the Semilattice tree code, though I wish there were a way to have one version of it to do everything; code reuse is a Good Thing. The above lines make it past ghci, so I hope it's a good start. (Then, of course, the question is whether it really does help the Fair and Square program! OTOH, in a way I don't care, because the real goal was to learn more Haskell, not to mention that I was just lucky that the original blog post gave an example in which the Semilattice type was a pair.)

Tuesday, July 02, 2013

What next?

Yesterday it occurred to me that the profiling output might be more accurate if I just asked for -p, rather than always asking for the heap tracking. The first part of the profile output:

    Mon Jul  1 22:23 2013 Time and Allocation Profiling Report  (Final)

       ultimatepalindrome20 +RTS -p -RTS

    total time  =        0.17 secs   (167 ticks @ 1000 us, 1 processor)
    total alloc = 116,700,576 bytes  (excludes profiling overheads)

COST CENTRE            MODULE                %time %alloc

fromList.(...)         SemilatticeSearchTree  19.2   26.8
ysPair.noTwos          Main                   11.4    8.9
floorSqrt.floorSqrt'.y Main                   10.2    5.6
choices                Main                    7.8   10.7
process                Main                    7.2   11.0
fromList               SemilatticeSearchTree   6.6    0.6
ysPair.spread.(...)    Main                    4.8    2.4
ysPair.noTwos'         Main                    4.2    4.2
meet                   SemilatticeSearchTree   3.6    0.0
satisfy                SemilatticeSearchTree   3.0    0.0
main                   Main                    1.8    1.0
numYs                  Main                    1.8    0.1
ysPair.pairSum         Main                    1.8    3.5
makeYTree              Main                    1.8    4.1
bDigits                Main                    1.8    0.0
meet                   SemilatticeSearchTree   1.8    3.5
mkBranch               SemilatticeSearchTree   1.8    3.2
ysPair.noTwosChoices   Main                    1.2    0.7
bDigits.bDigits'       Main                    1.2    1.9
bound                  SemilatticeSearchTree   1.2    0.0
ysPair.noTwos'.base    Main                    0.6    1.2
ysPair.spread          Main                    0.6    2.5
ysPair.twoNPlus1List   Main                    0.0    2.9

The cost of ysPair is spread out over several lines; add them up and you get 32.4% of the time (choices count; there's only one caller) and 37% of the allocation... but then, the rest of the profile does the adding for us; let's check it out. OK, I was pretty close. fromList with its descendants takes up about another third of the time and allocation.

floorSqrt and ceilSqrt should be low-hanging fruit. As we've said before, it converges quadratically, doubling the number of valid bits each time around... but for numbers of up to 330 bits, that's as many as nine iterations starting from just one good bit as floorSqrt does. So, the temptation is to use the double precision square root directly where possible, and use it as a first approximation otherwise. With 52 good bits as a starting point, three iterations will do the job for values up to a googol.

On the other hand, coming up with some way to use the semilattice search trees to just search on one value (and only store the maximum value searched for as the meet!), with the rest of the node along for the ride, would save time and memory and be more generally applicable, and would help me with Haskell in general. That's the way to go.

A flashback and analogy

You've probably heard about how the notion of sum types (e.g. Algol 68 union s, Rust enum s, Haskell type s) and product types (e.g. tup...