Sunday, July 20, 2014

Flashback, Part Two

Last time we found that our program to count ways to make change, while much better than the engineering class's brute force nested DO loops, isn't up to snuff.

There is a better way, it turns out. If you look at the Rosetta Code entry for this problem, which they call "Count the Coins", you'll find a number of approaches. For Haskell, there are two. The first "naive" program is essentially what we wrote (save that there's no explicit case for a one-element list; it's a little less efficient, but the recalculation may well swamp that difference).

The other Haskell program takes a different approach:

count = foldr addCoin (1:repeat 0)
        where addCoin c oldlist = newlist
              where newlist = (take c oldlist) ++ zipWith (+) newlist (drop c oldlist)

and if you look at the main for the program,

main = do
    print (count [25,10,5,1] !! 100)

    print (count [100,50,25,10,5,1] !! 100000)

you see that count returns a list that one indexes by the amount one wants to know how many ways to make change for (remember that the head of a list has index 0), reminiscent of the elegant Haskell Fibonacci sequence definition

fibs = 0 : 1 : (zipWith (+) fibs (tail fibs))

If you try it, you'll find it runs a lot faster than our program.

Consider the base case, no denominations, and then how to add a denomination. With no denominations, the only amount you can make change for is zero, so if count is correct it should hand back an infinite list whose head is 1 and all the other elements zero. That's exactly the

1 : repeat 0

that is the starting value for the fold.

So, suppose we have the list that corresponds to the first n denominations. How do we use that to generate the list for the first n + 1?

Well... if there are w ways to make change for c cents (just to have a name, we'll call the smallest denomination "cents") using the first n denominations, and the (n+1)st denomination is d cents, then there are w more ways to make change for c + i * d cents, for i in [1..] when you add the new denomination. So the new list is the current list plus the current list shifted right d places plus the current list shifted right 2 * d places plus... etc. That does look like what addCoin does. (Apologies for picking names that clash with addCoin; what they call "c", we're using "d" for.) Looking like, though, isn't proof. Back when I have proof.... and why foldr  rather than foldl?

Saturday, July 19, 2014

Flashback

Back in the Cretaceous era I worked at the University of Oklahoma as a student assistant at Remote 1. OU was a [shudder] IBM big iron shop at the time, and beginning students typically got a job class that limited CPU time to thirty seconds, lest a beginner submit a program with an endless loop and chew up serious resources. (And when the computer everyone's sharing is a 370 of mid-1970s vintage, it doesn't take much to be considered serious resources. According to the Hercules FAQ, a Celeron 300 can emulate a 370 at speeds greater than a 3033, which was at least twice as fast as what OU had at the time.)

One day one of the instructors assigned his charges the following problem: count the number of ways to make change for a dollar assuming you have a generous supply of half dollars, quarters, dimes, nickels, and pennies. (The language is a bit loose; there wasn't any buying going on, just figuring out how to get coinage adding up to a dollar.) They all headed off to punch their FORTRAN programs to run using WATFIV, successor to WATFOR, the University of Waterloo FORTRAN compiler designed to give good error messages rather than worrying about optimization.

The student assistants were getting swamped with students coming in and wanting to know how to get more computer time; their change-making programs were running thirty seconds and getting killed. Every one of them had written something like this, modulo the length of that logical IF; I'm not going to bother to churn out the appropriate "continuation card"):

       IWAYS = 0
       DO 100 ICENTS = 0,100
         DO 200 INICKELS = 0,20
           DO 300 IDIMES = 0,10
             DO 400 IQUARTERS = 0,4
               DO 500 IHALFS = 0,2
                  IF (ICENTS + 5 * INICKELS + 10 * IDIMES + 25 * IQUARTERS + 50 * IHALFS .EQ. 100) IWAYS = IWAYS + 1
 500           CONTINUE
 400         CONTINUE
 300       CONTINUE
 200     CONTINUE
 100   CONTINUE
       WRITE(6,600) IWAYS
 600   FORMAT(I8,' WAYS')
       STOP
       END
       
so that no attention was paid to how much you'd accumulated in outer loops--no matter whether ICENTS was 0 or 99, it would doggedly try counts of other coins sure to push the total far over a dollar.

I wrote a recursive program in Algol W, my then language of choice, with a function that took an arbitrary array of coin denominations and amount and a main program passing the particular values for the problem the instructor set. It ran in 0.01 seconds. A friend and coworker, Raymond Schlecht (alevasholem), wrote a FORTRAN version that took less than 0.01 seconds, so from the output, which only gave run time to two decimal places, it looked like it took no time at all. We handed the listings and output to our boss, who had a chat with the instructor. I like to think the discussion was educational.

Nowadays, of course, just about everyone has computers so much faster than that 370/158 that the above brute force code would run to completion in no time at all. The old fogey in me hopes that the up and coming programmers serve time with low-end hardware, and indeed systems like Arduino mean that does happen at least some of the time. OTOH, another way to do it is to crank the input size so you still can't get away with an inefficient method, vide the small and large inputs for Code Jam problems... and we'll see that shortly as I get my comeuppance.

So, how to do it in Haskell?

waysToMakeChange :: [Int] -> Int -> Int

Obviously there's exactly one way to make change for 0.

waysToMakeChange _ 0  = 1

For a nonzero amount, you need at least one denomination!

waysToMakeChange [] _ = 0

With just one denomination, either you can do or do not, as Yoda would say.

waysToMakeChange [denom] amount
    | amount `mod` denom == 0 = 1
    | otherwise               = 0

With more than one, consider the first one. The number should be the sum of the ways to make change for the amount minus 0, 1, 2, ... of the coins of that denomination using the remaining denominations as long as the difference is non-negative.

waysToMakeChange (denom:denoms) amount
    = sum $ map (waysToMakeChange denoms) [amount, amount - denom .. 0]

That should do the trick... shouldn't it?

*Main> waysToMakeChange [50, 25, 10, 5, 1] 100
292

Do we believe that? Turns out that Rosetta Code has the same problem, only their example doesn't bother with half-dollars. The result they get for that variation is 242, and sure enough, that's what we get as well.

For completeness's sake, the Rosetta Code page shows an example with considerably larger amounts, for us it would be

waysToMakeChange [100, 50, 25, 10, 5, 1] 100000

Let's try it. (Given the output shown on Rosetta Code, we have to switch over to Integer for the result.) Now we're in the same boat as the engineering students! Even compiled, ten minutes isn't enough given the above implementation.

We need a better method. Why? This version of the code is doing a lot of recalculation. For example, if we have a half-dollar and a quarter, or three quarters, or two quarters, two dimes, and a nickel, or... we will keep evaluating, among other things, the number of ways to make change for 25 cents just using pennies. So while we have a nice recursive function that is pretty clearly correct, it's inefficient. We tried one dollar, ten dollars, and one hundred dollars, running time on the compiled program, and looking at the user line of the time output, we see

one dollar:                 0.002 s
ten dollars:                 0.009 s
one hundred dollars:  6.794 s

Definitely the larger problem isn't practical this way... but this is already long enough and has been sitting for months waiting to be published, so let's make this Part One. Stay tuned.

Monday, July 14, 2014

2010 Code Jam problem: "Store Credit"

A fellow on the haskell-beginners mailing list asked for suggestions on how to improve code he'd written to solve a 2010 Code Jam qualifying problem, "Store Credit". Here's the problem statement:
"You receive a credit C at a local store and would like to buy two items. You first walk through the store and create a list L of all available items. From this list you would like to buy two items that add up to the entire value of the credit. The solution you provide will consist of the two integers indicating the positions of the items in your list (smaller number first)."
Then the format of the input is described, and among the things we learn are:
  • The prices are all integers.
  • "Each test case will have exactly one solution."
But then one of the examples for which a solution is shown has a credit amount of 8 and this list of prices:

2 1 9 4 4 56 90 3

and the solution "4 5". At first one would--well, I did--think that violates the promise of only one solution, since two of item 4 or two of item 5 would also add up to the credit amount of 8. I would say that the problem statement is at least a little misleading, and should have said "...would like to buy two different items that add up to the entire value of the credit." It's arguably implied by the parenthetical "(smaller number first)", and perhaps the intent is to show the need to read specs carefully.

Clearly you want to keep around two items, or for a functional version, you want to generate two items as you trudge down the list of prices:
  • an array or vector of integers with C/2 elements
  • a mapping that, given a price, returns the position (or positions, given the one example) of the items with that price
The array/vector starts out as all zeroes, and for each item, you increment the element whose index is the minimum of the price and C - the price. (Ignore anything with price > C.) You've found a solution when the result of the increment is 2.

It's pretty clear how to implement this in an imperative language. We'll see what I come up with for a Haskell version.

Wednesday, June 18, 2014

Look and say sequence

If I don't already have the Haskell subreddit link over on the right, I'll add it ASAP.

This evening a Haskell beginner posted about some trouble he was having writing code to generate a particular sequence. I didn't catch on to the sequence he was going for, but I should have from a comment in his code:

enunBlock :: [Int] -> [Int] -- [2,2,2] -> [3,2] | [3] -> [1,3]  

Someone did catch on, though, and asked "Are you trying to make a look and say sequence?" The poster said yes... and off to Wikipedia I went.

Said sequence starts 1, 11, 21, 1211, 111221, ... and the way you get the next term is to take the current one and sort of run-length encode it. The first term would be "one one", i.e. a run of ones of length one, so the second term is 11. That in turn would be described as "two ones", hence the third term is 21, or "one two, one one", giving 1211, and so on.

If each digit were on its own line, you could get the next term of the sequence by piping it through uniq -c and playing some sed games to pull out some whitespace. (Hey, wait; will we ever have a run length needing two digits? Actually, you can bound it even more tightly than that, unless you pick a starting value that forces the issue. The result, and a reference, is in the Wikipedia article.)

We'll go along with the poster--as you can see from that declaration, he's using a [Int] for a term of the sequence, saving some hassle. If we can just write a function that takes a term and generates the next, then, if you've read much here or done much Haskell at all, then you know where we're headed for the sequence, namely iterate.

lookAndSay :: [Int] -> [[Int]]
lookAndSay xs = iterate nextTerm xs
     where nextTerm xs = ...

So, how to define nextTerm (that's what we're calling what the original poster called enunBlock)? Clearly our base case is

nextTerm [] = []

the more interesting one is

nextTerm (x:xs) = ....

and for it we're going to take advantage of span from Data.List:

nextTerm (x:xs) = length run  + 1 : x : (nextTerm leftovers)
      where (run, leftovers) = span (== x)  xs

put it all together after a pass through hlint, and you have

import Data.List

lookAndSay :: [Int] -> [[Int]]
lookAndSay = iterate nextTerm
    where nextTerm []     = []
          nextTerm (x:xs) = length run + 1 : x : nextTerm leftovers
                            where (run, leftovers) = span (== x) xs


UPDATE: Aha! Should've looked closer at what the poster was trying to do. He or she was looking to use group which takes a list and chops it up into a list of lists made up of the runs of equal elements; for example, if you hand it "balloon", you get back ["b", "a", "ll", "oo", "n"]. That does come closer to what we want, but we need the counts and to get rid of the duplicates where they exist, something like

nextTerm xs = concat [[length ys, head ys] | ys <- group xs]

or, if you're feeling pointless--er, pointfree,

nextTerm = concat . (map (\xs -> [length xs, head xs]) . group

and sure enough, that does the trick.

UPDATE: Darn it! You'd think that after having found and used sequence, I'd remember it. Make that

nextTerm = concat . map (sequence [length, head]) . group

and we'll be that much more concise. Ah, even better is

nextTerm = concatMap (sequence [length, head]) . group

and once you have it boiled down this far, why bother giving it a name? Once the smoke clears, the whole thing is just

import Data.List

lookAndSay :: [Int] -> [[Int]]
lookAndSay = iterate $ concatMap (sequence [length, head]) . group

Clearly I need to spend some time just rummaging through the available functions... and now I'll have Cream's "I Feel Free" stuck in my head until I take a stab at a filk, working title "I'm Pointfree".

Sunday, April 13, 2014

Fun with pattern matching

I just noticed that I did something almost without thinking.

I'm still being lazy about dealing with inputs. I haven't gotten around to using one of the many elegant parser packages that Haskell makes possible, and the Code Jam problems I've tried so far haven't required much more than skipping the leading line--kids, ask your grandparents about the card decks they would feed to their FORTRAN programs that started with a card that said how many sets of input values followed!--and handing back a list of values per line of input afterwards.

A particular Cookie Clicker Alpha input has three values, and so I passed cookieTime a list I knew would always have three elements, so that the pattern

cookieTime [farmCost, farmRate, goal] = ...

matches it and also splits out the elements of the list and binds them to names--just the way that introductory Haskell texts always mention that you could make yourself comfortable if you're used to languages that want parameter lists parenthesized by passing tuples, but you shouldn't. I will try to wean myself away from it.

Saturday, April 12, 2014

Well, drat... *spoiler alert*

I didn't do well in the qualifying round of Code Jam this year, alas. Part of it I can thank the IRS for, in a way, because we paid our annual last-minute visit to H&R Block today. But really it's my fault; I didn't arrange to have the interval during which one could enter solutions clear to devote to it.

I didn't get my improved version of Cookie Clicker Alpha done in time to count, but it did check out, so I have a moral victory at least. I probably should have devoted the time needed to do the simplest of the qualifying problems, Magic Trick, rather than just sketching out the solution.

A quick summary of the problem: in the problem's version of the game "Cookie Clicker" (inspired by a real game of that name) is parametrized by three values:
  • X, a goal number of cookies to accumulate
  • C, the cost in cookies of a "cookie farm"
  • F, the rate at which one can "harvest" cookies from a cookie farm once bought
OK, actually, there's a fourth value: if you click on a giant cookie, you can gain two cookies/second. (You have to have a way to get started accumulating cookies, or you couldn't buy a cookie farm.)

One thing to note: in real life, cookies are discrete entities; a whole cookie is either there or not. In this problem, though, you should think of cookies as being produced continuously; for example, with no farms, you can lay claim to two thirds of a cookie in a third of a second.

So, the problem as posed: given X, C, and F, figure out the soonest you can accumulate the goal of X cookies. Of course, cookies you spend on farms don't count towards X, though of course it will usually pay off to buy farms. (Not necessarily, though! If X < C, best to just click away.)

As usual for Code Jam, the input is a line with the number of cases of the problem to solve, followed by lines with the data for the cases. For this one, it's one line per case, with C, F, and X (which are "real" numbers) on the line in that order, separated by spaces. The output format? You should know by now:

Case #<number>: <time>

where <time> is the minimum time to get X cookies with the given cost and rate per cookie farm.

Magic Trick is simple enough that you don't have to worry about optimization the way you do for problems like Fair and Square last year... but Cookie Clicker has two inputs to try, and my initial solution was far too slow to handle the larger input within the time limit.

Spoilers behind the break...

Sunday, April 06, 2014

Letting the compiler do it for you

A while back I posted about how looking at the type can give you hints or even tell you what a function has to do, giving the simple example of function composition and pointing at a video by Matthew Brecknell showing how you can enlist ghci to help you in that respect.

Here's another example, provoked by a comment on a reddit question, namely map. If you only knew

map :: (a -> b) -> [a] -> [b]

could you write map? The signature says "give me a function that maps as to bs and a list of as, and I'll give you back a list of bs". Well, you could be a wise guy and write

map _ _ = []

and if anyone complained, say "Hey, the empty list is a list of bs", but in your heart you'd know you were cheating. The only way you're given to get a b given an a is to use the function, so the obvious thing to write is

map f xs = [f x | x <- xs]

(Yes, you could be a wise guy at the next level up and write

map f xs = reverse [f x | x <- xs]

but again, your conscience should bother you.)

Well, it turns out that at least sometimes, a compiler can do some of that kind of reasoning for you. Not in Haskell (though there's been a proposal), but in languages that support what are called "dependent types". Dependent types depend on a value; the canonical example is parametrizing lists by their length. That lets the language's type checker catch more errors.

"Hey, wait!" you say, "Does that mean I can't write my cool

fibs = 1 : 2 : (zipWith (+) fibs (tail fibs))

to neatly specify the Fibonacci sequence?" Alas, I suspect it does--after all, it doesn't have a finite length. There may be a way around that in this case, but the parametrization means you have to do calculations when type checking: "is this list empty?" "is this list at least as long as that list?" and in the general case, it can be undecidable.

All that said, it's still pretty darned impressive what can be done, and there's an excellent video from David Raymond Christiansen that shows it being done in the language Idris (which recently had a new version released). Check it out.

Wednesday, April 02, 2014

It's in the papers...

I don't necessarily agree with the New York Times's editorial position, but I applaud this fellow.
"Haskell in the Newsroom" is worth your time.

Tuesday, March 25, 2014

IMVU and Haskell

Check out "What it's Like to Use Haskell" over on the IMVU Engineering Blog. The author is enthusiastic, with enthusiasm based on, but also tempered by, the results of real-world use. Thank you, Mr. Friesen.

Sunday, February 23, 2014

Another thing someone's probably done before

In part it comes from still being fairly new to the language, but I think it also has to do with the seriously cool high level of abstraction Haskell lends itself to. When I come up with a useful function, it's already been implemented, and far more generally. The first time around I wrote a special case of sequence. Here's my next function, and I am sure I just have to come up with the right monadic way to think of it to get the right string to hand to Hoogle to find the official version.

maybeIterate :: (a -> Maybe a) -> a -> [a]
maybeIterate f x = x : maybeIterate' (f x)
  where maybeIterate' Nothing   = []
        maybeIterate' (Just fx) = fx : maybeIterate' (f fx)

So in the background I will keep an eye out for blog posts like the one that clued me in to sequence, and I will continue the study of monads.

Saturday, February 22, 2014

Numerical methods and C++

I sometimes have to use C++. What I'm curious about is numerical methods code for C++. Functions for finding zeroes or fixed points of functions, or numerical integration or differentiation, typically take as a parameter (a pointer to) the function being integrated, differentiated, or whatever. Here comes the fun part: what if that function is a member function?

You can't pass a pointer to a member function to something wanting a plain old pointer to function. The C++ FAQ section 33.2 answers the question "How do I pass a pointer-to-member-function to a signal handler, X event callback, system call that starts a thread/task, etc?" with the one word "Don't", followed by a hack to give the effect involving a function that returns the result of calling the member function via a global pointer to an object of the appropriate class that you have to assign to. Note the use of the "G word" there--so you can kiss the hack goodbye if multiple threads try it.

Googling "numerical methods in C++" turns up a fair number of book titles. I hesitate to buy a C++ book for the same reason I scrupulously avoid buying Microsoft products, but I may have to make an exception.

Wednesday, February 05, 2014

NixOS

If you are a Linux user, you probably have heard at least a bit about the package manager/format hoohah. .rpm, .deb, etc.... It's been blamed for dissuading companies from porting their apps to Linux, and one must admit that it's a pain when a package isn't available using the package manager your distribution uses (*cough*Java*cough*).

It's only recently that I've found out about the Nix package manager. I'm somewhat embarrassed, because it's based on a lazy, pure functional language. You can have multiple versions of a package installed at the same time, eliminating what in the Windows world is called "DLL hell". It supports multiple users, so non-privileged users can install packages. (And if two users happen to install the same package, there will only be one instance of it.)

The Nix package manager is the basis for the Linux distribution NixOS. It may not be for everybody; there are programs out there that still have wired-in assumptions about where things live that NixOS doesn't necessarily follow. OTOH, it may well be for me, judging by a blog post titled "How I Develop with Nix". (Do also check out the comments from where a link was posted on Reddit.) In particular, check out the section titled "Profiling", which talks about getting around the very problem that made me swear off cabal, i.e. not having profiling versions of packages.

Think I'll start out with it on a virtual machine, and once I get some more experience, maybe it will become my distro of choice. (Admittedly I have yet to migrate to Sabayon as I've been making noises about...)

Tuesday, January 14, 2014

Getting fed up...

Sigh. First streams, now matrices. The Ubuntu repositories don't have all the Haskell packages one would like. I guess it's going to come to removing the Ubuntu packages and building ghc and associated code myself or via.... [cue ominous music]... cabal.

I may combine that with a switch to Sabayon Linux, which will let me get back into Gentoo with minimal hassle, and also return to Enlightenment.

Once I can get the packages I want, it should be pretty easy to do Project Euler #11; the main trick for me is reading the input, and I think something like this should do the trick:

import qualified Data.Matrix as M
import qualified Data.ByteString.Char8 as B

parseLine :: B.ByteString -> [Int]
parseLine s = case B.parseInt s of
                Nothing      -> []
                Just (n, s') -> n : parseLine s'

main = do
         s <- getContents
         let m = M.fromLists $ map (parseLine . B.pack) (lines s)

and then it's just how to generate the contiguous portions of the matrix so I can find which one has the maximum product.

A flashback and analogy

You've probably heard about how the notion of sum types (e.g. Algol 68 union s, Rust enum s, Haskell type s) and product types (e.g. tup...