Sunday, August 06, 2017

a longest path problem

fivethirtyeight.com has a weekly column, "The Riddler", which poses two problems each week. The first, "Riddler Express", is intended to be an easier problem that one can solve quickly, while the second, "Riddler Classic", is more difficult. Modulo vacations, the column appears each Friday, and if you submit a solution by the end of the following Sunday (Eastern Time), you will be among those who might be given credit for the solution in the next column.

The July 28th column's Classic problem is as follows: what's the longest sequence of integers x[i] for 1 ≤ i ≤ n such that
  • for all i, 1 ≤ x[i] ≤ 100
  • for all i < n, either x[i+1] is a multiple of x[i] or x[i+1] is a factor of x[i]
  • for all i, j, x[i] = x[j] iff i = j, i.e. the x[i] are all distinct
(The last constraint avoids trivial sequences like 2, 4, 2, 4, 2, 4... which can go on forever if repeats are permitted.)
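
The three constraints are easy to check mechanically. Here is a minimal sketch of a checker; the name is_valid_chain and the limit parameter are mine, not the Riddler's:

```python
def is_valid_chain(xs, limit=100):
    """Return True if xs satisfies all three constraints for the given limit."""
    # Constraint 1: every entry lies in 1..limit.
    if any(not 1 <= x <= limit for x in xs):
        return False
    # Constraint 3: no value appears twice.
    if len(set(xs)) != len(xs):
        return False
    # Constraint 2: each adjacent pair is related by divisibility one way or the other.
    return all(a % b == 0 or b % a == 0 for a, b in zip(xs, xs[1:]))
```

For instance, is_valid_chain([3, 6, 2, 4, 1, 5]) holds, while [2, 4, 2] fails on the repeat and [2, 3] fails on divisibility.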

One way to characterize this problem is to look at it as a graph with a hundred vertices v[i]. Label v[i] with i, and connect v[i] and v[j], for i ≠ j, with an edge if either i is a multiple of j or j is a multiple of i. Then the problem asks for the longest path through some (or all, if it proves possible!) of the nodes of the graph following the edges with no node visited more than once. (If you solve the one-hundred node problem, the Riddler suggests you try the analogous problem with a thousand nodes.)
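
One way to build that graph, as a sketch: walk the multiples of each label rather than testing all pairs. The function name build_graph and the dict-of-lists representation are assumptions made for illustration.

```python
def build_graph(n=100):
    """Map each label 1..n to the sorted list of labels it shares an edge with."""
    neighbors = {i: [] for i in range(1, n + 1)}
    for i in range(1, n + 1):
        # Every proper multiple of i up to n is joined to i by an edge.
        for m in range(2 * i, n + 1, i):
            neighbors[i].append(m)
            neighbors[m].append(i)
    for adj in neighbors.values():
        adj.sort()
    return neighbors
```

With n = 100, vertex 1 is adjacent to all 99 other vertices, while a large prime such as 97 is adjacent only to 1, so it can appear only at an end of a path or next to 1.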

Alas, longest path is NP-hard. If you come up with an efficient way to solve this, your name will go down in computational complexity history. So I don't feel too bad starting out with a brute force depth-first search, and since I'm working to improve my Python skills, I start in Python.

The Riddler reports that one person ground through 30 million trials and came up with a sequence of length 55, which makes me suspect that our programs are similar. I started out with the obvious code using Python's built-in set and list types, and after over ten hours, that version had only managed to find a sequence of length 54. Then I started rewriting.
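
For reference, a set-and-list depth-first search might look like the sketch below. This is a reconstruction of the general idea, not my actual code, and longest_chain_naive is a name made up for the example. It is exhaustive, so it only finishes for small n; for n = 100 you would have to print improvements as they turn up and interrupt it.

```python
def longest_chain_naive(n):
    """Exhaustive depth-first search for a longest chain on 1..n."""
    # neighbors[i] lists every j != i in 1..n where one divides the other.
    neighbors = {
        i: [j for j in range(1, n + 1) if j != i and (i % j == 0 or j % i == 0)]
        for i in range(1, n + 1)
    }
    best = []

    def extend(path, used):
        nonlocal best
        if len(path) > len(best):
            best = path
        for nxt in neighbors[path[-1]]:
            if nxt not in used:
                # A fresh list and a fresh set per call: simple, but allocation-heavy.
                extend(path + [nxt], used | {nxt})

    for start in range(1, n + 1):
        extend([start], {start})
    return best
```

On tiny inputs it is instant: longest_chain_naive(6) returns a chain that uses all six numbers.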
  • Step One: switch from the Python set type to using an integer (Python 3 has arbitrary precision integers, so yay). That did speed things up.
  • Step Two: rather than making lists come and go, have the recursive calls build up the sequence in a single preallocated array. That sped things up even more.
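
A sketch of what the two steps amount to, again as an illustration rather than the code I actually ran: the visited set becomes the bits of one integer, and every recursive call writes into the same array. The name longest_chain_bitmask and the adjacency-as-bitmask trick are choices made for this example.

```python
def longest_chain_bitmask(n):
    """Exhaustive search using an int as the visited set and one shared array."""
    # adjacency[i] has bit j set exactly when i and j are joined by an edge.
    adjacency = [0] * (n + 1)
    for i in range(1, n + 1):
        for m in range(2 * i, n + 1, i):
            adjacency[i] |= 1 << m
            adjacency[m] |= 1 << i

    path = [0] * n  # shared by every recursive call; only path[:depth] is live
    best = []

    def extend(depth, used):
        nonlocal best
        if depth > len(best):
            best = path[:depth]  # copy only when a new record is found
        # Unvisited neighbors of the last vertex, as a bitmask.
        candidates = adjacency[path[depth - 1]] & ~used
        while candidates:
            low = candidates & -candidates  # isolate the lowest set bit
            path[depth] = low.bit_length() - 1
            extend(depth + 1, used | low)
            candidates ^= low

    for start in range(1, n + 1):
        path[0] = start
        extend(1, 1 << start)
    return best
```

The search is still exponential; the change only trims the constant factor per node visited.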
The result: the length of the best known sequence zips up to about the mid-40s very quickly indeed, but length 55 took at least two hours to appear. We're at four hours plus change, and 56 has yet to be seen. (UPDATE: 56 showed up sometime between four and ten hours in. Next version timestamps the output.)

Next stage will be a Go version to avoid interpreter overhead, and then dividing it up to take advantage of multiple cores, though given that the 55-number sequence is still very early on in the sequences starting with 1 (!?), I wonder how much good that will do. Here's that 55-number sequence, by the way:
1, 2, 4, 8, 16, 32, 96, 3, 6, 12, 24, 72, 36, 18, 54, 27, 81, 9, 99, 33, 66, 22, 44, 88, 11, 55, 5, 40, 80, 20, 60, 30, 90, 45, 15, 75, 25, 50, 100, 10, 70, 14, 56, 28, 84, 42, 21, 63, 7, 91, 13, 39, 78, 26, 52
The longest submitted sequences have length 77. More news as it happens. (Might experiment with how to pick the next item to try: the one with the most or least possible successors, perhaps? OTOH, if it were most, you'd always start with 1, and neither of the length 77 sequences the Riddler shows starts with 1.)
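
A sketch of what that experiment could look like, untested as a speedup and with all names hypothetical: sort the unused neighbors by how many onward moves each one would still have, fewest first or most first.

```python
def build_neighbors(n=100):
    """Map each label 1..n to its neighbors in the divisibility graph."""
    neighbors = {i: [] for i in range(1, n + 1)}
    for i in range(1, n + 1):
        for m in range(2 * i, n + 1, i):
            neighbors[i].append(m)
            neighbors[m].append(i)
    return neighbors


def free_successors(v, used, neighbors):
    """Count the neighbors of v that are not yet in the chain."""
    return sum(1 for w in neighbors[v] if w not in used)


def ordered_candidates(current, used, neighbors, most_first=False):
    """Unused neighbors of current, sorted by how many onward moves each has."""
    candidates = [w for w in neighbors[current] if w not in used]
    return sorted(
        candidates,
        key=lambda w: free_successors(w, used, neighbors),
        reverse=most_first,
    )
```

A search would iterate over ordered_candidates(...) instead of the raw neighbor list. The same count explains the remark about 1: with nothing used yet, 1 has 99 possible successors, more than any other starting point.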

