lederhosen | Adventures in programming

I've been doing a course using MiniZinc, which is a specialised language for constraint/optimisation problems. It's a bit different to what I'm used to: it aims to separate the specification of the problem from the solution of the problem. Once you've told it the problem you want it to solve, it translates that into instructions to a solver.

As an example, here's one I wrote on the plane last Friday, to solve the in-flight magazine Sudoku:

include "globals.mzn";
int: box_size=3; % Side length of one of the constraint boxes in a grid
int: grid_size=box_size*box_size; % Total side length of the grid e.g. 3x3=9
set of int: Rows = 1..grid_size; % i.e. rows have index values 1 through grid_size
set of int: Cols = 1..grid_size;
array[Rows,Cols] of var 1..grid_size: grid_solved; % This is the solution we're trying to find
array[Rows,Cols] of 0..grid_size: grid_start; % Clues, with 0 = blank

% Set the standard constraints:
constraint forall(i in Rows)(alldifferent([grid_solved[i,j]|j in Cols]));
constraint forall(j in Cols)(alldifferent([grid_solved[i,j]|i in Rows]));
constraint forall(k,l in 0..box_size-1)(alldifferent([grid_solved[i+k*box_size,j+l*box_size]|i,j in 1..box_size]));
% Require that the solution matches clues
constraint forall(i in Rows, j in Cols)(grid_start[i,j]>0->grid_solved[i,j]=grid_start[i,j]);
% Tell MiniZinc that we just want a solution that satisfies these
% requirements (i.e. we're not trying to optimise anything)
solve satisfy;

% Define the clues - picked this one from
% http://www.telegraph.co.uk/news/science/science-news/9359579/Worlds-hardest-sudoku-can-you-crack-it.html
grid_start=
[|8,0,0,0,0,0,0,0,0
|0,0,3,6,0,0,0,0,0
|0,7,0,0,9,0,2,0,0
|0,5,0,0,0,7,0,0,0
|0,0,0,0,4,5,7,0,0
|0,0,0,1,0,0,0,3,0
|0,0,1,0,0,0,0,6,8
|0,0,8,5,0,0,0,1,0
|0,9,0,0,0,0,4,0,0|];

% And output the solution.
output
[ show(grid_solved[i,j])++
if j == grid_size then "\n" else " " endif
| i in Rows, j in Cols ]
;

Using the default solver that came bundled with MiniZinc (Gecode), this finds a solution in about 30-50 milliseconds.

Note that I didn't tell it how to solve the puzzle; I just told it the rules that a successful solution must obey, and MiniZinc/Gecode worked out the rest on their own. I'm sure this is old hat to some of you, but for me this is pretty impressive.

For some more complex problems, it is necessary to give the solver a bit of guidance on what strategy to use, but even there it keeps the focus on defining what the problem is, and it lets me switch from one solver to another without changing my code. I can see this being useful.

Flat | Top-Level Comments Only

That sounds interesting, but how do you know it's not just iteratively guessing random numbers until it finds a solution that matches the constraints?

Don't get me wrong; A programmable Monte Carlo solver is still useful, just not as cool.

-- Pat Scaramuzza

Several reasons, but an easy test is performance.

Test code:

array[1..1000] of var 1..1000: solution;
constraint forall(i in 1..998)(solution[i]<solution[i+2]);
constraint solution[999]<solution[2];
solve satisfy;

i.e.: create an array "solution" of length 1000, populated by integers between 1 and 1000, such that solution[i] < solution[i+2] and solution[999] < solution[2].

There's only one solution to this: [1,501,2,502,...]

MiniZinc/Gecode takes about 60 msec to find that solution. From previous experience there's about 30 msec of overhead in calling MZ, so the search time is around 30 msec.

There are 1000^1000 possible ways to populate that array, of which only one satisfies the constraints, so evidently Gecode isn't just rolling the dice until it finds a solution that works.

Even a naive search strategy that starts by guessing a value for solution[1], then solution[2], and so on, testing constraints at every step, is going to run forever because it takes waaaay too long to discover that a bad choice at the start will trip you up later. For example, if you start with solution[1]=1 & solution[2]=500, you can get very nearly to the end of the array going [1,500,2,501,...] before discovering that there's no legal option for entry 999.

My understanding is that among other things, MZ/Gecode are smart enough to use the supplied constraints to build relationships between the variables. If you know solution[i] is in 1..1000 and solution[3]>solution[1], then you can establish that solution[3] must be at least 2 and solution[1] is at most 999. The constraint that solution[5]>solution[3] then lets you restrict range for solution[1] to 1..998 and solution[5] to 3..1000, and so on.

If you build a problem complex enough, then you'll reach a point where it can't be solved purely by this kind of logic, and trial-and-error is necessary. (Or, sometimes it's easy to find a solution, but you're looking for the best solution under some metric.)

At that point, random guessing among the not-yet-eliminated values is one option for the search strategy, and sometimes it's effective - I found it useful in one of the assignment questions. But there are others. Choosing the best one is the tricky part!

Adventures in programming - MiniZinc

no subject

no subject