lederhosen: (Default)
Now and then Rey and I do play readings with friends. Usually there are rather more roles than there are readers, so "one man in his time plays many parts", which works fine until you end up playing two roles in the same scene and having to have an extended conversation with yourself.

So you want to cast roles in a way that avoids that kind of overlap, and you probably also want to make sure the different readers each get a decent share of the lines. You could do this by hand, but since I'm currently teaching myself AMPL, I thought it'd be a fun challenge to program a solution.

AMPL (A Mathematical Programming Language) is similar to MiniZinc, which I posted about a while back: it's designed for specifying optimisation/constraint problems and then passing them to a solver of one's choice.

It's very much a declarative language: instead of giving the computer a set of steps to follow, you give it a set of requirements and then let it figure out how to satisfy those requirements. (This still feels like magic to me.)

AMPL and other optimisation languages usually take input in two parts: a "model" which is a generic description of the problem and requirements, and "data" which defines a specific instance of the problem.

So, here's some AMPL code:

( The model )
( The data: )
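
For a flavour of the requirements, here's a rough brute-force sketch in Python. This is not the AMPL model, and it does no clever solving: it just tries every casting. The roles, line counts, scenes and readers are all made up for illustration, and "a decent share of the lines" is interpreted (my assumption) as keeping the busiest reader's load as small as possible.

```python
from itertools import product

# Hypothetical toy data: lines per role, and which roles share a scene.
lines = {"King": 40, "Queen": 15, "Fool": 20, "Herald": 10}
scenes = [{"King", "Queen"}, {"King", "Fool"}, {"King", "Herald"}]
readers = ["A", "B"]


def clashes(casting, scenes):
    """Return True if any reader holds two roles that appear in the same scene."""
    for scene in scenes:
        cast_here = [casting[role] for role in scene]
        if len(set(cast_here)) < len(cast_here):
            return True
    return False


def best_casting(lines, scenes, readers):
    """Try every assignment; keep the clash-free one with the smallest maximum load."""
    roles = sorted(lines)
    best, best_load = None, None
    for choice in product(readers, repeat=len(roles)):
        casting = dict(zip(roles, choice))
        if clashes(casting, scenes):
            continue
        load = {reader: 0 for reader in readers}
        for role, reader in casting.items():
            load[reader] += lines[role]
        heaviest = max(load.values())
        if best_load is None or heaviest < best_load:
            best, best_load = casting, heaviest
    return best, best_load


casting, heaviest = best_casting(lines, scenes, readers)
```

Brute force like this blows up quickly (readers to the power of roles), which is exactly why handing the requirements to a proper solver is attractive.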

In the unlikely event that anybody other than me actually wants to use this, you can download a free demo from AMPL (unlimited duration, restricts to about 300 variables i.e. number of actors x number of parts should be less than 300).

The demo comes bundled with a selection of top-notch commercial and open-source solvers, all free to use subject to that size restriction. By default it uses the MINOS solver, which is nice for generic nonlinear problems but doesn't handle integer constraints; since those are important here you'll want to use "options solver gurobi" (or cplex or xpress).
lederhosen: (Default)
I've been doing a course using MiniZinc, which is a specialised language for constraint/optimisation problems. It's a bit different to what I'm used to: it aims to separate the specification of the problem from the solution of the problem. Once you've told it the problem you want it to solve, it translates that into instructions to a solver.

As an example, here's one I wrote on the plane last Friday, to solve the in-flight magazine Sudoku:

include "globals.mzn";
int: box_size=3; % Side length of one of the constraint boxes in a grid
int: grid_size=box_size*box_size; % Total side length of the grid e.g. 3x3=9
set of int: Rows = 1..grid_size; % i.e. rows have index values 1 through grid_size
set of int: Cols = 1..grid_size;
array[Rows,Cols] of var 1..grid_size: grid_solved; % This is the solution we're trying to find
array[Rows,Cols] of 0..grid_size: grid_start; % Clues, with 0 = blank

% Set the standard constraints:
constraint forall(i in Rows)(alldifferent([grid_solved[i,j]|j in Cols]));
constraint forall(j in Cols)(alldifferent([grid_solved[i,j]|i in Rows]));
constraint forall(k,l in 0..box_size-1)(alldifferent([grid_solved[i+k*box_size,j+l*box_size]|i,j in 1..box_size]));
% Require that the solution matches clues
constraint forall(i in Rows, j in Cols)(grid_start[i,j]>0->grid_solved[i,j]=grid_start[i,j]);
% Tell MiniZinc that we just want a solution that satisfies these
% requirements (i.e. we're not trying to optimise anything)
solve satisfy;

% Define the clues - picked this one from
% http://www.telegraph.co.uk/news/science/science-news/9359579/Worlds-hardest-sudoku-can-you-crack-it.html

% And output the solution.
output [ show(grid_solved[i,j])++
if j == grid_size then "\n" else " " endif
| i in Rows, j in Cols ];

Using the default solver that came bundled with MiniZinc (Gecode), this finds a solution in about 30-50 milliseconds.

Note that I didn't tell it how to solve the puzzle; I just told it the rules that a successful solution must obey, and MiniZinc/Gecode worked out the rest on their own. I'm sure this is old hat to some of you, but for me this is pretty impressive.
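
For contrast, here is roughly what the same rules look like as a plain checker in Python. This is only a sketch under my own assumptions (the grid is a square list of lists, and 0 marks a blank clue, as in the MiniZinc version). The function names are made up. Note what it can't do: it recognises a valid grid, but it has no idea how to find one, which is the part MiniZinc/Gecode handles from the same description.

```python
def is_valid_solution(grid, box_size=3):
    """Check the row, column and box 'all different' rules on a filled grid.

    Assumes grid is a list of n lists of n ints, where n = box_size squared.
    """
    n = box_size * box_size
    full = set(range(1, n + 1))
    rows_ok = all(set(row) == full for row in grid)
    cols_ok = all({grid[i][j] for i in range(n)} == full for j in range(n))
    boxes_ok = all(
        {grid[k * box_size + i][l * box_size + j]
         for i in range(box_size) for j in range(box_size)} == full
        for k in range(box_size) for l in range(box_size)
    )
    return rows_ok and cols_ok and boxes_ok


def matches_clues(grid, clues):
    """Check that every non-blank clue (non-zero) agrees with the filled grid."""
    return all(
        clue == 0 or clue == value
        for grid_row, clue_row in zip(grid, clues)
        for value, clue in zip(grid_row, clue_row)
    )
```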

For some more complex problems, it is necessary to give the solver a bit of guidance on what strategy to use, but even there it keeps the focus on defining what the problem is, and it lets me switch from one solver to another without changing my code. I can see this being useful.
lederhosen: (Default)
Dear story problem people: I want to know what plane you have that can cruise at 40km.

Also, if you're asking a question about rolling "two fair dice", it would probably be a good idea not to accompany it with a stock photo that shows dice with a 6 on EVERY FACE.


The probability of a boy having blue eyes is 4/11 and blond hair is 2/7 from a group of students. A boy is chosen at random from that group. What is the probability the boy is blue-eyed and has blond hair?

...I don't actually know what the answer is, but I'm pretty sure it's not the 8/77 that you're looking for there.
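
To spell out the complaint: multiplying the two probabilities only works if eye colour and hair colour are independent, which the question never says. Without that assumption, all the given numbers pin down is a range. A quick sketch with exact fractions (the function name is mine):

```python
from fractions import Fraction

p_blue = Fraction(4, 11)
p_blond = Fraction(2, 7)


def joint_bounds(p_a, p_b):
    """Smallest and largest possible P(A and B) given only P(A) and P(B)."""
    lower = max(Fraction(0), p_a + p_b - 1)
    upper = min(p_a, p_b)
    return lower, upper


# The textbook's answer: valid only if the two traits are independent.
if_independent = p_blue * p_blond

lower, upper = joint_bounds(p_blue, p_blond)
print(if_independent, lower, upper)
```

So 8/77 is one possible answer, but so is anything from 0 up to 2/7.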
lederhosen: (Default)
Managed to fix* the programming problem for Tech Apps and got a very grateful phone call - apparently they'd been wrestling with this for weeks. I was coasting on a wave of "I AM A TECH GOD" until I remembered that I'd just spent six weeks procrastinating on getting my "broken" headphones replaced, before realising that I'd plugged them into the wrong port. So yeah.

Then almost called helpdesk to report a broken monitor before figuring out that if I unplugged it and replugged it, the problem went away. At least I did figure this out BEFORE calling them.

*FCVO "fix". TLDR version: they changed their code and couldn't figure out why the new results were slightly different from the old ones. After staring at ~150 pages of code for two days and looking up the mathematical technique involved, I was able to tell them: "probably not a bug as such". The old code calculates an approximation; the new one calculates an approximation for the same data via slightly different methods, so there's no guarantee that it'd give the same results.

Problem is that figuring this out requires (a) understanding of the maths involved and (b) familiarity with IML, which isn't exactly a Top Ten programming language.
lederhosen: (Default)
"Hi, this is **** from Tech Services. Do you know anything about IML? We can't figure out why our program isn't working like it should." [And no, not a program that I was involved in writing.]

...come to think of it, my supervisor is from Soviet Russia. Hmm.
lederhosen: (Default)
Talked to my editor today. He used the words "cursed book" and "nightmarish" - apparently they have had a few technical and organisational issues with this one o.O

Also, "shall we just say there have been some problems with [author who wrote Chapter 1]". I'm told this author didn't write any other chapters, so I'm HOPING the others will be better... will look at them tomorrow and find out. Certainly most of the problems felt like an author issue.

Well, I feel better knowing I wasn't imagining it.


Jun. 22nd, 2012 10:53 am
lederhosen: (Default)
Good news: have been offered another maths book to answer-check.

Better news: as of this project, the rate they pay per-page has gone up by 67%. Which means not all of it will go on tax and paying off overdue bills.

(Cheers again to Loki, who put me onto this gig a few years back.)

...hmm. When my current laptop conks out, I guess that means I can actually deduct part of the cost of a replacement?


Mar. 1st, 2012 06:36 pm
lederhosen: (Default)
I recently moved sections at work, so I have a new supervisor. One of the fun things about this is getting to hear more of her idiomatic English.

After previewing the office where we may be moving for two months:

"I vas like "OH MY GOD" - and if I vas like "OH MY GOD", and I grow up in Moscow, then all you will be like three times "OH MY GOD"!"
lederhosen: (Default)
Sometimes people write computer code that they don't expect anybody else will ever see, for a one-off project.

Sometimes that project turns out not to be a one-off after all, and I have occasion to read that code...

Imagine, if you will, you're just looking out your window. And your neighbour has left his curtains open. And he's dancing naked in his lounge room, singing to his dog, because he has no idea that anybody else can see him.

This insight brought to you by reading a program where the temporary files are called things like DATA_YEAH.

(AFAIK, the program works. But I will be wearing my metaphorical pants when I code, all the same.)
lederhosen: (Default)
Basil seems to be getting used to the place. He still spends a lot of his time playing "invisible dog" hiding in little niches around the house (quite the opposite of Dog-Or) but when I walk into the room he'll come out and say hello, and accept pats and scritchies for as long as I'm willing to give them. He seems to have hurt his foot somehow, so he may be in for a vet trip tomorrow.

Dog-Or is learning to live with him. He's been an only dog for eleven years, so I can't blame him for getting a bit jealous, but I think he'll get there. They both love w a l k s and that should help them bond a bit.

( Work stuff. )
lederhosen: (Default)
Spent a large chunk of today on the phone to one of our SAS support guys discussing a problem I've been having with one of my programs*. Ended up sending him the program + logs so he could look through it for himself, and he actually called it a "work of art". Go me!

He might have been thinking Picasso, or possibly Escher's "Ascending And Descending" given how long the wretched thing takes to run, but I shall take it as a compliment.

I know I'm not a real programmer, but in the course of developing some SAS training courses I've learned a little bit along the way... also, developed a sense of "if I am telling newbies how to write programs I probably ought to take some of my own advice" type guilt, which is a powerful motivator to plan, document, and validate.

The stupid thing is that I'm pretty sure this program was working two weeks ago; the thing that's been giving me all this grief is the validation step that confirms that it works in practice as well as in theory. Apparently non-optimisable SQL joins are a bad thing when the tables involved each have about a million entries... but I have wrestled it down to the point where it only takes about 10 hours to run and about a gig of memory!**

*I am about 95% sure the cause of the problem is "some other process, probably their automated backups, decided to put a lock on one of my files at the same time my program needed to write to it", but they wanted to explore other options.
**Which would be why I end up running it at night, at the same time the automated backups happen...

Maths stuff

Aug. 4th, 2010 06:33 pm
lederhosen: (Default)
Why did nobody tell me Kate Bush had recorded a song about my father?

Have been doing a training course with James Brown for the last two days. He is rather less soulful and more mathematical than his namesake. I have to say, until yesterday I don't think I'd ever heard the f-word used repeatedly in a maths lecture, but it was in a cheerfully enthusiastic sort of way.
lederhosen: (Default)
Co-worker: "Hey, you know that tip you sent out a couple of months back about LAG and DIF? That just saved me a day's work."

(I really, really disliked SAS when I started using it two years ago. I've come to the realisation that it's actually not so bad a language for certain purposes, just that it does a pretty good job of obfuscating its good points. I have an ever-growing library of useful tricks I discovered while looking for something completely unrelated...)
lederhosen: (Default)
"This behaviour is intentional. Also, we fixed it in the next release." - SAS tech support guy in response to my bug report.

(Am coming to the conclusion that SAS EG works really well if you know exactly what your program needs to do and can debug each step perfectly before you create the next one, and... not so well otherwise.)
lederhosen: (Default)
Via silverblue, http://daaaamn.com/ makes me happy (even if its very existence invalidates its own data).

Meanwhile, I have figured out how to visualise interviewer workloads, and it looks rather like a game of Tetris. This wasn't my original intention, but when I realised it was turning out that way, I didn't exactly fight it...


Apr. 4th, 2009 09:28 am
lederhosen: (Default)
Last week: co-worker needed to join two datasets (staff experience to calls made by staff). Unfortunately the variable that SHOULD be the match variable (employee number) isn't 100% reliable - some people have numbers beginning '7595...' on one file and '7959...' on the other, that sort of thing. The other possible matching variable is name, but that's not 100% reliable either - there are people who are 'Anne' on one file and 'Patricia' on the other (legal name vs preferred name), variant spellings, name changes due to marriage and so on. We also have two employees with the exact same name.

So I set up a two-stage join: match by ID number, find unmatched records, match those by name, recombine the two files, etc etc. Messy and complicated, took quite some time to write and debug.

Then the other day, while poking around looking for something else, I discovered that I could have done it much more elegantly in a couple of lines. I'm using SAS EG, which has a point-and-click interface that allows SQL joins on equality etc; while I know how to write those in SQL for myself, I hadn't realised that I could also write a few things that weren't in the point-and-click options*. The trick is to write a fuzzy-logic join:

where (
(a.emp_id=b.emp_id)*3 >= &match_tolerance)

In practice, probably slightly fiddlier than that, but that's the basic idea. Wish I'd known earlier that I could do that.
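
The scoring idea, sketched in Python rather than SAS SQL so it can be run anywhere. Everything beyond "an ID match is worth 3 points" is my assumption for illustration: the name fields, their weights of 1 each, the default tolerance and the sample records are all hypothetical.

```python
def match_score(a, b):
    """Score a candidate pair: ID match counts 3, each matching name part counts 1.

    The name weights are illustrative assumptions, not taken from the SQL above.
    """
    return (
        3 * (a["emp_id"] == b["emp_id"])
        + (a["first_name"].lower() == b["first_name"].lower())
        + (a["surname"].lower() == b["surname"].lower())
    )


def fuzzy_join(left, right, match_tolerance=2):
    """Keep every pair of records whose score reaches the tolerance."""
    return [(a, b) for a in left for b in right
            if match_score(a, b) >= match_tolerance]


# Hypothetical records: one transposed ID, one legal-vs-preferred first name.
experience = [
    {"emp_id": "7595001", "first_name": "Anne", "surname": "Lee"},
    {"emp_id": "7001234", "first_name": "Sam", "surname": "Ng"},
]
calls = [
    {"emp_id": "7959001", "first_name": "Anne", "surname": "Lee"},
    {"emp_id": "7001234", "first_name": "Samuel", "surname": "Ng"},
]

pairs = fuzzy_join(experience, calls)
```

With a tolerance of 2, the first pair matches on both names despite the scrambled ID, and the second matches on ID despite the differing first name. Two different employees with exactly the same name would also match each other at that tolerance, which is one of the ways it gets fiddlier in practice.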

Oh well, probably not the last time I'll need to do that, and I'm sure some of my co-workers can use this trick too. I'm trying to encourage them to shift to SQL joins instead of the SAS merge operation, because the merge seems to cause problems for anything other than one-to-one joins.

*My knowledge of SQL and SAS code is 'what I've picked up along the job'; while I know quite a few tricks, without formal training it's easy to miss important basics, as here.
lederhosen: (Default)
Back when I was a postgrad student, my reaction to a high R² value* was "Yay, it works!" Now it's more like "It's a trick. Get an axe."

(Seriously, I am pleasantly surprised when admin data shows any sort of correlation to theory at all. Predicting admin data with R²=0.85 from a simple rule of thumb? That's just unsettling.)

*Translation: the data follows a nice straight line.
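
For the curious, a minimal sketch of what that number measures for a straight-line fit: the share of the variation in y that the fitted line accounts for. This is plain Python with ordinary least squares; the function name is mine, and it assumes the x values aren't all identical and the y values aren't all identical.

```python
def r_squared(xs, ys):
    """R² of the least-squares straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variation left over after fitting the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot
```

Points lying exactly on a line give a value of 1; the further they scatter around the line, the closer it drops towards 0.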
lederhosen: (Default)
10pm here, and it's still 32C. I'm not finding a lot of consolation in the knowledge that Melbourne and Adelaide are much hotter... hopefully they'll cool down a bit by the time I join Rey in Melbourne, in March. It's quieter here without her, and Dog-Or is looking hopefully out the window as I type (although mostly he just looks hot and nudges me for cuddles/food/ball).

Gradually settling into the new workgroup; the old group finally let go of me at the end of the year. I've been fielding a few questions, but my replacement seems to have got a good handle on things. I still check the weekly status reports to see how things are going; I feel oddly protective of that project. (The good news is that after months of panic and alarm, the status reports are miraculously catching up to where they're supposed to be; the problem seems to be that progress data wasn't being input quickly enough.)

New project, somewhat related, is operations research - trying to figure out how much time/money/manpower we actually need to run a survey. I grumbled about some of this recently so I won't repeat myself on the subject of the code; at the moment I'm trying to decide whether my time would be better spent in finessing the modelling, or fixing dodgy input data.

Thanks to quatranoctal, I have now seen 'The Trial of the Incredible Hulk'. Which is rather misleadingly labelled, because neither the Hulk nor Dr. Banner actually, y'know, go on trial. (There is a brief trial scene, where the Hulk gets all angry and smashes up the courtroom, but then he wakes up and it was All A Dream and instead he escapes from prison pre-trial. And his lawyer, who is secretly Daredevil, covers for him, which I'm sure violates some code of ethics somewhere. And somewhere in there there's a touching scene between the two of them that I'm trying very hard not to see as homoerotic. And the "I'm a person, a person with rights!" speech comes pretty close to "This man is dead... murdered... and someone's responsible!" for quality movie speechmaking. But overall, it actually wasn't that bad.)

...it's a hive of activity here :-)
lederhosen: (Default)
"He uses statistics as a drunken man uses lampposts - for support rather than illumination." - attributed to Andrew Lang, among others.

After a year in this job, and two out of the old one, it *still* feels odd going to meetings and discovering that people actually want illumination. But nice-odd :-)

