### Answers to yesterday's puzzle

Mar. 15th, 2012 12:06 pm **lederhosen**

Answers to yesterday's pi riddle behind the cut.

*(1) What is the average length of one of these intervals?*

This is an easy and straightforward question: there are 100 such intervals, their total length is 1, therefore the average length is always going to be exactly 1/100.* I only asked this question in order to make the next one harder.

*(2) What is the average length of the interval that contains 1/pi?*

This is the tricky one.

First off, you may have figured out that there's nothing special about 1/pi in this problem. (There is a different problem, also relating to intervals and probability, where 1/pi plays an important role, but it ain't this one.) I could just as easily have specified 0.1, 0.333, or any other number between 0 and 1, and the answer would have been exactly the same.

So, given that there's nothing special about 1/pi... doesn't that mean the answer should be the same as for #1 above?

Well, let's come at this problem from a different angle:

Because we select our points randomly, some of the resulting intervals will be shorter than 1/100 and some will be longer. They'll average out to 1/100 but there'll be a lot of variation around that.

(If you want an exact formula for the variation, look at the beta distribution: with 100 points, each interval's length follows a Beta(1, 99) distribution.)
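To get a feel for how much the lengths vary, here is a minimal sketch in Python. It assumes the setup is 100 points chosen uniformly on [0,1] with the ends wrapped together, so that 100 points give 100 intervals; the function name `gap_lengths` and the seed are just illustrative choices of mine.

```python
import random
import statistics

def gap_lengths(n=100, rng=random):
    """Drop n uniform points on a circle of circumference 1; return the n gap lengths."""
    points = sorted(rng.random() for _ in range(n))
    # Gaps between consecutive points...
    gaps = [b - a for a, b in zip(points, points[1:])]
    # ...plus the gap that wraps from the last point back round to the first.
    gaps.append(points[0] + 1.0 - points[-1])
    return gaps

rng = random.Random(2012)  # fixed seed so the run is repeatable
gaps = gap_lengths(100, rng)
print("mean gap:", statistics.mean(gaps))
print("shortest:", min(gaps))
print("longest: ", max(gaps))
```

The mean comes out at 1/100 (up to rounding) every time, because the gaps always add up to 1, but the shortest and longest gaps in a single run are typically a long way from it.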

So let's look at a couple of different intervals that might occur in this process. Suppose I tell you that Interval A has length 0.005, Interval B has length 0.015, but I don't tell you where they lie in [0,1] (with the option to wrap at the ends).

What is the probability that Interval A contains 1/pi?

What is the probability that Interval B contains 1/pi?

Well, Interval A occupies 0.5% of the total range, so for *any* specific number of our choice, there's a 0.5% probability that it will be covered by A. On the other hand, it has a 1.5% probability of being covered by B.

In other words, larger intervals are more likely than smaller ones to contain 1/pi (or any other number you might prefer between 0 and 1).
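If you'd rather check that numerically, the sketch below drops an interval of a given length at a uniformly random position (wrapping at the ends) and counts how often it covers 1/pi. The helper names `covers` and `coverage_probability` are mine, and the trial count is arbitrary.

```python
import math
import random

def covers(start, length, target):
    """True if the arc [start, start + length) on the unit circle contains target."""
    # Distance from start round to target, measured in the positive direction.
    return (target - start) % 1.0 < length

def coverage_probability(length, target=1 / math.pi, trials=100_000, seed=0):
    """Estimate the chance that a randomly placed arc of this length covers target."""
    rng = random.Random(seed)
    hits = sum(covers(rng.random(), length, target) for _ in range(trials))
    return hits / trials

print("Interval A (0.005):", coverage_probability(0.005))
print("Interval B (0.015):", coverage_probability(0.015))
```

The estimates should land close to 0.5% and 1.5% respectively, and swapping in any other `target` between 0 and 1 should make no real difference.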

Reversing that, via Bayes' Theorem: the intervals that contain 1/pi are on average larger than the intervals that don't.

To get the exact answer, you can either work through it with beta distributions and Bayes' Theorem, or you can take a simpler approach:

Consider your random variables ri, from i=1 to i=100.

Now create a new set of random variables:

s1=r1+t where t is another random variable (uniform distribution between 0 and 1, independent of all ri)

s2=r2+t (same t)

...

s100 = r100 + t

s101 = 1/pi + t

In all cases, if one of these numbers exceeds 1, wrap around again (which is to say: "we're working in modulo 1 throughout".)

For any given i, let Li denote "the distance between the random number si and the first sj that comes after it" (i.e. its next neighbour, moving in a positive direction).

Three observations about {s1-s101}:

- they are all uniformly distributed

- they are all independent of one another

- they break the line of length 1 into 101 intervals

From these, it follows that the 101 values Li all have the same distribution and add up to 1, so the average value of Li is 1/101 for all i - including i=101.

By subtracting off our variable t, we now see that (on average) the first ri above 1/pi is at distance 1/101 above it. By symmetry, the first ri *below* 1/pi is on average 1/101 below it - so on average, the interval containing 1/pi has length 2/101, while those intervals NOT containing 1/pi average length 1/101.

This is a nifty little result, and a reminder to watch out for conditional probability: when you have several random variables and you choose one of them in a way that is affected by their values, you change the distribution of the resulting variable.
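For the sceptical, here is a rough Monte Carlo check of the 2/101 figure. It is only a sketch under my reading of the puzzle (100 uniform points on [0,1], ends wrapped together); the function names, trial count and seed are my own choices, not part of the original problem.

```python
import bisect
import math
import random

def containing_interval_length(points, target):
    """Length of the circular gap between sorted points that contains target."""
    i = bisect.bisect_right(points, target)  # index of the first point above target
    upper = points[i % len(points)]          # wraps to the first point if none is above
    lower = points[i - 1]                    # i == 0 wraps to the last point
    return (upper - lower) % 1.0

def simulate(n=100, target=1 / math.pi, trials=20_000, seed=0):
    """Average length of the interval containing target, over many random layouts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        points = sorted(rng.random() for _ in range(n))
        total += containing_interval_length(points, target)
    return total / trials

estimate = simulate()
print("simulated:", estimate)
print("2/101:    ", 2 / 101)
print("1/100:    ", 1 / 100)
```

The simulated average should sit close to 2/101 and nowhere near 1/100, and changing `target` to 0.1 or 0.333 should give much the same number.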

*No Banach-Tarski tricks here, since each of the intervals is itself measurable.