100 Prisoners

100 prisoners are condemned to life in prison, or so they think. One day the warden assembles all of the prisoners together and offers them a deal: “Starting tomorrow, I will select a prisoner at random every day and send him to a room with a lightbulb and switch. The prisoner may choose to turn the light on or off and must then leave. Now, here’s the deal: if, after visiting the room, a prisoner is convinced that all other prisoners have visited the room at least once, he may say so. If he is right, you will all be freed. If he is wrong, you will all be executed. After tonight, you will not be able to see or contact each other ever again, so devise a strategy now.” Morbidness aside, what strategy can the prisoners devise that guarantees their freedom?

After a friend first posed it to me, I’ve reasked this to several people over the years. Most who come up with a solution leave it that, but my uncle was not one of them.

“It’s going to take them too long to get out,” he noted. I sympathized, but somehow I wasn’t convinced there was a better solution, either.

Before getting to that solution and the time to release, how long would it actually take for the all the prisoners to visit the room at least once? The answer lies in the coupon collector’s problem. Given N prisoners, the time T for all of them to visit the room at least once can be expressed as the sum of geometric random variables T = T_1 + T_2 + \cdots T_N, where T_i is the time for ith new person to enter the room after the i-1th new person has been there. By linearity of expectation, the expected time for all prisoners to visit the cell at least once can then be expressed as follows:

\mathbb{E}[T] = \sum_{i=1}^N \mathbb{E}[T_i] = \sum_{i=1}^N \frac{N}{N+1-i} = N \cdot \sum_{i=1}^N \frac{1}{i} ~.

For N = 100, \mathbb{E}[T] \approx 519 days, so the expected time for everyone to visit the room at least once would be less than two years.

Now what about for our solution? First, what was our solution? It works as follows. The person who enters on the first day becomes the monitor, who is the only one allowed to turn the light switch off. Everyone else is allowed either to turn the switch on or leave it as is. The goal is for the monitor to count the number of people who have visited the room at least once by the number of times he enters the room and the switch is on. To do this, the remaining prisoners follow the following protocol: if the prisoner has never turned the switch on and sees it off, he turns it on; otherwise, he leaves it as is. This protocol prevents the monitor from overcounting the number of prisoners who have. Thus, once the monitor has entered the room with the switch on N-1 times (the monitor can ignore his first visit and automatically count himself), he can claim with certainty that each prisoner has visited the room at least once. However, once a switch is turned on, prisoners who have yet to visit the room must wait until the monitor visits the room again before they are allowed to indicate their entry, thereby delaying the monitors final announcement.

How long exactly does it take? Again, we can proceed by a sum of geometric random variables. Let U be the day the monitor announces every prisoner has been to the room at least once, which we express as the sum

U = T_1 + V_1 + M_1 + V_2 + M_2 + \cdots + V_{N-1} + M_{N-1}.

Here, M_i represents number of days between the ith time light is turned on, and the monitor’s next visit to the room. This is a geometric random variable with expectation \mathbb{E}[M_i] = N. Similarly, V_i represents the number of days between the monitor’s last visit and the ith prisoner that can turn on the switch for the first time. This is a geometric random variable with expectation \mathbb{E}[M_i] = \frac{N}{N - i}. Finally, T_1 is defined as before, and it is clear that T_1 = 1 day since whoever enters on the first day is automatically a first-time visitor to the room. By linearity of expectation, the expected time for the prisoners’ release is given as follows:

\mathbb{E}[U] = N \cdot \sum_{i=1}^N \frac{1}{i} + N \cdot (N-1)~.

For N = 100, \mathbb{E}[U] \approx 10,419 days, which is over 28 years! By that time, the warden will likely have retired and been replaced by a new warden who doesn’t respect the deal.

I have a bit more to say about the problem, but for now, I’ll leave you with a couple problems to consider.

  1. Show that for N > 3, there exists no strategy for which the expected time to release is equal to the expected time to collect N coupons.
  2. Find a nontrivial lower bound in terms of N for the minimum expected time to release.
This entry was posted in Probability, Puzzle. Bookmark the permalink.

3 Responses to 100 Prisoners

  1. Pingback: 4 Prisoners | Puzzled Over

  2. David says:

    There are also probabilistic solutions in which eventhough the monitor is not 100% confident of being right he can claim he is, greatly reducing the time spent in prison. A couple of numbers to look at might be:

    – The probability that all prisoners have gone to the room conditioned to the certanty that X have gone.

    – The unconditioned proba that all have gone to the room after X days. For instance it might be the case that without any control after 10,419 days the monitor certainty is also close to 100%.

  3. K says:

    Good point, David! An interesting question in this direction is how much the signaling could actually buy you. For instance, suppose you want to be 99% certain that all prisoners have visited the room. How much earlier can signaling get you to 99%?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s