## 4 Prisoners

The end of the 100 Prisoners post asked whether one can show that no strategy meets the coupon collector lower bound on the time to release when there are $N > 3$ prisoners.

Let’s first establish strategies for $N \leq 3$. Note that for $N = 1$, the prisoner can declare victory on the first day, which trivially meets the coupon collector lower bound. Similarly, for $N = 2$, the first time a new prisoner enters after the first day, the prisoner can declare victory since there is only one other prisoner, who entered on the first day. Again, this trivially meets the coupon collector bound.

For $N = 3$, we actually need to use the light switch. On day 1, the first prisoner turns the light switch off. The next new prisoner (second prisoner) to enter turns the light switch on. The next new prisoner (third prisoner) to see the light switch on declares victory. Once again, this meets the coupon collector lower bound.
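The $N = 3$ strategy can be checked with a short simulation. This is a minimal sketch, assuming a uniformly random prisoner is chosen each day; the function name `simulate_three` is made up for illustration. Each decision uses only what the prisoner in the room would know: the day, whether it is his own first visit, and the switch position.

```python
import random

def simulate_three(rng):
    """Run the N = 3 strategy once.

    Returns (day of declaration, day the third distinct prisoner first entered).
    """
    switch_on = False
    visited = set()
    all_seen_day = None
    day = 0
    while True:
        day += 1
        prisoner = rng.randrange(3)           # assumption: uniform random choice
        first_visit = prisoner not in visited
        visited.add(prisoner)
        if len(visited) == 3 and all_seen_day is None:
            all_seen_day = day                # coupon collector finishes here
        if day == 1:
            switch_on = False                 # first prisoner turns the switch off
        elif first_visit:
            if switch_on:
                return day, all_seen_day      # third prisoner declares victory
            switch_on = True                  # second prisoner turns the switch on

rng = random.Random(0)
runs = [simulate_three(rng) for _ in range(10_000)]
print(all(declared == all_seen for declared, all_seen in runs))
```

In every run the declaration day coincides with the day the last new prisoner enters, which is what meeting the coupon collector lower bound means here.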

Why does this fail for $N = 4$? We argue by contradiction: suppose there is a strategy that meets the coupon collector lower bound. This requires the fourth prisoner to be able to determine, from the time of his first entry and the position of the light switch, whether or not he is the fourth prisoner. Without the light switch, all a new prisoner knows at any time $t \geq 4$ days is that he is not the first prisoner. Thus, if such a strategy is to work, then for $t \geq 4$ the light switch must uniquely identify whether or not three other prisoners have already visited the room.

Without loss of generality, for $t \geq 4$, the switch will be on if and only if three of the prisoners have visited the room already. Suppose a prisoner enters the room for the first time on some day $t \geq 4$. If the light switch is off, then the prisoner must set the switch for the next day. However, the only information available to the prisoner is $t$ and the position of the switch, which indicates that either one or two prisoners have visited the room previously. If the prisoner sets the switch to on, and only one prisoner had visited the room before, the switch was set incorrectly. On the other hand, if the prisoner sets the switch to off, and two prisoners had visited the room before, the switch was also set incorrectly. Thus, we have arrived at a contradiction, and no strategy can achieve the coupon collector lower bound. Calculating an expected lower bound based on this argument and extending the argument to all $N > 3$ are left as exercises.

While the above argument indicates that the coupon collector lower bound is not tight for $N = 4$, it does not say whether the strategy given in the previous post is the best one can do. In fact, it is not, and what follows is a better strategy. Now, the light switch is used to indicate whether there have been an even or odd number of visitors to the room. The first visitor to the room switches it on, and every prisoner thereafter toggles the switch on his first visit to the room:

- 1st prisoner changes the switch to ‘on’
- 2nd prisoner enters for the first time with the switch ‘on’ and changes it to ‘off’
- 3rd prisoner enters for the first time with the switch ‘off’ and changes it to ‘on’
- 4th prisoner enters for the first time with the switch ‘on’ and changes it to ‘off’
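The toggling rule above can be sketched in a few lines. This is an illustrative sketch only, assuming a uniformly random prisoner each day; the function name `first_entry_observations` is made up for this example. It records what each newcomer after day 1 sees on his first entry.

```python
import random

def first_entry_observations(rng, n=4):
    """Return the switch state (True = on) seen by each newcomer after day 1,
    in order of first entry."""
    switch_on = False                  # initial state is irrelevant: day 1 forces it on
    visited = set()
    observations = []
    while len(visited) < n:
        prisoner = rng.randrange(n)    # assumption: uniform random choice
        if prisoner in visited:
            continue                   # returning prisoners leave the switch alone
        if not visited:
            switch_on = True           # first visitor switches it on
        else:
            observations.append(switch_on)
            switch_on = not switch_on  # every later newcomer toggles
        visited.add(prisoner)
    return observations

print(first_entry_observations(random.Random(0)))  # prints [True, False, True]
```

Whatever the order of visits, the second and fourth newcomers see the switch on and only the third sees it off, so the switch tracks the parity of the number of distinct visitors.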

Note that since the third prisoner is the only person to see the light switch off on his first time in the room, he can uniquely identify that there is only one prisoner left, and the next time he enters the room and sees that the switch is off, he can declare victory. Likewise, if the first or second prisoner reenters the room after the third prisoner and before the fourth prisoner, then he can also figure out that there is only one prisoner left and can declare victory the next time he sees the switch off. It turns out that the probability that both of them visit, which results in an expected time to release of

$\frac{4}{3} + \mathbb{E}[\text{coupon collector}]$,

is $\frac{1}{3}$; the probability that only one of them visits, which results in an expected time to release of

$2 + \mathbb{E}[\text{coupon collector}]$,

is $\frac{1}{3}$; and the probability that neither visits, which results in an expected time to release of

$4 + \mathbb{E}[\text{coupon collector}]$,

is $\frac{1}{3}$. Averaging the three cases gives $\frac{1}{3}\left(\frac{4}{3} + 2 + 4\right) = \frac{22}{9}$, so the expected time to release is $\frac{22}{9} + \mathbb{E}[\text{coupon collector}]$, where $\mathbb{E}[\text{coupon collector}] = \frac{25}{3}$. This drops the expected time after coupon collector from $12$ to about $2.4$. Of course, we are taking advantage of the fact that the number of prisoners is so small. I suspect it will be more difficult to make such pronounced improvements over the earlier strategy for larger $N$.
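The arithmetic can be reproduced with exact fractions. The sketch below takes the text's case analysis as given: a prisoner who revisits between the third and fourth first entries counts as informed, and with $k$ informed prisoners the expected wait for one of them to return is $4/k$ days. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

N = 4

# Expected coupon collector time: N * (1 + 1/2 + ... + 1/N).
coupon = N * sum(Fraction(1, k) for k in range(1, N + 1))

# After the third prisoner's first visit, look at the order in which the
# first, second and fourth prisoners next enter. By symmetry all 6 orders
# are equally likely; the position of "fourth" is the number of earlier
# prisoners who revisit before the fourth prisoner arrives.
orders = list(permutations(["first", "second", "fourth"]))
counts = {0: 0, 1: 0, 2: 0}
for order in orders:
    counts[order.index("fourth")] += 1
probs = {k: Fraction(c, len(orders)) for k, c in counts.items()}

# k revisitors plus the third prisoner gives k + 1 informed prisoners,
# so the expected wait after the fourth prisoner's entry is N / (k + 1).
overhead = sum(p * Fraction(N, k + 1) for k, p in probs.items())
total = coupon + overhead

print(coupon, overhead, total)  # prints 25/3 22/9 97/9
```

Under these assumptions the total expected time to release comes to $\frac{97}{9} \approx 10.8$ days.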