Rollouts

Introduction to rollouts

The million-dollar question is simple enough: out of all the games that could result from playing this position, how many do we win (and how many of our wins and losses are gammons, and how many are backgammons)? The model is exactly the same as if we had an urn with a googol balls in it (it’s a big urn): many of the balls have “win” written on them, some say “gammon loss”, and if we look hard enough there are a few that read “backgammon win”, and so on. 1) Instead of having the patience to count the googol balls, we just give the urn a really good shake, pull 100 balls out without looking, and say, for instance, “Well, I got 53 wins, 31 losses, 9 gammon wins, 6 gammon losses, and a backgammon win – looks like my equity’s roughly +0.31,” and go home. If we were a bit more thorough, we could go further and figure out that by cheating and measuring the sample proportions instead of the population proportions, we introduced a standard error of about 0.12 into our result. (Of course, the trick is to select a sample size that’s big enough to reduce the standard error to a tolerable level, but small enough that the answer arrives before you get bored.)
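
The arithmetic behind such a tally can be sketched in a few lines of Python. This is only an illustration, under the assumptions that the five categories are mutually exclusive (a plain “win” is not also a gammon) and that games are scored cubeless at 1, 2 and 3 points; the names `POINTS` and `equity_and_standard_error` are made up for this example and are not part of GNU Backgammon.

```python
import math

# Points won (+) or lost (-) per game, assuming cubeless money play.
POINTS = {
    "win": 1, "gammon win": 2, "backgammon win": 3,
    "loss": -1, "gammon loss": -2, "backgammon loss": -3,
}

def equity_and_standard_error(counts):
    """Return (mean points per game, standard error of that mean)."""
    n = sum(counts.values())
    mean = sum(POINTS[k] * c for k, c in counts.items()) / n
    # Unbiased sample variance of the per-game result.
    var = sum(c * (POINTS[k] - mean) ** 2 for k, c in counts.items()) / (n - 1)
    return mean, math.sqrt(var / n)

# The 100 balls pulled from the urn in the example above.
sample = {"win": 53, "loss": 31, "gammon win": 9,
          "gammon loss": 6, "backgammon win": 1}
equity, se = equity_and_standard_error(sample)
print(f"equity {equity:+.2f}, standard error {se:.2f}")
```

Because the standard error shrinks with the square root of the sample size, halving it takes roughly four times as many balls.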

It will come as no surprise that a rollout with a limited number of trials follows exactly the same procedure. It’s sufficient to say that the proportions of wins, gammons, etc. that come up when GNU Backgammon plays against itself (say) 1296 times aren’t likely to differ all that much from the proportions we would get if we counted the results of every game GNU Backgammon could possibly play against itself. (Of course, there may still be some doubt whether the results of GNU Backgammon vs. GNU Backgammon are representative of the results of a perfect player vs. a perfect player, or of you vs. Joe Average, but that’s another story.)

Rollouts in GNU Backgammon

In GNU Backgammon the Rollout function implements the procedure described above, with the following improvements:

Truncation: instead of rolling out all the way to the end of the game, it can stop and pretend its evaluation after a few plies is perfect. This may obviously introduce some amount of systematic error, but in practice this may not matter because:
- it makes rollouts much faster, which means you can do more of them (and thus trade sampling error for systematic error);
- different positions will be reached in different trials, so the correlation between errors in each trial weakens and the errors cancel out to some extent;
- if you are rolling out the positions after making different plays, then any remaining systematic error between the two rollouts is likely to be somewhat correlated and so the error in the comparison between the plays is hopefully small. This implies that truncated rollouts are better for estimating relative equity (“which is the better move here, 13/10*/9 or 13/10* 6/5*?”) than absolute equity (“at this match score I need 29% wins to accept a dead cube; can I take in this position?”).
Race database truncation: when the game reaches a position covered by its 2-sided bearoff database, GNU Backgammon can estimate the probability of winning from that position with no error at all (it can play and evaluate endgame positions perfectly), which saves time and avoids introducing the errors that can result from large equity variances at the end of the game.
Variance reduction: when using lookahead evaluations, it can reduce errors by making use of the equity difference from one ply to the next. (This can be interpreted either as cancelling out the estimated “luck”, i.e. the difference in equity evaluations before and after rolling, or as using subsequent evaluations to estimate the error in prior ones; the two views are equivalent.) GNU Backgammon automatically performs variance reduction when looking ahead at least one ply.
Stratified sampling: uses quasi-random number generation instead of pseudo-random number generation (this is a standard technique in Monte Carlo simulations where having a near-perfect uniform distribution in your sample is more important than unpredictability). GNU Backgammon only stratifies the first 2 plies of a rollout, though it would be easy enough to extend it to the remainder.
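
The “cancelling out the luck” view of variance reduction can be illustrated on a toy game. The sketch below is not GNU Backgammon’s implementation: the game (roll one die four times and win if the total reaches 14), the deliberately imperfect `evaluate` function and all other names are assumptions invented for the example. The luck of each roll is measured as the evaluation after the roll minus the evaluation averaged over all six possible rolls, and the accumulated luck is subtracted from the final result.

```python
import random
import statistics

NUM_ROLLS = 4    # toy game: roll one die four times
THRESHOLD = 14   # the roller wins (+1) if the total reaches 14, else loses (-1)

def evaluate(total, rolls_left):
    """Deliberately imperfect static evaluator; exact only when the game is over."""
    if rolls_left == 0:
        return 1.0 if total >= THRESHOLD else -1.0
    margin = total + 3.5 * rolls_left - (THRESHOLD - 0.5)
    return max(-1.0, min(1.0, margin / (1.5 * rolls_left + 1.0)))

def one_trial(rng):
    """Play one game; return (raw result, luck-adjusted result)."""
    total, luck = 0, 0.0
    for rolls_left in range(NUM_ROLLS - 1, -1, -1):
        # One-ply lookahead: the evaluation averaged over all six rolls.
        expected = sum(evaluate(total + d, rolls_left) for d in range(1, 7)) / 6
        total += rng.randint(1, 6)
        # Luck of this roll: how much better it was than the average roll.
        luck += evaluate(total, rolls_left) - expected
    result = evaluate(total, 0)
    return result, result - luck

rng = random.Random(1)
raw, adjusted = zip(*(one_trial(rng) for _ in range(2000)))
print("raw      mean %+.3f  sd %.3f" % (statistics.mean(raw), statistics.stdev(raw)))
print("adjusted mean %+.3f  sd %.3f" % (statistics.mean(adjusted), statistics.stdev(adjusted)))
```

Each luck term averages to zero over the six rolls by construction, so subtracting it leaves the expected result unchanged even though the evaluator is imperfect; only the spread between trials shrinks.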

Quasi-Random Dice

Quasi-random dice are used to reduce the element of luck in rollouts. Instead of selecting purely random dice, GNU Backgammon will ensure a uniform distribution of the first roll of the rollout. If 36 trials are requested, one game will start with 11, two games with 21, two games with 31, etc. In general, if n*36 games are requested, n games will start with 11, 2*n games with 21, etc. This is called rotation of the first roll. Similarly, if n*1296 trials are requested, the second roll will be rotated as well, so that every combination of first and second roll occurs equally often: counting 21 and 12 as distinct rolls, n games will start with 11-11, n games with 11-21, n games with 21-21, etc. The third roll will also be rotated if the number of trials is a multiple of 46656.

Suppose a user stops a 1296-trial rollout after 36 games. With rotation alone, the 36 games would have had the following rolls for the first two rolls of each game: 11-11, 21-11, 12-11, 31-11, 13-11, …, 66-11. Obviously such a rollout will give skewed results, since the second roll was 11 in every game! To avoid this problem, GNU Backgammon will randomise the sequence of rolls such that it is guaranteed that for any sample of 36 games you have exactly one game with first roll 11, exactly one game with second roll 11, etc. This is called stratification.
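
One way to obtain both properties at once (every block of 36 games covers all first rolls and all second rolls, and every set of 1296 games covers all combinations) is a shuffled Latin-square arrangement. The sketch below only illustrates the idea and is not the scheme GNU Backgammon actually uses; the function name `stratified_first_two_rolls` and its parameters are hypothetical.

```python
import random

# The 36 ordered rolls of two dice: (1,1), (1,2), ..., (6,6).
ROLLS = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def stratified_first_two_rolls(trials, seed=0):
    """Return a list of (first roll, second roll) pairs, one per trial."""
    rng = random.Random(seed)
    pairs = []
    for i in range(trials):
        block, k = divmod(i % 1296, 36)   # block of 36 within a set of 1296
        if i % 1296 == 0:
            # New set of 1296 trials: draw fresh random orderings.
            first_order = rng.sample(range(36), 36)
            second_order = rng.sample(range(36), 36)
            block_shift = rng.sample(range(36), 36)
        first = ROLLS[first_order[k]]
        # Latin-square shift: within a block every second roll appears once,
        # and over 36 blocks each first roll meets every second roll once.
        second = ROLLS[second_order[(k + block_shift[block]) % 36]]
        pairs.append((first, second))
    return pairs

pairs = stratified_first_two_rolls(1296)
print(len(set(pairs)))                   # distinct combinations in a full set
print(len({p[1] for p in pairs[:36]}))   # distinct second rolls in the first 36 games
```

Stopping such a rollout after any whole number of 36-game blocks still leaves each first roll and each second roll equally represented.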

GNU Backgammon will also rotate and stratify rollouts where the number of trials is not a multiple of 36, 1296, etc. The distribution of rolls is obviously no longer uniform in this case, but it will still provide some reduction of the luck; for example, no 37-trial rollout will have 3 games with an initial 66.

Before the first game of a rollout, GNU Backgammon creates a pseudo-random array which it will use for all the games in the rollout. In effect it has already decided the roll sequence it will use for up to 128 rolls in every game of the rollout. In other words, for a normal rollout where games don’t go over 64 moves, every single game of every possible rollout length has already had its dice sequence determined. During the rollout of game n, sequence n will be used, for game n+1 sequence n+1, etc. If it’s a rollout as initial position, then whenever the current sequence starts with a double, the sequence is skipped and the dice routine moves on to the next sequence. Say a rollout as initial position is about to start using sequence 275, but that sequence begins with a double. The dice routine moves to sequence 276. On the following game, it will use sequence 277 (it remembers how many it has already skipped).

So, if you select rollout as initial position and 36 games, then you will get a perfect set of rolls for games 1..30 and the first 6 rolls of the next perfect set (the same rolls you would have gotten for games 31..36 if you’d asked for 1080 games or 10800 games or 92 games or whatever).

The dice sequence doesn’t know how many trials it will be asked for; it simply generates sequences such that for a normal rollout (rollout as initial position) every 36 (30) games you get all possible first rolls, every 1296 (1080) games you get every possible combination of the first 2 rolls, every 46656 (38880) games you get full sets of 3 rolls, etc.
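
The set sizes quoted above, and the skipping of sequences that open with a double, can be sketched as follows. The helper names `full_set_size` and `usable_sequences` and the toy `sequences` list are hypothetical, introduced only for this illustration; in particular, the real sequences are shuffled rather than listed in order.

```python
from itertools import islice

def full_set_size(rolls, initial_position=False):
    """Games needed to see every combination of the first `rolls` rolls."""
    first = 30 if initial_position else 36   # 6 of the 36 rolls are doubles
    return first * 36 ** (rolls - 1)

def usable_sequences(sequences, initial_position=False):
    """Yield dice sequences in order, skipping any that open with a double
    when rolling out the initial position."""
    for seq in sequences:
        d1, d2 = seq[0]
        if initial_position and d1 == d2:
            continue
        yield seq

# Toy stand-in for the pre-generated array: only the first roll of each
# sequence is stored, and each block of 36 lists the rolls in plain order.
block = [[(a, b)] for a in range(1, 7) for b in range(1, 7)]
sequences = block * 2

games = list(islice(usable_sequences(sequences, initial_position=True), 36))
print([full_set_size(r) for r in (1, 2, 3)])
print([full_set_size(r, initial_position=True) for r in (1, 2, 3)])
print(len(games), len({seq[0] for seq in games[:30]}))
```

In this toy setup a 36-game rollout as initial position uses the 30 non-double openings of the first block and then the first 6 non-double openings of the second block, mirroring the behaviour described above.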

How to set up a rollout

FIXME Explain all the options in the rollout setting dialog.

1) Balls and urns are to probability theorists what teapots and chequerboards are to computer graphics researchers, or “squeamish ossifrage” is to cryptographers – they seem to come with the territory.