One might think that I would try to get more sleep during
Spring Break. I tried. It just didn’t work. (I have occasional insomnia.)
It was my own fault. I had been thinking about one of the
rule modifications for the game I’m playtesting. It is called Bios Genesis and
was designed by Phil Eklund of Sierra Madre Games. (Planned release is Fall 2016.) For a very brief overview of the
game, you can read this earlier post that describes how my research, a
boardgame, watching TV, and thinking about time travel merged into a wild idea.
But now to the matter at hand: I was thinking about the probability of rolling
triples given N dice.
But first, a little context is needed. The object of the
game is to create life, sustain it, and hopefully evolve into a thriving
organism. The currency (how you pay for upgrades) in the game comes through
catalysts. In the early stages of the game, catalysts are obtained via dice
rolls that allow you to cycle nutrient molecules. Basically you are trying to
set up a robust autocatalytic cycle, which fits well with some origin-of-life
scenarios. At some point, the autocatalytic cycle can be evolved into simple
life – microorganisms. Once they arrive on the scene, each microorganism has to
perform a Darwin Roll. The number of dice rolled depends on how many “chromosomes”
the organism contains. The aptly named chromosomes are colored cubes in this
game. The four colors represent abilities in four areas: metabolism (red),
specificity (yellow), entropy (green) and heredity (blue).
If your organism has red cubes signifying its metabolism,
you gain catalysts when you roll ones. But you might not have red cubes, or you
might be unlucky and not roll ones. (Such is life.) In some playtests, there
was a significant shortage of catalysts in the early stages. Thus a rule was
suggested that for every triple rolled, your organism also receives a catalyst.
This led to a few games that were flush with catalysts, while others might
still be relatively lean. It also depended on whether players were more
cooperative or conversely more competitive. Clearly the more dice rolled, the
higher your chance of getting triples. With this new rule, an organism has
better metabolism if it has more chromosomes, not just the red ones. But how
does it scale?
[Warning: Some Math Ahead]
This is an easy problem, I thought to myself. Note to self:
Don’t work on such exciting things shortly before going to bed. (Hence, the
title of my blog post.) Here’s my reasoning. If three dice are rolled, the
number of possibilities is 6^3 or 216. There are just six ways of
getting a triple (all ones, all twos, all threes, all fours, all fives, all
sixes). So the probability is 6/216 or 2.8%. Another way of calculating this is
6 x (1/6)^3 = 6/216. The probability of rolling one particular triple is
(1/6) x (1/6) x (1/6) but there are 6 ways you could do this. Or you
could say that it doesn’t matter what you roll for the first die so this is
6/6, but after it is rolled the other two dice must match it if this is to be a
triple, i.e., (1/6)^2 = 1/36 or 2.8%.
What if you roll four dice? Now the total is 6^4
or 1296. As long as you have three dice equal, the fourth one shouldn’t matter.
So now you have 4 x 6 x 6 x (1/6)^4 where the factor of four is
because any one of the four dice could be the one that doesn’t matter, and one
of the factors of 6 is because this die could be any number from one to six. This yields 144/1296 or 11.1%.
(This turns out to be wrong, but I hadn’t realized it yet.) Note that you could
also have written this as 4 x 6^2/6^4 or 4 x (1/6)^2.
How about five dice? Well, now you have two dice that don’t
matter so that gives you 6^3/6^5 but there should be 10
ways that you could do this, i.e., the dice that don’t matter are (1,2), (1,3),
(1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5) where the numbers refer
to the 1st, 2nd, 3rd, 4th or 5th
dice as the ones that don’t matter. So the probability would be 10 x 6^3/6^5
= 2160/7776, i.e., 27.8%. (This will also turn out to be wrong.) This could
also have been written as 10 x (1/6)^2 = 10/36. At this point, I
think I see a pattern. For N dice, you can calculate the probability by
multiplying two factors. The first is N!/(3!(N-3)!) and the second is 6^(N-2)/6^N.
I am led astray because back in week 3 of the semester, I
was teaching students the Boltzmann distribution. A good way to count the
number of ways a particular arrangement of particles can take, as they spread
themselves in different quantum states, is to count marbles in boxes. If you
are tossing N marbles into boxes and the relative box sizes are given by g_1, g_2, g_3,
… and the number of marbles in the respective boxes are N_1, N_2, N_3,
… then you can calculate the number of arrangements W using the formula
below. (In a quantum model, g is the
degeneracy of the particular state.)
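For reference, here is a sketch of that counting formula in its usual textbook form for distinguishable particles. This form is an assumption on the reader's behalf, written with the same symbols W, N, N_i and g_i as the text:

```latex
W = N! \prod_i \frac{g_i^{N_i}}{N_i!}
```

With every g_i set to 1 and only two boxes holding N_1 = 3 and N_2 = N - 3 marbles, this collapses to N!/(3!(N-3)!), the binomial coefficient that shows up in the dice counting.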
Assuming fair dice, this sort of resembles a six-box problem
with boxes of equal sizes. Hence, all the values of g are 1. The number of marbles is the number of dice rolled. That
factor of 4 that I used for four dice and the factor of 10 that I used for 5
dice, well those are just 4!/(3!1!) and 5!/(3!2!), which fit with N!/(3!(N-3)!) and
sort of resembles the ratio shown in the formula above. This ratio is part of
Pascal’s Triangle. I have highlighted the relevant section in the figure below.
It hasn’t taken me much time to get to this point and I am
feeling very pleased with myself. (This is what happens when you get sloppy.) I
now merrily plug numbers with increasing N. When I get to N = 7, the
probability is 0.972. This can’t be right. I go ahead and plug in N = 8 and
sure enough, I get a number larger than 1 and I know my simple formula is dead
wrong. Here’s the table below.
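The numbers produced by this flawed formula are easy to regenerate. Below is a minimal Python sketch; the function name `naive_triple_probability` is a hypothetical label chosen for illustration, and the code only evaluates Factor 1 times Factor 2 exactly as described in the text.

```python
from math import comb


def naive_triple_probability(n_dice: int) -> float:
    """Flawed estimate from the text: C(N, 3) * 6^(N-2) / 6^N, i.e. C(N, 3) / 36."""
    factor_1 = comb(n_dice, 3)   # ways to pick which three dice form the triple
    factor_2 = 1 / 36            # 6^(N-2) / 6^N, the same for every N
    return factor_1 * factor_2


if __name__ == "__main__":
    for n in range(3, 9):
        print(f"N={n}  Factor 1={comb(n, 3):2d}  Factor 2=1/36  "
              f"product={naive_triple_probability(n):.3f}")
```

The product climbs past 1 at N = 8, which is the sign that rolls are being counted more than once.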
You would think I’d have figured this out
earlier simply by seeing that Factor 2 is unchanged at (1/36) and since Factor
1 is simply an increasing multiplicative integer, once it passes 36 (and it
does so very quickly), this becomes nonsense. It’s time for bed. I’m tired. And I’m somewhat dejected by
my own idiocy. Why did I think it was going to be that simple?
I go to bed. Maybe 3 hours later I’m awake. But now my mind
can’t stop working on the problem. I’m trying to get back to sleep but my mind
has shifted gears to try a simpler problem – rolling doubles rather than
triples. If you roll just two dice, then the probability is 6^1/6^2
= 1/6. That’s easy. What if you roll three dice? Is it 3 x 6^2/6^3
or 3 x 1/6 using similar reasoning as I did earlier? That would yield 108/216
or 50%. What if you roll four dice? That would be (4!/(2!2!)) x (1/6) = 6 x (1/6), which is
exactly 1. That’s clearly wrong. You should only hit a probability of one when
rolling seven dice because you could roll (1,2,3,4,5,6) with six dice. I’m
still doing this in my head lying in the dark. Okay, I go back to the three
dice problem. I imagine all the possibilities in a matrix. The 216
possibilities can be written as a 216 x 3 matrix that I can systematically
populate. And if I think of the 3 as xyz coordinates in Cartesian space, I can
imagine a cube of length six. When x and y are equal (a double is rolled), then
z can take any value. This leads to a plane parallel to z and along the
diagonal y=x. There should be two similar planes for x=z and y=z. These planes
intersect each other and that’s why my earlier formula didn’t work. I must have
double-counted, triple-counted, etc., the intersections! Unfortunately I’m
stuck at this point in the dark since I’m not very good at trying to visualize
these three intersecting planes. Worse, even if I do figure it out, I probably
can’t do the four-dimensional problem representing rolling four dice. I try to
think about something else and eventually get back to sleep.
In the morning before going into work I sketch out the three
planes and I can guess the intersection. But this is going to be more
problematic for higher dimensions. I start writing out sequences (over
breakfast) to get a sense of where I might be double-counting and I’m able to
quickly figure out that each intersecting plane has double-counted 6
possibilities so instead of 3 x 36 = 108 in the numerator, it should be 36 + 30
+ 30 = 96. The probability is 96/216 or 44%. I make a quick stab at the
four-dice problem and it’s clear that much larger chunks of the matrix are
being double-counted, but I don’t have the time or patience to figure it out.
I’ve clearly learned that it’s not going to be so easy.
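The three-dice doubles count can be checked by brute force. This is a minimal Python sketch, not the hand count described above; `count_rolls_with_double` is a hypothetical helper name, and the inclusion-exclusion line restates the three-planes picture in arithmetic.

```python
from itertools import product


def count_rolls_with_double(n_dice: int) -> int:
    """Count ordered rolls of n_dice six-sided dice where some face appears at least twice."""
    return sum(
        1
        for roll in product(range(1, 7), repeat=n_dice)
        if len(set(roll)) < n_dice   # fewer distinct faces than dice means a repeat
    )


# Three planes (x=y, x=z, y=z) of 36 points each, minus the three pairwise
# overlaps (each overlap is the 6-point line x=y=z), plus the triple overlap
# (the same 6 points) added back once.
inclusion_exclusion = 3 * 36 - 3 * 6 + 6

if __name__ == "__main__":
    print(count_rolls_with_double(3), inclusion_exclusion)
```

Both routes give 96 out of 216. The same helper confirms that a double only becomes certain at seven dice.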
I get busy and leave the problem for several days. I
contemplate writing a simple script that generates N random integers from 1 to
6 (for N dice rolls) and then checks to see if a triple is rolled. I could then
run maybe a million trials and get some statistics. The problem is that 6^N
grows very quickly. When N=10, 6^N is about 60 million, so I’d have
to sample much more. Not only that, I’d need to go look for a better random
number generator than the simple rand( ) function or its equivalent.
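A random-sampling script of the kind contemplated here might look like the following. It is a minimal sketch under stated assumptions: the function name, the default trial count and the fixed seed are all hypothetical choices, and it relies on Python's standard `random` module (a Mersenne Twister generator) rather than any specially chosen generator.

```python
import random
from collections import Counter


def estimate_triple_probability(n_dice: int, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the chance that some face shows at least three times among n_dice dice."""
    rng = random.Random(seed)            # seeded so that runs are repeatable
    hits = 0
    for _ in range(trials):
        counts = Counter(rng.randint(1, 6) for _ in range(n_dice))
        if max(counts.values()) >= 3:    # at least one triple (or better)
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in range(3, 12):
        print(f"N={n:2d}  estimate={estimate_triple_probability(n):.3f}")
```

One thing worth keeping in mind: the standard error of such an estimate is roughly sqrt(p(1-p)/trials), so its precision is set by the number of trials rather than by the size of 6^N.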
Today I decide that I’m going to do this systematically. I’m
a lazy and lousy coder. Hence I write a short script that generates the N x 6^N
matrix. I then write a second script that goes in and checks in each case how
many triples are in each combination. This might be useful later because in the
actual game, rolls of fives and sixes could cause an error catastrophe. If
errors exceed the number of blue (heredity) chromosomes, your organism suffers
atrophies. Yellow (specificity) chromosomes allow you to reroll some of the
dice. And some mutations (when DNA has evolved) confer additional stability
whereby only sixes cause errors. Given that 6^N starts to blow up
exponentially and my script is inefficient with lots of I/O writing out files
with 6^N lines, you can imagine that this bogs down after a bit on my
laptop. I could submit the job to my computational cluster at work but this
doesn’t seem right. Anyway in my playtests so far, it’s not often that you have
to roll more than ten dice, so I let my laptop work while I do something else.
Here are the results for N=3 to 11 for triples.
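For anyone who wants to reproduce numbers of this kind without writing out all 6^N rolls, here is a minimal Python sketch. It is not the pair of scripts described above. It assumes "getting a triple" means at least one face showing three or more times, and it counts the complementary rolls (no face more than twice) face by face. The names `count_rolls_without_triple` and `triple_probability` are hypothetical.

```python
from functools import lru_cache
from math import comb


def count_rolls_without_triple(n_dice: int, faces: int = 6) -> int:
    """Count ordered rolls in which no face appears three or more times."""

    @lru_cache(maxsize=None)
    def ways(faces_left: int, dice_left: int) -> int:
        # All faces handled: valid only if every die has been assigned a face.
        if faces_left == 0:
            return 1 if dice_left == 0 else 0
        total = 0
        # The current face may appear on 0, 1 or 2 of the unassigned dice.
        for copies in range(min(2, dice_left) + 1):
            total += comb(dice_left, copies) * ways(faces_left - 1, dice_left - copies)
        return total

    return ways(faces, n_dice)


def triple_probability(n_dice: int) -> float:
    """Exact probability of at least one triple among n_dice fair six-sided dice."""
    return 1 - count_rolls_without_triple(n_dice) / 6 ** n_dice


if __name__ == "__main__":
    for n in range(3, 12):
        print(f"N={n:2d}  P(at least one triple) = {triple_probability(n):.3f}")
```

Counting the complement avoids the double-counting trap entirely, and the recursion touches only a handful of states, so it stays instant for any number of dice one would roll in a game.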
Looks like 7 dice get you to at least half a chance of
getting a triple. I haven’t done a further analysis taking into account
rerolls. A fair assumption is that if able, the player will try to reroll fives
and sixes to avoid errors. Assuming success this reduces the player’s final
number of triples by a third. I’ll give feedback to the designer who can decide
if the rule stays or goes.
And that’s how my episode of Metabolism Probability Insomnia
transpired.