Conditional Probability with Python

In this post we are going to explore conditional probability with Python. Here’s a fun and potentially tricksome question about probabilities:

In a family with two children, what is the probability that, if at least one of the children is a girl, both children are girls?

First of all let’s state a couple of assumptions which are not realistic in the “real world,” but which are fairly standard for theoretical probability questions.

1. There is an equal chance of a child being a boy or a girl.
2. Children are either girls or boys, exclusively.
3. The gender of the second child is independent of that of the first child.
4. By probability in this context, I’m assuming the following definition:

“The expected proportion of positive outcomes when repeating an observation over a large sample.”
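To make this definition concrete, here is a minimal sketch of how an observed proportion settles towards a fixed value as the sample grows. The helper name `proportion_of_girls`, the seed, and the sample sizes are illustrative assumptions, not part of the original problem.

```python
import random


def proportion_of_girls(sample_size, rng):
    """Return the fraction of simulated births that are girls."""
    girls = sum(rng.choice(["boy", "girl"]) == "girl" for _ in range(sample_size))
    return girls / sample_size


# A fixed seed (chosen arbitrarily) makes the run repeatable.
rng = random.Random(42)
for n in (10, 1_000, 100_000):
    print(n, proportion_of_girls(n, rng))
```

With small samples the proportion can stray a long way from 0.5; with large ones it should sit very close to it, which is the sense of "probability" used in the rest of this post.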

Python Program to Show Probability of Two Girls

We can explore this situation by simulation using Python’s `random` module. The code below calculates the (simulated) experimental probability of a family having two girls, given that at least one is a girl.

```python
import random

sample_size = 1000

num_families_at_least_one_girl = 0
num_families_two_girls = 0

for _ in range(sample_size):
    # Each child is independently a boy or a girl with equal chance.
    first_child = random.choice(["boy", "girl"])
    second_child = random.choice(["boy", "girl"])
    # Condition: at least one of the two children is a girl.
    if first_child == "girl" or second_child == "girl":
        num_families_at_least_one_girl += 1
        # Event of interest: both children are girls.
        if first_child == "girl" and second_child == "girl":
            num_families_two_girls += 1

result = round(num_families_two_girls / num_families_at_least_one_girl, 2)
print(f"Out of {sample_size} families sampled, {num_families_at_least_one_girl} have at least one girl.")
print(f"Of these {num_families_two_girls} have two girls.")
print(f"This gives an experimental probability of {result} to two decimal places that,")
print("given at least one child is a girl, both children are girls.")
```

Sample output:

```
Out of 1000 families sampled, 768 have at least one girl.
Of these 268 have two girls.
This gives an experimental probability of 0.35 to two decimal places that,
given at least one child is a girl, both children are girls.
```

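This approach to the problem corresponds to the formula for conditional probability:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

In our scenario, where $A$ is "both children are girls" and $B$ is "at least one child is a girl", this comes out as

$$P(\text{both girls} \mid \text{at least one girl}) = \frac{1/4}{3/4} = \frac{1}{3}$$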

The above solution corresponds to the following situation:

You visit a large number of families. In each family, you check if there are two children. If no, you ignore this family. If yes, you check if at least one of the children is a girl. If no, you ignore this family. If yes, you check if both children are girls.
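The simulated value can be checked against an exact count. The sketch below enumerates the four equally likely two-child families under the assumptions listed earlier and applies the same filtering as the visit described above. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely (first_child, second_child) pairs under the stated assumptions.
families = list(product(["boy", "girl"], repeat=2))

# Keep only the families that satisfy the condition, then count the event within them.
at_least_one_girl = [f for f in families if "girl" in f]
two_girls = [f for f in at_least_one_girl if f == ("girl", "girl")]

exact = Fraction(len(two_girls), len(at_least_one_girl))
print(exact)  # 1/3
```

Three of the four families have at least one girl, and only one of those three has two girls, so the exact answer is 1/3, in line with the simulated 0.35.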

However, different interpretations are possible.

Clearing Up Ambiguity

Different answers to the original question are possible, depending on how we interpret it and on the assumptions we make. The most common alternative interpretation concerns the phrase “at least one”. The ambiguity is whether we know the gender of a specific child, or only know that one of the children is a girl without knowing which one.

This version corresponds to the following situation:

You visit a large number of families. In each family, you check if there are two children. If no, you ignore this family. If yes, you wait until one of the children comes into the room. If it is a boy, you ignore this family. Otherwise you check if both children are girls.

This interpretation is equivalent to asking a different question:

In a family with two children, what is the probability that, if the younger child is a girl, both children are girls?

Here the conditional probability formula looks like this:

$$P(\text{both girls} \mid \text{younger child is a girl}) = \frac{1/4}{1/2} = \frac{1}{2}$$

Python Code for Alternative Interpretation of Problem

The above scenario can be simulated by using just a slightly modified version of the Python code for the first interpretation. Notice how the conditional statements are different now, and the variables representing the boy/girl sampling are now `one_child` and `other_child`.

```python
import random

sample_size = 1000

# Counter names are kept from the first program; here the first counter
# tracks families in which the one specific child we know about is a girl.
num_families_at_least_one_girl = 0
num_families_two_girls = 0

for _ in range(sample_size):
    one_child = random.choice(["boy", "girl"])
    other_child = random.choice(["boy", "girl"])
    # Condition: the specific child we know about is a girl.
    if one_child == "girl":
        num_families_at_least_one_girl += 1
        # Event of interest: the other child is a girl as well.
        if other_child == "girl":
            num_families_two_girls += 1

result = round(num_families_two_girls / num_families_at_least_one_girl, 2)
print(f"Out of {sample_size} families sampled, {num_families_at_least_one_girl} have a girl as the child you know about.")
print("You know which child this is.")
print(f"Of these families, {num_families_two_girls} have two girls.")
print(f"This gives an experimental probability of {result} to two decimal places that both children are girls.")
```
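To see the two interpretations side by side without sampling noise, here is a small sketch that computes both conditional probabilities exactly by counting outcomes. The helper `conditional_probability` and the predicate `both_girls` are hypothetical names introduced for illustration, and treating the first element of each pair as the known child is an assumption of the sketch.

```python
from fractions import Fraction
from itertools import product

# Sample space: (known_child, other_child) pairs, all equally likely.
FAMILIES = list(product(["boy", "girl"], repeat=2))


def conditional_probability(event, condition):
    """Return P(event | condition) by counting equally likely outcomes."""
    given = [f for f in FAMILIES if condition(f)]
    matching = [f for f in given if event(f)]
    return Fraction(len(matching), len(given))


def both_girls(family):
    return family == ("girl", "girl")


# Interpretation 1: we only know that at least one child is a girl.
print(conditional_probability(both_girls, lambda f: "girl" in f))     # 1/3
# Interpretation 2: we know that one specific child is a girl.
print(conditional_probability(both_girls, lambda f: f[0] == "girl"))  # 1/2
```

The only thing that changes between the two calls is the condition, which is exactly where the ambiguity in the original wording lives.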

The “Two Girls” conditional probability problem has caused a lot of discussion among mathematicians, and is a great example of how confusion can arise due to imprecise problem definition. It also illustrates the need for a certain intellectual humility. With over-confidence, our reasoning can be incorrect, and even if it isn’t, we may have made some assumptions which are not inevitable.

This post has explored the “Two Girls” conditional probability problem using Python programming. I hope you found it interesting and helpful.