How Random am I?
Human vs. Computer Random Number Generation
I've always thought I could be a good random number generator. Let's compete against an actual random number generator and see how random we really are.
To make patterns easier to detect, I chose numbers from 1–5 instead of 1–100. With fewer possible values, statistical tests should reveal any bias much more quickly. I generated 500 numbers using Python's pseudorandom number generator and then manually entered 500 numbers myself.
Computer-generated sequence
import random
computer_list = [[1, 4, 3, 1, 4, 5, 2, 1, 4, 1, 2, 3, 5, 4, 1, 5, 4, 2, 3, 5, 5, 5, 1, 4, 2, 4, 1, 4, 1, 5, 1, 2, 3, 3, 4, 2, 5, 5, 3, 5, 1, 4, 3, 2, 5, 5, 5, 3, 1, 1, 5, 4, 3, 4, 1, 1, 1, 2, 1, 3, 4, 3, 3, 4, 2, 2, 2, 5, 5, 2, 4, 1, 1, 1, 1, 5, 2, 2, 3, 4, 4, 3, 5, 5, 2, 4, 2, 4, 5, 5, 5, 5, 3, 5, 3, 1, 5, 3, 1, 3, 4, 4, 1, 4, 4, 2, 2, 4, 1, 3, 4, 2, 5, 2, 4, 5, 1, 3, 2, 1, 1, 3, 4, 4, 5, 5, 2, 5, 5, 5, 5, 3, 3, 4, 2, 2, 4, 5, 5, 3, 3, 4, 3, 2, 5, 4, 2, 4, 4, 2, 1, 4, 2, 3, 2, 4, 2, 5, 5, 1, 5, 4, 1, 2, 2, 1, 2, 5, 3, 2, 4, 2, 4, 5, 1, 5, 5, 3, 4, 1, 2, 3, 1, 4, 5, 5, 3, 5, 1, 2, 3, 5, 3, 1, 1, 5, 2, 4, 4, 3, 3, 3, 2, 3, 5, 3, 5, 5, 3, 5, 1, 3, 5, 2, 1, 2, 1, 2, 4, 5, 1, 5, 1, 4, 5, 2, 2, 3, 5, 2, 1, 4, 4, 4, 5, 5, 2, 1, 1, 4, 5, 4, 1, 1, 2, 4, 4, 5, 3, 3, 3, 1, 4, 2, 3, 4, 5, 4, 3, 1, 2, 5, 4, 3, 3, 2, 3, 5, 5, 4, 1, 4, 2, 2, 2, 4, 2, 3, 3, 1, 1, 4, 1, 3, 5, 2, 3, 3, 3, 1, 1, 3, 3, 3, 1, 5, 5, 1, 4, 1, 2, 2, 3, 1, 1, 4, 2, 4, 2, 2, 1, 4, 3, 5, 2, 4, 3, 5, 2, 1, 1, 3, 3, 2, 1, 5, 3, 4, 4, 5, 4, 1, 4, 4, 3, 4, 3, 2, 4, 5, 2, 3, 3, 3, 1, 4, 2, 3, 3, 3, 3, 3, 5, 5, 4, 4, 1, 5, 4, 5, 5, 3, 1, 5, 4, 2, 3, 5, 5, 3, 2, 4, 2, 1, 4, 1, 4, 5, 3, 4, 5, 1, 2, 2, 3, 4, 4, 2, 3, 2, 5, 5, 5, 4, 3, 4, 1, 3, 3, 5, 4, 5, 4, 4, 1, 4, 1, 1, 3, 2, 1, 3, 1, 2, 4, 5, 2, 2, 1, 5, 1, 3, 3, 2, 5, 1, 4, 2, 4, 4, 3, 3, 5, 2, 4, 4, 3, 1, 2, 5, 4, 2, 5, 4, 3, 4, 3, 1, 3, 1, 3, 4, 2, 1, 5, 2, 1, 4, 2, 3, 1, 1, 5, 5, 5, 1, 1, 4, 4, 5, 2, 3, 4, 4, 5, 4, 5, 1, 2, 1, 1, 2, 4, 5, 3, 4, 4, 4, 3, 3, 5, 3, 2, 3, 1, 4, 1, 2, 5, 2]]
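The call that produced this list is not shown. A minimal sketch of one way to draw 500 values from 1–5 with the standard `random` module follows; the use of `randint`, the seed, and the name `sample` are assumptions for illustration, not necessarily what was run here.

```python
import random

# Assumption: a fixed seed is used only to make the sketch reproducible.
rng = random.Random(42)

# randint(a, b) includes both endpoints, so this draws from {1, 2, 3, 4, 5}.
sample = [rng.randint(1, 5) for _ in range(500)]

print(len(sample), min(sample), max(sample))
```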
Human-generated sequence
human_list = [[1, 2, 3, 5, 2, 3, 1, 3, 2, 2, 3, 5, 3, 4, 3, 4, 1, 3, 2, 4, 2, 3, 3, 4, 4, 3, 5, 3, 4, 3, 3, 2, 1, 3, 2, 4, 3, 3, 4, 5, 3, 4, 2, 1, 3, 2, 3, 2, 1, 3, 4, 2, 3, 2, 3, 2, 3, 2, 4, 2, 4, 3, 4, 2, 4, 4, 3, 4, 5, 3, 3, 4, 3, 2, 1, 1, 1, 2, 1, 3, 3, 2, 4, 4, 3, 3, 2, 5, 3, 3, 4, 4, 4, 2, 3, 2, 3, 2, 1, 2, 3, 3, 1, 1, 1, 3, 2, 3, 2, 4, 3, 3, 4, 5, 3, 5, 3, 4, 3, 4, 2, 3, 2, 2, 3, 1, 4, 2, 4, 3, 5, 3, 5, 3, 4, 3, 4, 3, 4, 3, 4, 2, 3, 1, 2, 2, 1, 1, 3, 3, 2, 5, 4, 3, 5, 5, 3, 5, 3, 2, 2, 3, 4, 4, 5, 5, 3, 2, 1, 2, 4, 2, 2, 3, 5, 3, 5, 3, 5, 3, 5, 3, 5, 3, 4, 3, 4, 4, 4, 3, 4, 3, 4, 3, 4, 3, 3, 4, 3, 4, 2, 3, 2, 3, 1, 3, 2, 4, 3, 5, 3, 5, 3, 5, 3, 4, 4, 2, 3, 2, 3, 2, 4, 3, 4, 3, 4, 3, 4, 4, 3, 4, 3, 4, 1, 2, 2, 1, 2, 1, 3, 4, 4, 4, 5, 3, 4, 2, 3, 2, 3, 2, 5, 3, 5, 3, 4, 4, 3, 4, 3, 2, 3, 1, 1, 2, 1, 3, 2, 4, 3, 3, 4, 3, 4, 3, 4, 3, 5, 3, 5, 3, 5, 4, 5, 4, 5, 2, 2, 1, 3, 1, 2, 3, 3, 3, 4, 4, 5, 5, 2, 3, 4, 3, 2, 1, 2, 3, 4, 5, 2, 2, 2, 4, 4, 5, 4, 3, 2, 4, 2, 5, 3, 5, 3, 2, 2, 3, 2, 1, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 1, 3, 3, 4, 2, 4, 2, 1, 2, 2, 4, 3, 4, 3, 5, 3, 4, 2, 4, 2, 4, 1, 3, 2, 4, 3, 5, 2, 4, 2, 4, 2, 5, 2, 5, 3, 5, 3, 3, 4, 3, 5, 3, 5, 3, 5, 3, 4, 5, 3, 3, 4, 3, 4, 3, 4, 3, 4, 3, 3, 4, 4, 5, 5, 5, 5, 5, 3, 3, 2, 4, 4, 3, 5, 5, 3, 2, 3, 3, 3, 3, 3, 3, 3, 2, 3, 5, 2, 2, 2, 2, 3, 3, 3, 4, 5, 4, 3, 4, 3, 2, 3, 2, 2, 2, 1, 2, 2, 4, 4, 3, 4, 3, 5, 3, 3, 4, 3, 4, 3, 4, 3, 5, 3, 5, 4, 3, 2, 3, 2, 1, 2, 2, 1, 2, 1, 3, 2, 4, 3, 5, 3, 5, 3, 5, 4, 5, 4, 5, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4]]
Entering 500 numbers manually was surprisingly exhausting. Toward the end I stopped trying to be clever and just started typing whatever came to mind.
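One practical note: as printed above, both sequences are wrapped in an extra pair of brackets, so each is a one-element list holding the 500 values. Functions that count individual values expect a flat list. A minimal sketch of flattening one level of nesting, using a short stand-in rather than the full data (`flatten_once` and `nested_example` are illustrative names):

```python
from itertools import chain

def flatten_once(nested):
    """Flatten one level of nesting: [[1, 2], [3]] -> [1, 2, 3]."""
    return list(chain.from_iterable(nested))

# Short stand-in with the same shape as the lists above.
nested_example = [[1, 4, 3, 1, 4]]
flat_example = flatten_once(nested_example)
print(flat_example)  # [1, 4, 3, 1, 4]
```

With the real data this would amount to `computer_list = flatten_once(computer_list)`, or simply `computer_list[0]`.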
Shannon Entropy
The first metric I tested was Shannon entropy, which measures the uncertainty of a probability distribution.
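For a discrete random variable $X$,
$$H(X) = -\sum_{i} p(x_i) \log_2 p(x_i),$$
where $p(x_i)$ is the probability of outcome $x_i$.
For five equally likely outcomes, the theoretical maximum entropy is
$$H_{\max} = \log_2 5 \approx 2.3219 \text{ bits}.$$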
Using the entropy function below,
from collections import Counter
import math

def calculate_shannon_entropy(data):
    """Return the Shannon entropy (in bits) of a flat sequence of values."""
    counts = Counter(data)
    total = len(data)
    entropy = 0
    for count in counts.values():
        p = count / total  # empirical probability of this value
        entropy -= p * math.log2(p)
    return entropy
I obtained
Computer List Entropy: 2.3186
Human List Entropy: 2.1645
The computer sequence is within about 0.003 bits of the theoretical maximum of $\log_2 5 \approx 2.3219$ bits, while my sequence falls about 0.16 bits short. This suggests that I subconsciously favored some values over others.
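The two entropy values can be cross-checked without the raw sequences, using the value counts implied by the percentages reported in the frequency test below (for example, 18.80% of 500 is 94). This is a minimal sketch; `entropy_from_counts` is an illustrative helper, not the function used above.

```python
import math

def entropy_from_counts(counts):
    """Shannon entropy in bits of the empirical distribution given by counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Counts for values 1..5, derived from the reported percentages of 500.
computer_counts = [94, 91, 101, 110, 104]
human_counts = [39, 102, 173, 120, 66]

max_entropy = math.log2(5)  # about 2.3219 bits
for name, counts in [("computer", computer_counts), ("human", human_counts)]:
    h = entropy_from_counts(counts)
    print(f"{name}: H = {h:.4f} bits, {h / max_entropy:.1%} of maximum")
```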
Frequency Test
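Entropy summarizes the distribution in a single number but does not say whether the deviation from uniform is statistically significant, so I next performed a chi-square goodness-of-fit test against the uniform distribution.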
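The test statistic is
$$\chi^2 = \sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i},$$
where $O_i$ is the observed count of value $i$ and $E_i = 500/5 = 100$ is the count expected under a uniform distribution.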
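The code behind the frequency test is not shown. A minimal sketch of computing the goodness-of-fit statistic $\sum_i (O_i - E_i)^2 / E_i$ against a uniform expectation, assuming a flat list of integers from 1 to 5 (`chi_square_uniform` is an illustrative name):

```python
from collections import Counter

def chi_square_uniform(data, values=(1, 2, 3, 4, 5)):
    """Chi-square statistic of data against a uniform distribution over values."""
    counts = Counter(data)
    expected = len(data) / len(values)
    return sum((counts[v] - expected) ** 2 / expected for v in values)

# Toy check: a perfectly balanced sequence has a statistic of 0.
balanced = [1, 2, 3, 4, 5] * 4
print(chi_square_uniform(balanced))  # 0.0

# A sequence made only of 3s is as far from uniform as possible.
print(chi_square_uniform([3] * 20))  # 80.0
```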
The results were
--- Computer Frequency Test ---
Total elements: 500
Number 1: 18.80%
Number 2: 18.20%
Number 3: 20.20%
Number 4: 22.00%
Number 5: 20.80%
Chi-Square Statistic: 2.3400
--- Human Frequency Test ---
Total elements: 500
Number 1: 7.80%
Number 2: 20.40%
Number 3: 34.60%
Number 4: 24.00%
Number 5: 13.20%
Chi-Square Statistic: 106.1000
The bias is immediately obvious. I strongly favored 3, used 4 somewhat more often than expected, and rarely selected 1. One possible explanation is that pressing 1 requires my left pinky, making it slightly less natural to type repeatedly, although that may simply be post-hoc rationalization.
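A chi-square statistic of $\chi^2 = 106.1$ with four degrees of freedom is extraordinarily unlikely under a truly uniform random process: the critical value at the 5% significance level is about 9.49. The computer's statistic of 2.34 is comfortably below that threshold.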
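To put a number on that: for an even number of degrees of freedom the chi-square tail probability has a closed form, and for four degrees of freedom it reduces to $P(\chi^2 \ge x) = e^{-x/2}(1 + x/2)$. A minimal sketch using only the standard library follows; `chi2_sf_df4` is an illustrative name, and `scipy.stats.chi2.sf(x, 4)` would be the usual library route.

```python
import math

def chi2_sf_df4(x):
    """Upper-tail probability P(chi2 >= x) for 4 degrees of freedom."""
    half = x / 2
    return math.exp(-half) * (1 + half)

# Statistics as reported in the frequency test output.
for name, stat in [("computer", 2.34), ("human", 106.10)]:
    print(f"{name}: chi2 = {stat:.2f}, p = {chi2_sf_df4(stat):.3g}")
```

Under this formula the computer sequence's p-value is roughly 0.67, which is entirely consistent with uniformity, while the human sequence's is on the order of $10^{-22}$.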
Transition Matrix (Lag-1 Structure)
Uniform frequencies are only one aspect of randomness. A sequence can have perfectly balanced frequencies while still containing predictable patterns.
To investigate this, I computed the lag-1 transition matrix, whose entries estimate
$$P_{ij} = P(X_{t+1} = j \mid X_t = i).$$
For an ideal random generator, every row should be approximately
$$(0.2,\ 0.2,\ 0.2,\ 0.2,\ 0.2).$$
The transition matrix was computed using
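import numpy as np

def get_transition_matrix(data):
    # Count transitions i -> j between consecutive values (values are 1-5).
    matrix = np.zeros((5, 5))
    for i in range(len(data) - 1):
        matrix[data[i] - 1, data[i + 1] - 1] += 1
    # Normalize each row to sum to 1; assumes every value has at least one successor.
    row_sums = matrix.sum(axis=1)
    return matrix / row_sums[:, np.newaxis]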
The resulting heat maps make the difference visually apparent: the computer-generated sequence is close to uniform, while the human sequence develops noticeable preferred transitions between values.
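To show what a "preferred transition" looks like numerically, here is a minimal dependency-free sketch of the same lag-1 estimate on a toy sequence. The guard for values that never have a successor, and the names `transition_probs` and `toy`, are additions for illustration rather than part of the function above.

```python
def transition_probs(data, n_values=5):
    """Estimate P(next = j | current = i) for values 1..n_values."""
    counts = [[0] * n_values for _ in range(n_values)]
    for current, nxt in zip(data, data[1:]):
        counts[current - 1][nxt - 1] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # A value with no observed successor gets an all-zero row
        # instead of triggering a division by zero.
        probs.append([c / total if total else 0.0 for c in row])
    return probs

# Toy sequence that mostly alternates between 3 and 4.
toy = [3, 4, 3, 4, 3, 4, 3, 5]
for value, row in enumerate(transition_probs(toy), start=1):
    print(value, [round(p, 2) for p in row])
```

In this toy case the row for 3 puts 0.75 on 4 and the row for 4 puts all of its mass on 3, which is the kind of structure a uniform row of 0.2s would not show.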