Siege Rolls; Confirmation Bias my ARSE!!!

Displacement

I guess this problem is already resolved but here's a fun little matlab script for generating data samples comparable to OP's:

-------
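nrolls = 100;    % siege rolls per run
nruns = 1000;    % number of independent runs

data = randi([1, 14], nrolls, nruns);  % each column is one run, uniform integers on 1..14
mu = mean(data);                       % mean of each run (1 x nruns)
sigma = std(mu);                       % spread of the run means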
-------


In this case the "variable" is the sample mean of a run of 100 siege rolls, and then we take the standard deviation of all those means to see how much the average of 100 rolls varies from one run to the next. The standard deviation of the set of means converges on ~0.4 as the number of runs grows.
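
As a sanity check on that ~0.4, here's a small sketch of the closed-form value. It assumes each roll is an independent, uniform integer on 1-14 (the same assumption the randi call makes); the names nsides, sigma_roll and sigma_mean are just mine for illustration.

-------
nsides = 14;     % assumed: rolls are uniform integers on 1..14
nrolls = 100;    % rolls per run

% standard deviation of a single roll (discrete uniform on 1..nsides)
sigma_roll = sqrt((nsides^2 - 1) / 12);

% standard deviation of the mean of nrolls independent rolls
sigma_mean = sigma_roll / sqrt(nrolls);

fprintf('sigma of one roll: %.3f, sigma of the mean: %.3f\n', sigma_roll, sigma_mean);
-------

That works out to roughly 4.03 for one roll and 0.40 for the mean of 100, which lines up with the simulated value.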

Okay, so, in English? 100 random integers on 1-14 will have a mean that's within about 0.4 of the expected mean (7.5) roughly 68% of the time. They'll be even further from the expected mean 32% of the time! Which means you'll have an average of about 7.1 or below, or 7.9 or above, 32% of the time. That's quite often!
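
If you'd rather count than trust the 68% rule of thumb, a sketch like this tallies how many simulated runs land more than one standard deviation away from 7.5. Same assumption of independent uniform rolls; frac_outside and the larger nruns are my own choices, not anything taken from the game.

-------
nrolls = 100;
nruns = 20000;   % more runs than before so the fraction is less noisy

data = randi([1, 14], nrolls, nruns);   % each column is one run
mu = mean(data);                        % mean of each run
sigma = std(mu);                        % spread of the run means

% fraction of runs whose mean is more than one sigma away from 7.5
frac_outside = mean(abs(mu - 7.5) > sigma);

fprintf('sigma = %.3f, fraction outside one sigma = %.3f\n', sigma, frac_outside);
-------

The fraction should come out somewhere around 0.3, give or take a bit from run to run.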

And the number of 14's in a set of 100 rolls is just a binomial random variable, so the odds of getting exactly two of them follow directly. Here's another Matlab script to plot the distribution over a range of possible outcomes (numbers of 14's rolled):

-------
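nrolls = 100;               % rolls in the set
outcomes = 0:20;            % counts of 14's to plot
p = 1/14;                   % chance a single roll is a 14
q = 13/14;                  % chance it is not
v = zeros(size(outcomes));  % preallocate the probabilities
for k = 1:length(outcomes)
    % binomial probability of exactly outcomes(k) 14's in nrolls rolls
    v(k) = nchoosek(nrolls, outcomes(k)) * p^outcomes(k) * q^(nrolls - outcomes(k));
end

stem(outcomes, v)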
-------

Here's an example of that plot:

https://postimg.org/image/a4m90pjwz/


So at around a 0.02 (2%) probability, rolling exactly two of one number is a little unlikely. But a 2% chance isn't nothing! And getting a much lower number than you'd expect (0, 1, 2, 3 or 4) has a pretty decent chance of happening (about 15% all together).
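
Those two numbers can be pulled straight out of the binomial formula instead of being read off the plot. A quick sketch, again assuming a 1/14 chance per roll; p_two and p_low are names I made up for this.

-------
nrolls = 100;
p = 1/14;        % assumed chance that a single roll is a 14
q = 1 - p;

counts = 0:4;    % "much lower than expected" numbers of 14's
pk = zeros(size(counts));
for idx = 1:length(counts)
    % binomial probability of exactly counts(idx) 14's
    pk(idx) = nchoosek(nrolls, counts(idx)) * p^counts(idx) * q^(nrolls - counts(idx));
end

p_two = pk(3);   % exactly two 14's (count 2 is the third entry)
p_low = sum(pk); % four or fewer 14's

fprintf('P(exactly 2) = %.4f, P(4 or fewer) = %.4f\n', p_two, p_low);
-------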

Anyway, we spent a lot of time analyzing data from the game but not a lot of time analyzing our null hypothesis (what would it look like if it was actually random?). So that's worth pointing out!