unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Eighth Test

I decided to redo the 5% data using the event to change RR rather than changing the catholic tolerance. This should be a good consistency check. So, once more I started from the original file, and adjusted tolerances to get -5% revolt risk everywhere. I then fired the event that raises the revolt risk by 5% twice, so that the revolt risk is 5% everywhere that has French culture, no nationalism, and is not the capital. I ran the same 13 tests from the console. The results (forts that fell to rebels) are:
Test 1: Alsace
Test 2: Bretagne, Lyonnais, Picardie
Test 3: Dauphine
Test 4: None
Test 5: Languedoc, Maine, Morbihan
Test 6: None
Test 7: Bretagne, Champagne, Savoie, Vendee
Test 8: None
Test 9: Dauphine, Provence
Test 10: Savoie
Test 11: Maine, Morbihan
Test 12: Limousin, Lorraine
Test 13: Bretagne, Dauphine

This gives me 273 tests with minimal forts at 5% revolt risk. Seventeen times the fort fell, for a probability of 6.2%, which is entirely consistent with the result from changing the catholic tolerance where there were 16/275 instances of the fort falling or 5.8%. Combining all my tests with minimal forts and 5% RR, Pf is 38/600 or 6.3%. The 95% confidence range for Pf(5%, minimal fort) is 4.3% to 8.3%.
This confidence interval has some interesting implications. It excludes 4%, which is what my model of Pf=2*(RR-3) would give. It excludes the hypothesis that Pf for 5% and 10% RR are the same, which is no surprise. It also excludes the hypothesis (from my data) that either 2% or 3% RR has the same Pf as 5% RR. It matches Fat's confidence interval for 5% RR (3.4%-7.4%) very nicely.
My new 'best guess' is that Pf=2*(RR-2.5). This fits the results very nicely. It would predict a 1% chance at 3% RR, where Fat's 95% confidence interval is 0.8%-1.8%, and mine is <4.4%. And yes, I have no idea at all why they would have chosen 2.5%, I'm just trying to salvage my model :)
The 7% results are still very strange. The next thing I'll do is repeat those using the event to change RR. I also want to have a try at 3% to see if I can confirm Fat's results.
 
Oct 27, 2002
1.075
0
Visit site
After realizing that revoltrisk is recalculated after the event is finished, I modified my setup for 150 revolts.

The results:
Code:
2	0;0;0;0;0;0;0;0		0.0%
3	1;3;2;2;4;8;2;1		1.9%
4	8;4;4;3;5;4;7;7		3.5%
5	6;5;8;13;7;8;7;13	5.6%
6	9;15;11;17;10;22;6;9	8.3%
7	15;19;9;19;11;11;20;20	10.3%
8	7;17;13;19;21;12;18;19	10.5%
9	21;23;25;15;27;20;17;23	14.3%
10	22;36;23;28;24;26;21;25	17.1%
11	25;27;26;29;32;35;28;27	19.1%
12	35;28;24;27;23;31;31;31	19.2%
13	42;33;39;35;38;32;41;42	25.2%
14	29;34;30;33;33;31;32;35	21.4%
15	36;38;40;41;46;39;38;37	26.3%
16	43;44;36;43;45;38;31;43	26.9%
17	47;42;41;47;41;46;38;32	27.8%
18	53;40;57;42;44;40;50;52	31.5%
19	50;57;49;58;43;54;51;57	34.9%
20	65;46;70;52;52;52;54;55	37.2%
25	65;67;55;65;67;61;68;58	42.2%
30	63;62;50;61;70;67;55;55	40.3%
35	64;65;52;62;65;60;64;66	41.5%
40	61;62;70;64;53;68;73;69	43.3%
First column is RR. Following eight columns are number of forts that fell (out of 150 revolts). The last column is Pf in % based on 1200 revolts.
The relationship looks linear in the region between 2 and 30, so I calculated the coefficients for this relationship:
Code:
Pf = a*RR + b
Pf = 2.00*RR - 4.12 ;  R^2 = 0.98
approx
Pf = 2*( RR - 2 )
I used only values from 3-20 RR (both values included). This predicts that cutoff at 40% Pf is reached for 22% RR.
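As a rough check of the fit above, here is a minimal sketch of an ordinary (unweighted) least-squares line through the posted Pf values for RR from 3 to 20. The function name `linear_fit` is made up for illustration, and it is an assumption that the original fit was unweighted; the input values are the rounded percentages from the table.

```python
# Pf values (in %) as posted for RR = 3..20, each based on 1200 revolts.
rr = list(range(3, 21))
pf = [1.9, 3.5, 5.6, 8.3, 10.3, 10.5, 14.3, 17.1, 19.1,
      19.2, 25.2, 21.4, 26.3, 26.9, 27.8, 31.5, 34.9, 37.2]


def linear_fit(xs, ys):
    """Ordinary (unweighted) least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


a, b = linear_fit(rr, pf)
print(f"Pf = {a:.2f}*RR {b:+.2f}")
# RR at which the fitted line would reach a 40% cap (assumed cap value).
print(f"Pf reaches 40% at RR = {(40 - b) / a:.1f}%")
```

Because the inputs are rounded to one decimal, the coefficients may differ slightly in the last digit from those quoted.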

That's about it. Now let's get to things I don't like.
First, the upper limit cutoff seems to be consistently (and slightly) higher than 40%. 40% is a round and reasonable number, but I'm unconvinced that the data support it.
Second, values between 11% and 16% RR do not look nice (they look 'bumpy') on the graph. This is most visible from the fact that for the data Pf(13%)>Pf(14%). The predicted linear relationship says otherwise. I don't know why this is so. Maybe more revolts are needed, but the number of forts that fell for 13% RR is consistently higher than for 14% in each 'experiment'.

I'll do the region between 20% - 25% RR in order to find at which RR the cutoff is reached. Of course, I hope it will be 22% RR in agreement with the linear prediction.

With all that said, I'm satisfied with the results and I'm ready to move on to fort level 2. Maybe, we'll be smarter after we acquire some data for level 2 forts.

Comments and ideas are more than welcome.
I don't feel like posting a graph of the data and the fit. Sorry.
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Quick comments
-I'd like to work out some confidence limits to put something numerical on your "dislikes". I agree that it looks a bit ugly at first glance.
-On higher forts, I would assume that the relationship is linear, and start out testing (say) level 4 forts at 10% RR or so (well away from either 22% or 2%). It should then be simple to show that the reduction in Pf at 10% is the same for other revolt risks.
-Sorry I haven't done much testing of my own. Was away and have been doing other things.
 

unmerged(3931)

General
May 19, 2001
2.032
0
Visit site
Isaac Brock said:
Combining all my tests with minimal forts and 5% RR, Pf is 38/600 or 6.3%. The 95% confidence range for Pf(5%, minimal fort) is 4.3% to 8.3%.
With so many trials, I do not understand why your confidence range is so wide. Isn't there a statistical formula for predicting the expected value of a number of yes/no outcomes (like coin tosses) that can be applied?
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Yippee, a math question! :)

[math]
Part of the reason is that I'm calling for a 95% confidence level. Often you'll see research using a 50% confidence level, which would give me a band that is about 3 times smaller. The standard deviation for the binomial distribution (if you have enough trials to assume the central limit theorem) is simply
sigma = sqrt(x(1-p))
where x is the number of successes, and p is the (true) probability. We use x/n, where n is the number of trials, as our best guess for the true probability.

So in the case you quoted, x=38, n=600. Our best guess for p is 6.33%. Therefore the standard deviation is sqrt(38*(1-6.33%)) = 5.97 tests. Divide this by n to get 1.0% as the standard deviation. From the normal distribution we know that to go to 95% confidence we need to use two sigma. (Actually it's 1.96, but I've used 2 consistently in all the posts here). So our range is +/- 2%, and the 95% confidence limit is 4.3%-8.3%.

QED
[/math]
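The calculation above can be sketched in a few lines. This is only an illustration of the normal approximation described in the post; the function name `binomial_ci` is hypothetical, and the default of two sigma follows the convention used in this thread rather than the exact 1.96.

```python
import math


def binomial_ci(successes, trials, z=2.0):
    """Normal-approximation confidence range for a binomial proportion.

    sigma is sqrt(x*(1-p))/n, which equals sqrt(p*(1-p)/n) since x = n*p.
    """
    p = successes / trials
    sigma = math.sqrt(p * (1 - p) / trials)
    return p, p - z * sigma, p + z * sigma


p, lo, hi = binomial_ci(38, 600)
print(f"Pf = {p:.1%}, range {lo:.1%} to {hi:.1%}")
```

The same function can be reused for any of the "x forts fell out of n revolts" results in the thread, as long as x is not too close to zero (the approximation breaks down there).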

So in plain words I'm requiring a high level of confidence which means I end up with wide confidence bands.
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
First, I want to apologize ahead of time. There will be a lot of statistical mumbo jumbo in this post. I’ll put what I think are plain English summaries in bold

Next, I’ve added 95% confidence intervals to Fat’s data, based on the uncertainties in the binomial distribution. The results are (I’ll discuss the “p-value” later):

Code:
Revolt Risk  Pf    Lower Limit    Upper Limit   p-value
      2%    0.0%      0.0%           0.2%       100.0%
      3%    1.9%      1.1%           2.7%        83.3%
      4%    3.5%      2.4%           4.6%        34.6%
      5%    5.6%      4.3%           6.9%        53.0%
      6%    8.3%      6.7%           9.8%        75.3%
      7%   10.3%      8.6%           12.1%       70.4%
      8%   10.5%      8.7%           12.3%        9.0%
      9%   14.3%     12.2%           16.3%       80.4%
     10%   17.1%     14.9%           19.3%       31.9%
     11%   19.1%     16.8%           21.4%       34.0%
     12%   19.2%     16.9%           21.4%       46.3%
     13%   25.2%     22.7%           27.7%        1.1%
     14%   21.4%     19.0%           23.8%        2.9%
     15%   26.3%     23.7%           28.8%       84.4%
     16%   26.9%     24.4%           29.5%       39.7%
     17%   27.8%     25.2%           30.4%        9.4%
     18%   31.5%     28.8%           34.2%       70.9%
     19%   34.9%     32.2%           37.7%       50.5%
     20%   37.2%     34.4%           40.0%       40.3%
     25%   42.2%     39.3%           45.0%       12.9%
     30%   40.3%     37.4%           43.1%       86.0%
     35%   41.5%     38.7%           44.3%       29.2%
     40%   43.3%     40.5%           46.2%        2.0%

Fat said:
The relationship looks linear in the region between 2 and 30, so I calculated the coefficients for this relationship:
Code:
Pf = a*RR + b
Pf = 2.00*RR - 4.12 ;  R^2 = 0.98
approx
Pf = 2*( RR - 2 )
I used only values from 3-20 RR (both values included). This predicts that cutoff at 40% Pf is reached for 22% RR.
I did the same fit. My result for the slope is 2.0029 and the RR to be subtracted is 2.044%.
Those numbers are awfully close to 2% RR risk and a slope of 2, so I assumed that the correct function is indeed 2*(RR-2). There are 19 data points and (arguably) 2 free parameters. (Arguably because we’ve dropped the data for points above 25% so we’ve made an assumption of a cap there, and because we haven’t tested revolt risk below 2%. Also arguably because I’m not using the fit for the slope and intercept, but rather assumed values.) Using the known uncertainty for each point I get a Chi-squared value for this fit of 22.93. With 17 degrees of freedom the p-value for this Chi-squared is 15.1%. In other words there is a 15% chance that the data would be at least this far from the straight line fit as they are. That number is small enough to be a bit troubling, but large enough that no-one can realistically say that the data don’t support the fit.
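For anyone who wants to reproduce a p-value from a chi-squared value and its degrees of freedom without a statistics package, here is a minimal standard-library sketch. The function name `chi2_sf` is made up for this example; it uses the closed forms for 1 and 2 degrees of freedom and the standard recurrence for the upper incomplete gamma function, which is an implementation choice of this sketch and not necessarily what the spreadsheet did.

```python
import math


def chi2_sf(x, dof):
    """P(chi-squared with `dof` degrees of freedom >= x), integer dof >= 1."""
    half = x / 2.0
    if dof % 2 == 0:
        q, k = math.exp(-half), 2              # closed form for dof = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for dof = 1
    while k < dof:
        # Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1)
        k += 2
    return q


# Chi-squared of 22.93 with 17 degrees of freedom, as quoted above.
print(f"p-value = {chi2_sf(22.93, 17):.1%}")
```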

Although the data have a couple of points that are odd for the Pf=2*(RR-2) model, all of them can be explained by this model

First, the upper limit cutoff seems to be consistently (and slightly) higher than 40%. 40% is a round and reasonable number, but I'm unconvinced that the data support it.
I’d make a stronger statement. The data do not agree with a 40% cutoff.
-The 95% CL range for the 40% RR point excludes a 40% cutoff
-The best estimate in cases like this is to exclude points that may be below or at the cap (i.e., the 25% RR point) and throw the rest together. From your 30%, 35% and 40% data there are 1501 cases of the fort falling out of 3600 trials. The 95% CL for these results is 43.34%>Pf>40.05%. As such 40% is (just) excluded at the 95% confidence level.
-Add in the 25% point (which is dodgy) and the 95% CL range is 43.24%>Pf>40.39%
-When I sum all my results for 25% RR and higher and minimal forts I have an estimated Pf of 294/689. The 67% CL range here is 40.8%-44.6%. So even my small number of trials excludes 40% at the 67% confidence level.
-It is true that your previous data set had a very nice mean of 39.97% for all RR over 20%. However, if I exclude the 100% RR point (more on that later) I get a 95% CL range of 39.4%-43.3% (1033/2500 trials). This does not quite exclude 40% at the 95% confidence level. For the 100% RR point the range is 29.0% - 37.4%. The probability that this point would be so far away from all the others (40%, 50%, 60%, 200%, 400%) is 0.08%. As such the data show that Pf at 100% is NOT the same as the average of all the other revolt risks, and it is appropriate to drop this point when calculating the mean.
-If I add all of your results and my results together (again, excluding the 100% RR point) there are 7989 trials and 3334 successes. Pf is 41.73% with a standard deviation of 0.55%. Although this requires a fair number of assumptions, it gives a p-value for the cap being 40% of 0.17%.
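The pooled test in the last bullet can be sketched as a simple two-sided z-test against a hypothesised cap of 40%. The function name `two_sided_p` is hypothetical, and treating all 7989 trials as independent draws from one binomial is the same "fair number of assumptions" mentioned above.

```python
import math


def two_sided_p(successes, trials, p0):
    """Two-sided normal-approximation p-value for a proportion equal to p0."""
    p_hat = successes / trials
    sigma = math.sqrt(p_hat * (1 - p_hat) / trials)
    z = (p_hat - p0) / sigma
    return p_hat, sigma, math.erfc(abs(z) / math.sqrt(2))


p_hat, sigma, p_value = two_sided_p(3334, 7989, 0.40)
print(f"Pf = {p_hat:.2%} +/- {sigma:.2%}, p-value for a 40% cap = {p_value:.2%}")
```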

In other words the three independent tests all point towards a cap that is close to Pf=42%. In none of the data sets is 40% consistent at 67% CL, the most recent test rules it out with 95% certainty and the previous one with better than 90% certainty. Combined they rule it out at a 99.8% probability.

The cap is not at 40%, but is most likely (2/3 chance) between 41% and 42.5%. However Pf(100%) is not consistent with the cap.

Second, values between 11% and 16% RR do not look nice (they look 'bumpy') on the graph. This is most visible from the fact that for the data Pf(13%)>Pf(14%). The predicted linear relationship says otherwise. I don't know why this is so. Maybe more revolts are needed, but the number of forts that fell for 13% RR is consistently higher than for 14% in each 'experiment'.
I think there is some small cause for concern here. The (95%) confidence intervals for all these points do overlap, so the number of tests isn’t decisive. However, I looked at all of the pairs of points between 10% and 16% RR and worked out the probability that the difference between the pairs would be as much as it is compared to the expected 2*(RR1-RR2). This is a total of 21 pairs, and they were chosen because they looked bad, so we should expect p-values at about the 1% level randomly. However, there are 4 pairs that have very low p-values, 10% and 14% with p=0.04%, 11% and 14% with p=0.05%, 13% and 14% with p=0.09%, and 13% and 16% with p=0.05%. The actual numbers are

Pf(14%)-Pf(10%)= 4.3%, expected =8%
Pf(14%)-Pf(11%)= 2.3%, expected =6%
Pf(14%)-Pf(13%)=-3.8%, expected =2%
Pf(16%)-Pf(13%)= 1.8%, expected =6%

Each of these, taken alone, could be expected to happen less than 0.1% of the time. That said, there are 506 pairs of data points, so we should expect to find one pair with a p-value of about 0.2%.
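A minimal sketch of how one of these pair comparisons can be computed, taking the 13% and 14% points as the example. The function name `pair_p_value` is made up, and using the observed proportions (rather than the model values) for the variance is an assumption of this sketch; the post does not say which was used.

```python
import math


def pair_p_value(pf_low, pf_high, expected_diff, trials=1200):
    """Two-sided p-value that Pf(high RR) - Pf(low RR) is this far from expected.

    pf_low and pf_high are observed proportions, each based on `trials` revolts.
    """
    var = pf_low * (1 - pf_low) / trials + pf_high * (1 - pf_high) / trials
    z = ((pf_high - pf_low) - expected_diff) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


# Pf(13%) = 25.17%, Pf(14%) = 21.42%, expected difference 2*(14-13) = 2%.
p = pair_p_value(0.2517, 0.2142, 0.02)
print(f"p-value for the 13%/14% pair: {p:.2%}")
```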

This sort of analysis is tricky because it’s hard to interpret. A better gauge of whether anything is wrong here is the linear fit, which showed a 15% chance of the data being randomly at least as far off the fit as they are. What I think I’ve shown here is that most of the data that explain why the odds are as low as 15% are in the 11%-16% data points.

The p-values listed in my table are the probability for each point that it would be at least as far from the fit as it is. As you can see there are only 3 points with really low p-values: 13% RR, 14% RR, and 40% RR. Changing the cap from 40% to 42% gets rid of the problem with the 40% RR data. Again there are 23 data points so we expect to get p-values of around 4%.

If something is wrong with the Pf=2*(RR-2) model it is most likely to be found in the 11%-16% RR range

Finally I tried fitting all the data at once. I varied the slope (from 2), the RR at which Pf is zero (from 2%) and the cap (from 40%). The best fit I could find was
Pf=MIN ( 1.991*(RR-2.015%) , 41.8%)
This had a chi squared of 25.1 with 20 degrees of freedom (23 data points less 3 fitted parameters). The p-value for this fit is 19.6%, so the odds that the data could be at least this far from the fit through chance are 20%. This isn’t entirely comfortable, but I would consider this an acceptable fit.

When I apply the fit with a revised cap of 42% to my data set (RR of 2%, 4%, 5%, 7%, 10%, 14%, 21%, 25%, 40%) I get a p-value of 35%. Almost all of the discrepancy lies with Pf(7%). Applied to Fat’s first data set I get a p-value of 22%, after excluding the 100%RR data point, which does not fit the model. Almost all of the remaining discrepancy comes from his 7% RR point (which is inconsistent with mine).

We are very close to a consistent model that will fit all of the results. The discrepancies lie at 100% RR (tested once) and possibly at 7% RR (tested 3 times).

The results at 7% RR are:
IB1 3.0%-8.7%
F1 11.4%-17.8%
F2 8.6%-12.1%

Which is odd. My next test will be to run this one again.

I'll do the region between 20% - 25% RR in order to find at which RR the cutoff is reached. Of course, I hope it will be 22% RR in agreement with the linear prediction.

I predict 23% :)
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Ninth Test

I've now redone the 7% data. Once more I'm using the event to change RR rather than changing the catholic tolerance. I ran the same 13 tests from the console. The results (forts that fell to rebels) are:
Test 1: Champagne, Maine, Orleannais
Test 2: Cevennes, Lorraine, Normandie
Test 3: Berri, Savoie
Test 4: Alsace, Dauphine, Normandie, Picardie
Test 5: Bretagne, Dauphine, Vendee
Test 6: Berri, Provence
Test 7: Armor, Bretagne, Limousin, Maine, Picardie
Test 8: Alsace, Bearn, Bourgogne, Orleannais
Test 9: Bearn, Gascogne, Limousin, Provence, Vendee
Test 10: Alsace, Poitu
Test 11: Bretagne, Languedoc, Normandie
Test 12: Gascogne, Languedoc, Normandie
Test 13: Armor, Auvergne, Gascogne

This gives 273 tests with minimal forts at 7% revolt risk. Thirty-one times the fort fell, for a probability of 11.4%, which is not the same as the previous test where I had 16 out of 274, or 5.8%. The discrepancy is 5.5% +/- 2.4% (one sigma), so the two results differ at almost the 99% confidence level.

Why the discrepancy? I'm not sure. It doesn't come from the rebels that were already on the map in the original test. It is perhaps possible that for some of the tests I might have loaded the 5% save file, but I regularly checked the revolt risk before firing the test event. I also wrote them down prior to one particular test.

So while I'd like to throw out the first set of data I have no justifiable reason to do so. So overall I now have 547 trials at 7% and a minimum fort, with the fort falling 47 times. The 95% CL range is therefore 6.2% to 11.0%, which is consistent with both of Fat's tests at 7% and with the 10% predicted by the working guess.

Applying the revised data with the working model I get a Chi squared of 2.71 with 9 degrees of freedom or a p-value of 97% (there is a 97% chance that randomly distributed data would be at least this far from the model).

I'm beginning to believe that we now have enough data to stop calling the model a "guess" and start calling it a "fit". My only concerns are
1) Why I didn't see a single fort fall at 3% RR. With 6 trials there is a 75% chance that I should have seen one fort fall. This is a pretty minor point, Fat's data clearly show that forts fall at 3% RR
2) What's going on with Fat's 100% RR data point? I don't think this is too important though - if the fit only works for RR below 50% I think that's all you need to play the game.
3) The weird behaviour Fat sees at 13%, 14% and 16%. I do have 286 trials with 74 successes at 14% RR. As such my result is Pf of 25.9% with a 95% CL range of 20.7% to 31.1%. Fat's result was 21.4% with a range of 19.1% to 23.8%. My numbers are closer to the fit than his. I'm inclined to discount the problems with his results in this range, but it's hard to be sure that's the right approach.
 
Oct 27, 2002
1.075
0
Visit site
Before I start with the topic, one more thing. I tested again whether revoltrisk is recalculated within the event. In the same event I raised RR to 40% and fired revolts for 150 provinces. Not a single fort fell to the rebels. So, the conclusion is that RR is NOT recalculated within the event.

About the topic. I filled the 20-25% RR gap to find the upper limit cutoff and also tested 0% and 1% RR.
Code:
0	0;0;0;0;0;0;0;0		0.00	0
1	0;0;0;0;0;0;0;0		0.00	0
2	0;0;0;0;0;0;0;0		0.00	0
3	1;3;2;2;4;8;2;1		1.92	2
4	8;4;4;3;5;4;7;7		3.50	4
5	6;5;8;13;7;8;7;13	5.58	6
6	9;15;11;17;10;22;6;9	8.25	8
7	15;19;9;19;11;11;20;20	10.33	10
8	7;17;13;19;21;12;18;19	10.50	12
9	21;23;25;15;27;20;17;23	14.25	14
10	22;36;23;28;24;26;21;25	17.08	16
11	25;27;26;29;32;35;28;27	19.08	18
12	35;28;24;27;23;31;31;31	19.17	20
13	42;33;39;35;38;32;41;42	25.17	22
14	29;34;30;33;33;31;32;35	21.42	24
15	36;38;40;41;46;39;38;37	26.25	26
16	43;44;36;43;45;38;31;43	26.92	28
17	47;42;41;47;41;46;38;32	27.83	30
18	53;40;57;42;44;40;50;52	31.50	32
19	50;57;49;58;43;54;51;57	34.92	34
20	65;46;70;52;52;52;54;55	37.17	36
21	49;61;66;50;60;49;53;61	37.42	38
22	60;70;66;73;61;59;41;57	40.58	40
23	68;52;66;62;62;60;70;54	41.17	42
24	72;62;64;56;64;61;58;58	41.25	42
25	65;67;55;65;67;61;68;58	42.17	42
30	63;62;50;61;70;67;55;55	40.25	42
35	64;65;52;62;65;60;64;66	41.50	42
40	61;62;70;64;53;68;73;69	43.33	42
First column is RR. Following eight are number of forts that fell in 150 revolts, giving a total of 1200 revolts. Next column is Pf estimated from the data. The last one is the current model proposed by Isaac:
Pf=0 for RR<=2
Pf=2*(RR-2) for RR in [3:22]
Pf=42 for RR>=23
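Written out as a function, the three-part model above looks like this. It is just a sketch of the proposed rule for minimal forts with whole-number RR; the name `pf_model` is made up, and the 42% cap is the working value from this thread, not a confirmed game constant.

```python
def pf_model(rr):
    """Proposed Pf in % for a minimal fort, given RR in whole percent."""
    # Zero at or below 2% RR, slope 2 in between, capped at 42%.
    return max(0, min(2 * (rr - 2), 42))


for rr in (2, 3, 10, 22, 23, 40):
    print(rr, pf_model(rr))
```

The outputs for these RR values correspond to the last column of the table.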

I fitted the data with a linear fit Pf=a*(RR-b) in the range [3:22] excluding both extremes. The coefficients are a=2.006 +/- 0.051 and b=2.06 +/- 0.31. This makes Isaac's model very plausible.

As a final touch here is a plot of the data and the linear fit:
eu2.gif
 
Oct 27, 2002
1.075
0
Visit site
Here is a preview of Pf for fort level 2.
Code:
3;0
4;0;0;0
5;0;2
6;6
7;9
9;14
40;65
First column is RR. Others are number of forts that fell in 150 revolts.

4% RR risk seems to be lower limit cutoff. Judging by the Pf for 40% RR, upper limit cutoff is still 42%.
So, my prediction is:
Pf_lev2=2*(RR-4)
There is no evidence for the slope of 2, but what else could it be... ;)
 
Oct 27, 2002
1.075
0
Visit site
I tested one more thing, but unfortunately it doesn't work.

I tried to increase revolt risk to 2.9% (a floating point number). Although the game loads fine and fires the event, it seems that the number read by the game is an integer. The tooltip and the RR shown in the province window indicated that it would be so.

I fired 450 revolts anyway, but no forts fell. According to our last assumptions 1.8% of forts were supposed to fall. It could have been just bad luck, but I doubt it.
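To put a number on "just bad luck": if Pf really were 1.8% (that is, 2*(2.9-2) under the current model, assuming the fractional RR was honoured), the chance of seeing no fort fall in 450 independent revolts can be sketched as below. The function name `prob_no_falls` is hypothetical.

```python
def prob_no_falls(pf, revolts):
    """Probability that no fort falls in `revolts` independent revolts."""
    return (1 - pf) ** revolts


p_zero = prob_no_falls(0.018, 450)
print(f"Chance of 0 falls in 450 revolts at Pf = 1.8%: {p_zero:.3%}")
```

A result this unlikely is consistent with the reading that the game truncated 2.9% to 2%.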

Anyway, I was hoping that I will be able to test more points in the 11-16% RR region in order to get more accurate picture.
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Wow, nice work. I think we really do have the answer for minimum forts.

I'm not surprised that fractional values don't work. There is presumably a fair amount of overhead for carrying around all the extra digits, so it makes sense to truncate the RR.

I haven't done any testing - wasted the long weekend doing other things instead of wasting the long weekend testing EU2 :). So instead I'll just provide some analysis for your data (which are better than mine anyway!).

1) I did a complete fit of all of your data from 2% to 40%. There are 3 free parameters, the slope (starts at 2) the offset on the RR (starts at 2%) and the cap for Pf (starts at 42%). The best fit is slope of 1.9935, offset of 2.016%, and cap of 41.61%. With those parameters the Chi squared for the fit is 25.75 with 24 degrees of freedom - in other words the deviation from the fit is very close to that expected for random data (the expected value for chi squared is the number of degrees of freedom). The probability that a randomly selected set of data would have this much deviation from the fit is 36.6%. So it looks great.

The 95% confidence limits for the free variables in this fit are:
slope = 1.946 - 2.041
offset = 1.910% - 2.12%
cap = 40.23% - 42.98%

Assuming that it's coded with round numbers (why wouldn't it be?) the only ambiguity is on the cap, which could easily be 41.5% or 42%. (Why choose either? I don't know, but I suspect we have a round number with some reduction per fort level).

I also took all of the points that are unambiguously over the cap (24%-40%) to have a separate estimate of the cap. This gives 6000 trials and 2502 successes. The best estimate of Pf from these data is 41.70%, with a 95% CL range of 40.43%-42.97%. So pretty much the same as before, and it uses mostly the same data so the result should be the same :). Overall, I think that the current model shows excellent predictive power.

2) I wanted to have another look at the 13% and 14% data. You have 8 trials, and for 7 of them Pf(13%)>Pf(14%). According to our fit, with 150 revolts the expected number of successes for 13% is 33 and the expected number for 14% is 36. We expect a difference of 3 successes between the two trials, but the standard deviation is 7.29 successes. As such, on any given trial the odds that the 13% test will give more successes than the 14% test are 34%. Therefore the probability that we will have the 13% higher than the 14% 7 out of eight times is 8*(0.34)^7*(0.66). (The factor of eight is because there are eight ways this can happen.) This gives a probability of 0.28%, or one in 360 trials. The p-value I found above was 0.09%. This is lower because we have several trials where Pf(13%) wasn't just higher than Pf(14%) but was quite a bit higher. Including all the real numbers makes this event look a little less likely. Anyway, per our model the data include 20 tests of adjacent revolt risks. So the odds that for one pair seven out of eight estimates of Pf(lower)>Pf(higher) by chance are about one in eighteen. That's a number that is troublingly low, but just within my 95% confidence limit. Not enough to torpedo the model, but enough to continue to pay attention.
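The arithmetic in point 2 can be sketched as follows, using the normal approximation for the difference of two binomial counts. The helper name `normal_cdf` and the variable names are made up for illustration; the model values 22% and 24% come from Pf=2*(RR-2) at 13% and 14% RR.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


n = 150                        # revolts per trial
p13, p14 = 0.22, 0.24          # model Pf at 13% and 14% RR
mean_diff = n * p14 - n * p13  # expected (14% count) - (13% count)
sd_diff = math.sqrt(n * p13 * (1 - p13) + n * p14 * (1 - p14))

# Chance that a single trial shows more falls at 13% than at 14%.
p_single = normal_cdf(-mean_diff / sd_diff)

# Chance that exactly 7 of 8 independent trials come out that way.
p_seven = math.comb(8, 7) * p_single ** 7 * (1 - p_single)

print(f"sd of difference = {sd_diff:.2f}")
print(f"single-trial chance = {p_single:.0%}")
print(f"7 of 8 = {p_seven:.2%}")
```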

4% RR risk seems to be lower limit cutoff. Judging by the Pf for 40% RR, upper limit cutoff is still 42%.
So, my prediction is:
Pf_lev2=2*(RR-4)
There is no evidence for the slope of 2, but what else could it be..
3) I agree that it will almost certainly turn out to be 2. But to add some numbers
-The 95% CL upper limit for Pf(4%, small fort) is 0.66%. That rules out 1%, and given that you've seen forts fall at 5%, I agree that the cutoff is 4%. Great, because it looks like the cutoff is simply 2%*(fort level).
-The 95% CL for the 40% result (where you have 65/150) is 35%-51%. To me this means that we have no real idea of where the cap might actually be. I would still bet on it being lower than 42% - why would 42% have been chosen? If I had to guess I would go for something like:
cap = 45%-3%* (fort level).
This is not yet ruled out by your data, but we'll see.

And for testing (as you are far ahead of me, and I expect it will stay that way) I would suggest that you focus on the points around 4% RR and 24% RR. It is quite reasonable to assume a straight line in between the two based on the data for minimal forts.
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Moving on (finally) to medium forts

I finally got up the initiative to run some more tests. I increased all the fort levels in France by 2 so that all are medium (except for Ile de France and Savoy). I then changed the religious tolerance and fired the RR event twice so that the RR is 7% in almost all provinces, 8% in the 4 non-culture core provinces (Bearn, Morbihan, Bretagne, Armor), 5% in Ile de France, 10% in Savoy, and 11% in Alsace.

I fired the revolt event 13 times. In the eleventh trial Bretagne fell to the rebels, but no other forts did.

This pretty much rules out the model that has a threshold of 2*(Fort Level). Under this model I'd expect Pf=2% at 7% RR for a medium fort, and 4% for 8% RR. In fact the upper limit for Pf(7% RR, medium fort) is 1.1% (0 forts fell out of 273 trials). I have 65 trials at 8% RR and only one fort fell. So the best estimate for Pf(8% RR, medium fort) is 1.5%, and the upper limit is 4.5% (95% CL). I'm guessing this will turn out to be 2%.
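When no forts fall at all, the two-sigma normal approximation gives a range of zero width, so an upper limit has to come from the exact binomial instead. A minimal sketch, assuming a one-sided 95% limit (the post does not state exactly how the 1.1% was obtained); the function name `upper_limit_zero` is hypothetical.

```python
def upper_limit_zero(trials, confidence=0.95):
    """Largest Pf still compatible with 0 falls in `trials` revolts.

    Solves (1 - Pf)**trials = 1 - confidence for Pf.
    """
    return 1 - (1 - confidence) ** (1 / trials)


print(f"0/273: Pf < {upper_limit_zero(273):.1%}")
```

For large trial counts this is close to the familiar "rule of three", 3/n.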

Together these suggest that the threshold for medium forts is 7% RR.

As an aside, for medium forts I have 0/13 at 11% RR and 3/13 at 12% RR.

Because speculation is more fun than running tests I hereby suggest that the threshold may be given by the sum of the existing fort level and all fort levels below it plus one.

In other words my guess for the threshold is
Code:
FL           Threshold
1                 2%
2                 4%
3                 7%
4                11%
5                16%
6                22%
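The guessed rule behind this table (sum of the fort level and all levels below it, plus one) can be written as a one-liner. This is only a sketch of the speculation above, not a confirmed game formula; the name `threshold` is made up.

```python
def threshold(fort_level):
    """Guessed RR threshold in %: 1 + 2 + ... + fort_level, plus one."""
    return sum(range(1, fort_level + 1)) + 1


for fl in range(1, 7):
    print(fl, f"{threshold(fl)}%")
```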

I'm going to try a test with medium forts at 9% RR next.
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
Making progress.

I've run two more tests. First I did 9% RR for Medium forts with the exact same procedure as before. The results for the 13 tests were:

Trial 1: None
Trial 2: Bourgogne, Lyonnais
Trial 3: Guyenne, Languedoc
Trial 4: None
Trial 5: Dauphine, Lorraine
Trial 6: Armor, Poitu, Vendee
Trial 7: Maine, Morbihan, Normandie, Savoie
Trial 8: None
Trial 9: None
Trial 10: Cevennes, Gascogne
Trial 11: None
Trial 12: Orleannais
Trial 13: Guyenne

I then set the same thing up for 11% RR on most provinces. The results for 13 tests were:

Trial 1: Morbihan
Trial 2: Gascogne, Lyonnais, Provence, Vendee
Trial 3: Lyonnais
Trial 4: Auvergne, Caux, Gascogne, Picardie, Poitu
Trial 5: Languedoc
Trial 6: Auvergne, Savoie
Trial 7: Provence
Trial 8: Auvergne, Berri, Provence, Vendee
Trial 9: Alsace, Armor, Bourgogne, Caux, Maine, Provence
Trial 10: None
Trial 11: Berri, Bourgogne, Guyenne, Lorraine
Trial 12: Alsace, Auvergne, Lyonnais
Trial 13: Languedoc, Nivernais

So the results I now have for medium forts (95% CL ranges) are:
Code:
RR          Result         Pf
7%          0/274         <1.1%
8%          1/65          <4.6%
9%          14/273        2.5%-7.8%
10%         2/52          <9.2%
11%         29/286        6.6%-13.7%
12%         5/65          1.1%-14.3%
13%         0/13          <20.6%

These show good agreement with my current guess (Pf=2*(RR-7%)). However, only the 7% RR point disagrees with the original guess (Pf=2*(RR-6%)). My best bet to nail this down is 8% RR, which I plan to do next. After that will be 30% to try to find the cap.
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
I've run the "8%" case for medium forts. This includes some trials at 9%, 11% and 12%. The results are:

Trial 1: Poitu
Trial 2: Alsace
Trial 3: Alsace
Trial 4: Bourgogne
Trial 5: None
Trial 6: Armor, Bearn, Maine
Trial 7: Armor
Trial 8: None
Trial 9: Guyenne
Trial 10: Picardie
Trial 11: Lorraine
Trial 12: Lorraine, Normandie
Trial 13: None

So the results I now have for medium forts (95% CL ranges) are:

Code:
RR          Result         Pf                     probability
7%          0/273         <1.1%                       0%
8%          9/329         0.9%-4.4%                   2.7%
9%          17/325        2.8%-7.7%                   5.2%
10%         2/52          <9.2%
11%         29/286        6.6%-13.7%                 10.1%
12%         7/78          2.5%-15.4%
13%         0/13          <20.6%

The agreement is excellent with the current guess (Pf=2*(RR-7)). While all the points but the 7% RR one continue to pass through the model that has a 6% cutoff, the match is far superior with the 7% cutoff. The best fit slope is actually 2.52. As a little inspection will show, a slope of 2.5 and a cutoff of 7% goes through all the high statistics points. However, with all the points so close together there isn't much sensitivity to this. Points at high RR will nail down the slope much better.

Next stops are 30%, 25%, and 20% RR. Hopefully these will help me find the slope and the cap.
 
Oct 27, 2002
1.075
0
Visit site
I finally finished the level 2 fort test.

Here is the table:
Code:
0	0;0;0;0;0;0;0;0		0
1	0;0;0;0;0;0;0;0		0
2	0;0;0;0;0;0;0;0		0
3	0;0;0;0;0;0;0;0		0
4	0;0;0;0;0;0;0;0		0
5	0;2;1;0;1;0;2;1		0.58	
6	6;6;3;5;2;4;0;6		2.67
7	9;12;2;7;5;11;10;5	5.08
8	12;8;9;10;10;12;12;12	7.08
9	14;13;12;10;16;11;15;13	8.67
10	16;16;14;19;18;13;15;15	10.5
11	20;22;22;19;27;21;20;16	13.9
12	15;28;22;24;21;21;17;30	14.83
13	28;21;32;21;24;26;21;26	16.58
14	27;25;27;27;31;29;28;26	18.33
15	32;32;30;31;24;36;32;24	20.08
16	29;30;43;42;26;28;44;28	22.50
17	35;34;33;38;30;29;34;33	22.17
18	37;32;39;39;37;39;51;43	26.42
19	38;36;40;34;47;39;43;48	27.08
20	56;53;48;51;52;58;52;46	34.67
21	52;48;46;54;38;43;54;62	33.08
22	46;46;65;51;48;52;51;55	34.50
23	55;54;51;55;62;52;48;46	35.25
24	57;53;56;69;58;70;61;57	40.08
25	71;67;57;65;69;62;63;64	43.17
30	75;66;52;66;74;66;60;64	43.58
35	57;56;52;55;59;56;71;48	37.83
40	65;62;60;63;54;60;68;56	40.67
First column is RR. Following eight are number of forts that fell in 150 revolts. The last one is the Pf estimated from the data.

Linear fit in the region [5%:23%] gives:
Code:
Pf = 1.98*RR - 9.06;	R^2=0.9871
approx
Pf = 2*(RR-4.5)
This region is chosen because it safely avoids both the lower limit cutoff and the top cap on Pf.

Of course there are a couple of strange points:
RR = 20%: too high compared with the linear fit expectation
RR = 35%: too low for the top cap limit
It's also difficult to see from the data what's the value of the top limit and for which RR the top limit is reached.
Maybe, Isaac will be in the mood to sort this out?

Next step for me will be the highest fort level (6).
 

unmerged(6159)

Field Marshal
Oct 23, 2001
9.458
1
Visit site
At your service :). I have the spreadsheet set up already, so it's pretty easy.

I’ve added 95% confidence intervals to Fat’s data, based on the uncertainties in the binomial distribution. The results are (I’ll discuss the “p-value” later):

Code:
Revolt Risk     Pf     Lower Limit   Upper Limit   p-value
      5%       0.6%        0.1%          1.0%        5.8%
      6%       2.7%        1.7%          3.6%        47%
      7%       5.1%        3.8%          6.4%        90%
      8%       7.1%        5.6%          8.6%        91%
      9%       8.7%        7.0%         10.3%        68%
     10%      10.5%        8.7%         12.3%        57%
     11%      13.9%       11.9%         15.9%        36%
     12%      14.8%       12.8%         16.9%        87%
     13%      16.6%       14.4%         18.7%        70%
     14%      18.3%       16.1%         20.6%        55%
     15%      20.1%       17.7%         22.4%        43%
     16%      22.5%       20.1%         24.9%        68%
     17%      22.2%       19.8%         24.6%        1.8%
     18%      26.4%       23.9%         29.0%        65%
     19%      27.1%       24.5%         29.7%        14%
     20%      34.7%       31.9%         37.4%        0.8%
     21%      33.1%       30.4%         35.8%        95%
     22%      34.5%       31.8%         37.2%        72%
     23%      35.3%       32.5%         38.0%        20%
     24%      40.1%       37.3%         42.9%        44%
     25%      43.2%       40.3%         46.0%        13%
     30%      43.6%       40.7%         46.5%        7.1%
     35%      37.8%       35.0%         40.6%        2.4%
     40%      40.7%       37.8%         43.5%        81%

Again I repeated your fit using all the values from 5% RR to 23% RR. I get pretty much the same results you do, a slope of 1.978 and a threshold of 4.58% RR. Just like last time I then assume that the slope is really 2 and the offset is really 4.5%, and that there are 2 independent parameters (the last one is still questionable). With 17 (assumed) degrees of freedom, that is 19 points less 2 parameters, the chi squared is 23.7. The p-value for this is 12.8%. Just like last time the p-value is high enough that we can’t say that the fit doesn’t work, but is small enough to be a little troubling.

More than 2/3 of the total chi squared comes from three points, RR=5%, RR=17%, and RR=20%. At 5% RR the model calls for a Pf of 1%, while the observed number is 0.6%. The expectation is very sensitive to the exact value of the threshold (4.5%), so this variation could easily be reduced if the threshold were nearer the fitted value (4.58% by my fit). At 17% RR we expect a Pf of 25%, but get 22.2%. At 20% RR we expect a Pf of 31%, but get 34.7%. For both these cases the model is outside the 95% CL. Still, a p-value of 12.8% isn’t terrible, and the model looks like it works.
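The chi squared arithmetic can be checked with a short script. This is a sketch under stated assumptions: each point's variance is taken from the observed binomial rate (the same basis as the confidence intervals), the model is Pf=2*(RR-4.5), and the degrees of freedom are the number of points minus two parameters. The counts are the row sums of Fat's table; the helper names are made up.

```python
import math

N = 1200  # 8 runs x 150 revolts per RR value
# Total forts fallen per RR (row sums of the level 2 table), RR = 5%..23%.
fallen = {5: 7, 6: 32, 7: 61, 8: 85, 9: 104, 10: 126, 11: 167, 12: 178,
          13: 199, 14: 220, 15: 241, 16: 270, 17: 266, 18: 317, 19: 325,
          20: 416, 21: 397, 22: 414, 23: 423}

def model_pf(rr):
    """Assumed model: Pf = 2*(RR - 4.5), returned as a fraction."""
    return 2.0 * (rr - 4.5) / 100.0

def chi2_sf(x, dof):
    """Upper tail of the chi-squared distribution, using the power series
    for the regularized lower incomplete gamma function."""
    a, h = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= h / (a + n)
        total += term
        n += 1
    return 1.0 - total * math.exp(-h + a * math.log(h) - math.lgamma(a))

contrib = {}
for rr, k in fallen.items():
    p_obs = k / N
    variance = N * p_obs * (1.0 - p_obs)  # binomial variance at the observed rate
    contrib[rr] = (k - N * model_pf(rr)) ** 2 / variance

chi2 = sum(contrib.values())
dof = len(fallen) - 2  # assumed: slope and threshold count as fitted parameters
p_value = chi2_sf(chi2, dof)

top3 = sorted(contrib, key=contrib.get, reverse=True)[:3]
share = sum(contrib[r] for r in top3) / chi2
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {100 * p_value:.1f}%")
print(f"largest contributions at RR = {sorted(top3)}, share = {100 * share:.0f}%")
```

Using the observed rate rather than the model rate for the variance is a choice; the model rate would give somewhat different numbers, especially at 5% RR where the counts are small.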

For the cap, there are only three points that are unambiguously above the cap: RR of 30%, 35%, and 40%. For these we have 1465 forts falling in 3600 trials, or Pf=40.7%. The 95% CL for the cap is 39.1%-42.3%. Each of the three points (30%, 35%, 40%) overlaps this range, so it’s reasonable to assume that they have the same Pf. There is a question about the 35% RR point, but I’ll get at that later on.
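For anyone who wants to redo the interval, here is a minimal sketch. It assumes the usual normal approximation to the binomial (p plus or minus 1.96 standard errors); the spreadsheet may use a slightly different method, so the last digit can differ. The function name is made up.

```python
import math

def binom_ci(k, n, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion."""
    p = k / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p, p - half, p + half

N = 1200  # revolts per RR value
above_cap = {30: 523, 35: 454, 40: 488}  # forts fallen (row sums of the table)

k_total = sum(above_cap.values())
n_total = N * len(above_cap)
cap, cap_lo, cap_hi = binom_ci(k_total, n_total)
print(f"pooled: {k_total}/{n_total} = {100 * cap:.1f}% "
      f"({100 * cap_lo:.1f}%-{100 * cap_hi:.1f}%)")

# Check that each individual interval overlaps the pooled one.
overlaps = {}
for rr, k in above_cap.items():
    p, lo, hi = binom_ci(k, N)
    overlaps[rr] = lo <= cap_hi and hi >= cap_lo
    print(f"RR {rr}%: {100 * p:.1f}% ({100 * lo:.1f}%-{100 * hi:.1f}%) "
          f"overlap={overlaps[rr]}")
```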

For minimal forts (2nd trial) the cap was 41.7% with a 95% CL range of 40.4%-43.0%. The difference between these two results is 1.01%, and the 95% CL on the difference is -1.06% to +3.08%. So there is a hint that the cap might have a small dependence on fort size, but since that range includes zero the data show excellent consistency with the hypothesis that the cap is independent of fort size.
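A rough way to check the interval on the difference, using only the rounded numbers quoted in the thread: treat the two measurements as independent and add the half-widths of their 95% ranges in quadrature. That is an assumption about the method, and the rounding means the last digit will not match exactly.

```python
import math

def diff_ci(mean_a, lo_a, hi_a, mean_b, lo_b, hi_b):
    """95% range for (a - b), combining the two half-widths in quadrature.
    Assumes the two results are independent and roughly normal."""
    half_a = (hi_a - lo_a) / 2.0
    half_b = (hi_b - lo_b) / 2.0
    half = math.hypot(half_a, half_b)
    d = mean_a - mean_b
    return d, d - half, d + half

# Minimal forts: 41.7% (40.4%-43.0%); level 2 forts: 40.7% (39.1%-42.3%).
d, lo, hi = diff_ci(41.7, 40.4, 43.0, 40.7, 39.1, 42.3)
print(f"difference = {d:.2f}%, 95% range {lo:+.2f}% to {hi:+.2f}%")
```

The range straddles zero, which is why the two caps count as consistent with each other.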

Assuming that the cap is the same for both fort sizes, we can combine all the trials (including Fat’s first data set, post 43). The total is 5166 forts falling out of 12600 trials. So Pf is 41.0% with a 95% CL range of 40.12%-41.44%. This sort of rules out my guess of 41.5%. I can then compare each individual test with the overall mean. For the original RR=100% test Pf was 33.2%, and the p-value is 0.029%. There is definitely something wrong with that result, so I’ll remove it from the data used to calculate the cap. This revision gives 5000 forts falling in 12100 trials, so Pf=41.32%, with a 95% CL range of 40.43%-42.22%. With this mean the only point that is inconsistent is the small fort at RR=25%, and the p-value there is 1.8%. I’d call that troubling, but not convincing evidence that the point should be discarded. My data show a cap of 40.8%-44.6%, which is entirely consistent.
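The bookkeeping for dropping the RR=100% test is simple enough to spell out. In this sketch the size of the removed test (166 forts in 500 trials) is not quoted directly; it is inferred by subtracting the two totals, and it does reproduce the 33.2% figure.

```python
# Pooled totals quoted in the post, before and after removing the RR=100% test.
all_k, all_n = 5166, 12600
kept_k, kept_n = 5000, 12100

# Inferred by subtraction (an assumption, not stated in the thread).
removed_k = all_k - kept_k
removed_n = all_n - kept_n

pf_all = 100.0 * all_k / all_n
pf_removed = 100.0 * removed_k / removed_n
pf_kept = 100.0 * kept_k / kept_n

print(f"all trials:     {all_k}/{all_n} = {pf_all:.2f}%")
print(f"RR=100% test:   {removed_k}/{removed_n} = {pf_removed:.1f}%")
print(f"after removal:  {kept_k}/{kept_n} = {pf_kept:.2f}%")
```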

Why this number should be around 41.3% is a mystery to me. 5/12 is 41.7% and 7/17 is 41.2%, but I see no reason why these would be used.

I also tried to fit all of your data for the small fort simultaneously. The independent parameters are the slope (starts at 2), threshold (starts at 4.5%), and cap (starts at 41.3%, the mean of the data from 25%-40%). The best fit gives a slope of 2.018, a threshold of 4.70%, and a cap of 41.0%. The chi squared here is 30.5 with 21 degrees of freedom. This gives a p-value of 8.3%, which is quite low, but doesn’t really show any inconsistencies. The low probability is due to the expected three points, 17% RR, 20% RR and 35% RR. Moving the threshold has made the 5% RR data point fully consistent with the fit.

Because this seems to show up some problems, I tried to do the fit without the points from 30% RR and higher. We don’t expect to hit the cap at 25% RR. The cap was fixed at 41.31%. The best fit was a slope of 2.0182 and a threshold of 4.69%. The chi squared for this fit was 22.1 and so the p-value was 39.5%. Based on this fit the 95% CL range for the slope is 1.976-2.060 and the 95% CL range for the threshold is 4.54%-4.84%.

All in all, everything is basically consistent with the model (Pf=2*(RR-4.5%) with a cap at 41%). However, there is limited evidence that the threshold may be a bit higher than 4.5%, and a suggestion that the cap may be a bit higher than 41%.
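Written out as code, the model is a straight line clipped at zero below the threshold and at the cap above it. A minimal sketch, using the rounded parameters from the summary as defaults (slope 2, threshold 4.5%, cap 41%); the function name is made up and the true in-game formula is still a guess.

```python
def predicted_pf(rr, slope=2.0, threshold=4.5, cap=41.0):
    """Chance (in %) that the fort falls, for revolt risk rr (in %).
    Zero below the threshold, linear above it, flat once the cap is reached."""
    return min(cap, max(0.0, slope * (rr - threshold)))

# RR at which the line reaches the cap with these parameters.
cap_rr = 4.5 + 41.0 / 2.0

for rr in (3, 5, 10, 17, 20, 25, 30, 40):
    print(f"RR {rr:2d}% -> Pf {predicted_pf(rr):4.1f}%")
print(f"cap reached at about {cap_rr:.1f}% RR")
```

With these rounded values the line meets the cap at about 25% RR; a slightly higher cap or threshold pushes that point up a little.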
 
Oct 27, 2002
1.075
0
Visit site
I just found this table sitting on my laptop and I remembered I promised to do this test.

The table for lvl6 forts.

Code:
10	0	0	0	0	0	0	0	0	0.00
13	0	0	0	0	0	0	0	0	0.00
14	0	0	0	0	0	0	0	0	0.00
15	0	1	2	1	3	1	2	0	0.83
16	1	1	2	7	1	5	6	2	2.08
17	5	7	9	12	9	5	9	6	5.17
18	14	7	9	13	12	13	13	11	7.67
19	11	15	11	11	12	13	15	11	8.25
20	22	15	11	19	16	21	23	23	12.50
21	25	18	22	20	20	22	15	18	13.33
22	21	20	14	22	32	17	23	18	13.92
23	21	22	37	22	20	22	25	23	16.00
24	24	32	27	22	26	27	30	29	18.08
25	30	29	28	33	24	34	28	22	19.00
26	32	32	41	23	31	36	35	40	22.50
27	26	38	38	31	47	39	37	45	25.08
28	43	42	35	47	46	45	48	55	30.08
29	43	51	44	48	44	45	43	37	29.58
30	41	52	48	51	52	36	57	54	32.58
31	51	44	44	45	50	45	53	51	31.92
32	57	61	48	45	47	51	52	59	35.00
33	50	56	59	55	58	63	52	47	36.67
34	55	56	52	52	52	64	47	52	35.83
35	66	59	59	48	71	61	69	63	41.33
40	56	53	65	66	68	68	61	59	41.33
45	56	55	61	60	64	55	64	57	39.33
50	63	69	59	64	58	67	67	78	43.75
The first column is RR%. The following eight are the number of forts that fell in 150 revolts. The last one is the Pf (in %) estimated from the data.

Linear fit in the range 15-34% RR gives:
Code:
Pf = 1.98*RR - 28.65;	R^2=0.9871
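To compare this with the level 2 result it helps to put both fits in the Pf = slope*(RR - threshold) form. A small sketch using only the rounded fit coefficients quoted in the thread; reading the level 6 result as roughly Pf = 2*(RR - 14.5) is my inference from the arithmetic, not something stated above.

```python
# (slope, intercept) of the linear fits quoted for each fort level.
fits = {
    2: (1.98, -9.06),
    6: (1.98, -28.65),
}

thresholds = {}
for level, (slope, intercept) in fits.items():
    # Pf = slope*RR + intercept = slope*(RR - threshold)
    thresholds[level] = -intercept / slope
    print(f"level {level}: Pf = {slope:.2f}*(RR - {thresholds[level]:.2f})")
```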

I hope Isaac Brock still has his spreadsheet ready and is willing to do his thing.
I think it's time for the final formula including RR and fort level...
Isaac?

I have my data ready to test if the terrain has any influence on Pf, but I need to write a small program to do that for me.