Anatomy of a Game: Simulation Optimization

blackninja9939

Experienced Programmer - Crusader Kings 3
Hello everyone and welcome back to Anatomy of a Game! I’m Matthew, a programmer on Crusader Kings 3, and today I am going to be talking about some optimisations I’ve done for 1.5, to put some of the higher level examples we’ve shown off in the previous two posts in a more concrete context.

I’ve cherry picked these examples to show off a variety of different optimisation techniques and areas to apply them to. None of them alone is a groundbreaking thing that speeds up everything, and often fixing one slowdown will reveal more anyway, but every little helps. We’re already quite fast in the simulation and have most of the “easy” things done, but these should still provide some nice improvements both in early and late game.

Measure Measure Measure

Doing optimisations means nothing if you are not profiling and measuring your code. You cannot just eyeball it and hope your new code is faster; you have to get concrete numbers to prove it.
At Paradox we use a variety of tools for this, many of them common across the industry, either as third party applications or as things integrated into our engine.

For lower level, more isolated cases we’ve got an integrated version of Google Benchmark in Clausewitz, which is excellent for measuring the impact of changes to very isolated pieces of code. It is usually not very useful for our games since we rarely have easily isolatable chunks, but it is something I make frequent use of when optimising our engine’s more foundational types like our array and hashmap classes or our string and text utilities.

For wider application level profiling we use more general profilers. They usually come in two types, instrumented and sampling. Instrumented profiling requires you to manually decorate your code with macros that instruct the profiler to measure a given section. A sampling profiler periodically queries the entire application to see what is happening.

For instrumented profiling we make use of an integrated version of Optick where we decorate some of our code and can get results based on varying categories and across different threads.

1_Instrumental_Profiling.png

[Function with profiling instrumentation]

Here you can see that in our in-game interface, when we update the open windows and view, we profile the entire function as well as specific named subsections.

For sampling profilers, the common ones we use on CK3 are Very Sleepy and Intel’s VTune. You attach them to your process and run them for some amount of time, then they compile some results for you. Here is an example output from Very Sleepy from a recent run of my debug build.
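To make the idea of decorating code with profiling macros concrete, here is a minimal sketch of a scope timer. It is not Optick's API: ScopedTimer, PROFILE_SCOPE, ProfileLog and UpdateInterface are all hypothetical names made up for illustration, and the "backend" is just a vector that is not thread safe.

```cpp
#include <chrono>
#include <string>
#include <vector>

// Hypothetical stand-in for a profiler backend: it only collects samples.
struct ProfileSample {
    std::string name;
    std::chrono::nanoseconds elapsed;
};

inline std::vector<ProfileSample>& ProfileLog() {
    static std::vector<ProfileSample> log;  // not thread safe, sketch only
    return log;
}

// RAII timer: starts measuring on construction, records on destruction.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        ProfileLog().push_back(
            {name_, std::chrono::duration_cast<std::chrono::nanoseconds>(end - start_)});
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// Gives each timer a unique variable name so several can share one function.
#define PROFILE_CONCAT_INNER(a, b) a##b
#define PROFILE_CONCAT(a, b) PROFILE_CONCAT_INNER(a, b)
#define PROFILE_SCOPE(name) ScopedTimer PROFILE_CONCAT(profileTimer_, __LINE__)(name)

// Hypothetical function: the whole body is measured, plus two named subsections.
void UpdateInterface() {
    PROFILE_SCOPE("UpdateInterface");
    {
        PROFILE_SCOPE("UpdateInterface/Windows");
        // ... update the open windows ...
    }
    {
        PROFILE_SCOPE("UpdateInterface/View");
        // ... update the current view ...
    }
}
```

Because the timers record when they go out of scope, the subsections show up in the log before the enclosing function does.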

2_Sampling_Profiling.png

[Part of the daily tick serial calls]

3_Sampling_Profling_Second.png


[A closer look at the council task’s functions and time]

To monitor trends over time we also have automated tests that run the game every night and output performance metrics to log files, which the test system then turns into a nice graph for us. These tests focus more specifically on our daily tick and render times as opposed to the application as a whole. Usually one sees the test showing a problem and then digs into it more specifically with a profiler.

These tools all help us to identify bottlenecks and then validate if we’ve solved them or not.

Script System

One of the slowest operations you can do in the game is going into the script system, especially if you do it frequently and make use of more complex effects launching on actions and events which do even more logic.

This is a balancing act we are constantly dealing with, because of course we want our Content Designers (and modding community) to be able to change loads of the systems in our game and fill it with great content. But exposing too much or in inefficient ways can really hurt us in terms of speed.

In 1.5 I spent some time optimising some systems that were handled heavily in script but were very isolated and could easily be moved into a more performant code system, which for some issues entirely eliminates them from the profiler.

Character weight

One thing that was a small chunk of the time, but showed up far more than it proportionally should have, was how much time we spent applying and removing the modifiers for when characters gained and lost weight and were considered obese or malnourished. This was a prime example of the slowness of the script system compared to code.

Every couple of years we’d update the weight of a character and trigger an on action that their weight changed. From there we’d trigger events to add or remove the obese and malnourished modifiers based on two constant values we’d compare against. A very simple operation, but because of how many characters we have, and all of them going into the script system, it was actually showing up as a large part of a character’s yearly update.

My solution was to simply move this into code: the events and on action are now removed, and instead there are two defines for the thresholds of gaining and losing the modifiers. As an upside, by doing it in code we now always correctly update and apply the modifiers automatically when the change_current_weight effect is used, so it feels more responsive to special events.

4_Weight_Defines.png

[New weight defines]
5_Weight_Apply.png

[Weight modifiers being added based on current weight]
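As a rough illustration of what "two thresholds checked in code" can look like, here is a minimal sketch. The type names, the function names and the threshold values are all hypothetical and are not the game's actual defines or code.

```cpp
// Hypothetical modifier states a character's weight can put them in.
enum class WeightModifier { None, Malnourished, Obese };

// Hypothetical stand-ins for the two defines; the values are made up.
struct WeightDefines {
    int obeseThreshold = 50;
    int malnourishedThreshold = -50;
};

struct Character {
    int weight = 0;
    WeightModifier modifier = WeightModifier::None;
};

// Pure comparison against the two thresholds, no script system involved.
inline WeightModifier ModifierFor(int weight, const WeightDefines& defines) {
    if (weight >= defines.obeseThreshold) return WeightModifier::Obese;
    if (weight <= defines.malnourishedThreshold) return WeightModifier::Malnourished;
    return WeightModifier::None;
}

// Every weight change goes through here, so the modifier can never be stale.
// Returns true only when the modifier actually changed.
inline bool ChangeCurrentWeight(Character& character, int delta, const WeightDefines& defines) {
    character.weight += delta;
    const WeightModifier wanted = ModifierFor(character.weight, defines);
    if (wanted == character.modifier) return false;
    character.modifier = wanted;
    return true;
}
```

Routing every change through one function is what gives the "always up to date" behaviour described above, whether the change comes from the periodic update or from a special event.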

Combat phase events

Something that was in script but required a larger rework was our combat events. Every few days during a combat in the main phase we would run events for the commander and all the knights on both sides of the combat. These would then sometimes pick weighted random events to have, such as a knight being wounded or your commander dying.

However these too went through the on action system and into the event script, so they had to be done in serial. As mentioned in Meneth’s previous post on the setup of our game state, that is not what we want, especially for something like this where the heavy part is computing which outcomes to have, not applying the effects of the outcome.

So I made a bespoke system for this which eliminates a large amount of its overhead: previously the combat manager took 12.5% of our serial tick update and now it’s down to under 5%.

This bespoke system is a new common/combat_phase_events database, which scripts the different entries containing their triggers, weight to happen, and effects if picked. Then in the combat manager’s threaded pre-update we can evaluate, in parallel, the triggers and relative chances of these happening for all the involved characters and queue them up to be executed in the main update. And if nothing is picked, which is a very common case for combats (we do not do something every single day as it’d be spammy), then we just do nothing for that character in that daily tick and skip them entirely, saving us even more work.
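The split between "decide what happens" and "apply what happens" can be sketched roughly as below. Everything here is an assumption for illustration: the struct and function names, the idea of passing in one pre-generated roll per character, and the "nothing happens" weight. The sketch runs the pick loop on one thread for simplicity; the point is that PickEvent only reads data, so that loop is the part that could be spread across threads.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct Character {
    std::string name;
    int prowess = 0;
    bool wounded = false;
};

// Hypothetical shape of one database entry: trigger, weight, effect.
struct CombatPhaseEvent {
    std::string key;
    std::function<bool(const Character&)> trigger;
    double weight;
    std::function<void(Character&)> effect;
};

struct QueuedEvent {
    Character* target;
    const CombatPhaseEvent* event;
};

// Read-only weighted pick. `roll` is in [0, 1). Returns nullptr when the roll
// lands in the "nothing happens" slice or no trigger passes.
const CombatPhaseEvent* PickEvent(const Character& character,
                                  const std::vector<CombatPhaseEvent>& events,
                                  double nothingWeight, double roll) {
    std::vector<const CombatPhaseEvent*> eligible;
    double total = nothingWeight;
    for (const auto& event : events) {
        if (event.trigger(character)) {
            eligible.push_back(&event);
            total += event.weight;
        }
    }
    double cursor = roll * total;
    for (const CombatPhaseEvent* event : eligible) {
        if (cursor < event->weight) return event;
        cursor -= event->weight;
    }
    return nullptr;
}

// Pre-update: nothing is modified, characters with no pick are skipped.
std::vector<QueuedEvent> PreUpdate(std::vector<Character>& characters,
                                   const std::vector<CombatPhaseEvent>& events,
                                   double nothingWeight,
                                   const std::vector<double>& rolls) {
    std::vector<QueuedEvent> queue;
    for (std::size_t i = 0; i < characters.size(); ++i) {
        if (const auto* picked = PickEvent(characters[i], events, nothingWeight, rolls[i])) {
            queue.push_back({&characters[i], picked});
        }
    }
    return queue;
}

// Main (serial) update: only the queued effects are executed.
void MainUpdate(const std::vector<QueuedEvent>& queue) {
    for (const auto& queued : queue) queued.event->effect(*queued.target);
}
```

With a large "nothing happens" weight most characters never reach the queue at all, which is the cheap common case described above.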

Dynamic coat of arms

Another system handled heavily in script that I moved to a bespoke code system is our usage of dynamic coats of arms. For a variety of titles, primarily those in the British Isles and Scandinavia though also some specific historical cases like Norman vs Saxon England, we had special coats of arms that would switch in and out based on different conditions.

As another system in script, this was missing the ability to do quicker evaluation and application of changes. I added a new system in common/coat_of_arms/dynamic_definitions which allows scripting a method of picking different coats of arms for a title based on triggers.

To call an update you just need to run the update_dynamic_coa = yes effect on a title when you change something that has a good chance of needing an update. The advantage of this approach is that for the majority of titles we know they have no special coat of arms, so we can early out very quickly and do nothing. For titles that do, we can check the conditions in code, which is quicker than trying to do so from other pieces of script, and then apply the changes. We then only trigger the UI refresh if we actually changed to something else, instead of triggering it too often.

6_Dynamic_CoA.png

[England's dynamic coat of arms definition]

Here is the definition for England of whether it should switch to the French or Norman patterns. If it should go to neither then it uses the default coat of arms for the title, which is the Saxon variant.
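The early-out and "only refresh when something changed" logic can be sketched like this. The types, field names and coat of arms keys are hypothetical; only the overall flow follows the description above.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

struct Title {
    std::string key;
    std::string holderCulture;  // hypothetical condition input
    std::string defaultCoa;
    std::string currentCoa;
};

// One scripted alternative: use `coa` when `trigger` passes.
struct DynamicCoaEntry {
    std::function<bool(const Title&)> trigger;
    std::string coa;
};

// Only titles with special coats of arms appear in this map.
using DynamicDefinitions = std::unordered_map<std::string, std::vector<DynamicCoaEntry>>;

// Returns true only if the coat of arms actually changed, so the caller
// knows whether a UI refresh is needed.
bool UpdateDynamicCoa(Title& title, const DynamicDefinitions& definitions) {
    const auto found = definitions.find(title.key);
    if (found == definitions.end()) return false;  // early out: nothing dynamic here

    const std::string* chosen = &title.defaultCoa;  // fallback when no trigger passes
    for (const auto& entry : found->second) {
        if (entry.trigger(title)) {
            chosen = &entry.coa;
            break;  // first matching entry wins in this sketch
        }
    }

    if (*chosen == title.currentCoa) return false;  // unchanged: skip the refresh
    title.currentCoa = *chosen;
    return true;
}
```

Most titles take the first return, which is a single hash lookup.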

Cache Misses

One of the lower level types of performance problems you can get is cache misses. Explaining this requires a brief lesson in how a computer’s memory works (don’t worry, I’ll aim to keep it straightforward!)

The basic idea is that in modern computers there are different types of memory, smaller quicker ones and larger slower ones. The bigger slow one is your main memory: it can contain a lot of information, but fetching data from it is hugely slow.
So to save you from that, the CPU will fetch data from main memory and put it into increasingly faster yet smaller caches, called things like the L1, L2 or L3 cache with 1 being the smallest but fastest, so you can access it quickly.
The smallest things your processor uses are its registers, which contain truly small values (just 32/64 bits of data) but are super fast to access for temporary storage in functions.

When you access a piece of data that is not in your cache, that is a cache miss. Your CPU will fetch the data for you, and try to smartly get surrounding data with it as chances are you want to use that too, and put it into the cache for you to use. This can be very slow if it needs to dig through all your caches and go to main memory.

7_Caches.png

[Image showing caches vs main memory I found off of google cause my photoshop skills are too bad to make this myself]

Why is this an issue? Well to put it into perspective things can be measured in CPU cycles, how long it takes your CPU to do one simple operation. Getting a register is basically nothing, going to L1 cache is 3 or 4 cycles, L2 is 20+, then going to something like main memory is 200+ cycles.
In comparison to actual CPU operations: something like adding two values is 1 cycle, an operation like sqrt which is costly and often avoided where possible is only around 15 cycles, your sin/cos/tan operations are often similar in terms of a main memory read which for an operation is a lifetime.

To put it into an analogy, imagine you are baking a chocolate brownie and you need to get a piece of chocolate. A register is you having a piece of chocolate in your hand, L1 cache would be a piece of chocolate on your kitchen counter, L2 cache would be a chocolate bar on your counter you need to unwrap and break a piece off, L3 cache would be walking over to your cupboard to find a chocolate bar to use, and going to main memory would be you leaving and going to the store to buy some chocolate.

(Can you tell I was kinda hungry for chocolate whilst writing this post?)

So cache misses can hurt you, and I actually noticed that as part of our character manager pre-update we were having some piece of fairly benign code show up as really really slow despite being a simple operation.
Every character can have scripted variables, which are something the script system can manipulate for their events. In the pre-update we would build a list of all characters with active variables so that in the serial update we could tick them and, if they timed out, remove them.

Getting to those scripted variables however involved us doing a lot of pointer chasing. A pointer is just something that points to another location in memory that you can get data from, and often it’s a location somewhere very far away from where you already are. So as we just learnt, that is likely a cache miss, as you’ve got to bring the data it points at into cache before working on it.

We were following a chain of 5 pointers for that, whoops...
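To get a feel for the scale, here is a back-of-the-envelope calculation using the round cycle counts quoted earlier. It assumes the worst case where every hop in the chain misses all the way to main memory. The hops cannot overlap, because each address is only known once the previous load has finished. The function name is made up for illustration.

```cpp
// Worst case cost of a dependent pointer chain: every hop must wait for
// the previous load to finish before it even knows which address to read.
constexpr int WorstCaseChainCycles(int pointerHops, int cyclesPerMiss) {
    return pointerHops * cyclesPerMiss;
}

// Five hops at roughly 200 cycles each, per the figures quoted in the text.
static_assert(WorstCaseChainCycles(5, 200) == 1000,
              "five main memory misses cost about a thousand cycles");

// The same five loads served from L1 at roughly 4 cycles each.
static_assert(WorstCaseChainCycles(5, 4) == 20,
              "five L1 hits cost about twenty cycles");
```

So under these assumptions a "simple" lookup could cost on the order of a thousand cycles per character before any real work is done.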

The solution to this was to pull out the scripted variable handling for all game objects into a new manager. This manager has the data directly, so in its pre-update of determining which to update we get far fewer cache misses, and we also balance the load better since the character manager no longer has to handle it and the two can be done at the same time. As an extra win it also applies this smarter skipping of empty variable storage to all objects, instead of just the characters manually doing it. The game objects now simply have a lightweight stable handle to their variables for when they need to look them up in script execution and evaluation.

8_Variable_Manager.png

[The new script variable manager]

For the serial update we could then also take advantage of the fact that all the variables are independent of each other, and do this part of the serial update heavily threaded as well. With a handy utility that lets a class opt into storing its data in and loading it from saves, moving all our variables into one unified and more performant system was great.
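A very reduced sketch of the manager idea is below. The class, its methods and the "days left" expiry rule are assumptions for illustration and not the actual implementation. The key points are that the manager owns the variable storage itself, game objects only hold a small index-like handle, and the pre-update walks one array instead of chasing pointers out of every object.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct ScriptVariable {
    std::string name;
    double value;
    int daysLeft;  // negative means "never expires" in this sketch
};

// What a game object stores instead of the variables themselves.
using VariableHandle = std::uint32_t;

class VariableManager {
public:
    // Handles are plain indices; storages are never removed in this sketch,
    // so a handle stays valid for the manager's lifetime.
    VariableHandle CreateStorage() {
        storages_.emplace_back();
        return static_cast<VariableHandle>(storages_.size() - 1);
    }

    void Set(VariableHandle handle, std::string name, double value, int daysLeft) {
        storages_[handle].push_back({std::move(name), value, daysLeft});
    }

    const ScriptVariable* Find(VariableHandle handle, const std::string& name) const {
        for (const auto& variable : storages_[handle]) {
            if (variable.name == name) return &variable;
        }
        return nullptr;
    }

    // Pre-update: one linear walk, empty storages are skipped entirely.
    std::vector<VariableHandle> CollectActive() const {
        std::vector<VariableHandle> active;
        for (std::size_t i = 0; i < storages_.size(); ++i) {
            if (!storages_[i].empty()) active.push_back(static_cast<VariableHandle>(i));
        }
        return active;
    }

    // Update: each storage is independent of the others, so this loop is the
    // part that could be split across threads.
    void Tick(const std::vector<VariableHandle>& active) {
        for (const VariableHandle handle : active) {
            auto& variables = storages_[handle];
            for (auto& variable : variables) {
                if (variable.daysLeft > 0) --variable.daysLeft;
            }
            variables.erase(std::remove_if(variables.begin(), variables.end(),
                                           [](const ScriptVariable& v) { return v.daysLeft == 0; }),
                            variables.end());
        }
    }

private:
    std::vector<std::vector<ScriptVariable>> storages_;
};
```

A vector of vectors is not perfectly contiguous, but the check for "does this object have anything to tick" now touches one array owned by the manager rather than a chain of objects.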

Overall it took all of our serial variable updating down from 5.5% of the serial tick to 1.4%, the new pre-update was negligible in cost and it removed this slow down entirely from the character manager’s pre-update. So a nice win overall and demonstrates well that you’ve gotta keep in mind the actual hardware you work on and its limitations as well as the software. We're not writing code and games for some abstract machine despite what the C++ specification might claim we are doing, we are writing it for actual hardware sets which come with their own set of constraints and data to work with.

The Bare Necessities

It might sound a bit obvious, but the less code the computer has to execute, and the simpler it is, the quicker it is at doing so. One area we’ve applied this to a lot in recent years is our usage of text. As anyone who is familiar with our game’s script files knows, we’ve got a lot of text… like no really, a LOT of text files.

And two of the problems with text read in from files are that it can be really really long, and that it can be meant to represent things other than a real word. Alas, things like integers, floating points and booleans exist, and in code we want to just store the real value 42, not the text string “42”.

Most anything dynamic in length requires an unpredictable amount of memory, and we have to ask the operating system to give us that amount of memory dynamically at run time and when we’re done we give it back. Doing this is much slower compared to knowing what we want up front or using local registers and stack memory to store such things. And text can be unpredictably long especially from files of unknown length, so we always need to have this dynamic memory… or do we?

Turns out that we use a lot of the same key words all the time, and having duplicates of them in memory is often not needed. So something we make fairly regular use of to reduce string memory usage is a global lookup table. It will store one instance of the string and give us some numeric identifier we can use to get the text again whenever we need it. As it’s a global resource it is guarded by a mutex, but it uses a smarter read-write mutex because the majority of the time we are reading text from it rather than adding a new one, and multiple readers can read it at the same time.

This helps to save on unnecessary copies of text when storing things. Another utility for this is called a string view: a super small type which just contains a read only view of the text and a size. With that we can pass around lightweight views of full strings without needing to make copies of the dynamic text every single time. We can read from it just fine but we cannot modify it, which is often exactly what we want as we’ve got no reason to modify text we read in from a file. We can also alter the view we have, such as splitting it on some delimiter. If you’ve seen our script for chaining between different objects you’ll see we use this and split on "." in something like "root.father.mother".

Now of course, as I said at the beginning of this post, one must always measure these things. Since strings vs views is a very isolated piece of code, I could make use of the previously mentioned Google Benchmark to make sure that this theory holds up in practice.
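A lookup table like this is often called string interning. Below is a minimal sketch of the idea using standard C++17 types (std::shared_mutex, std::string_view). The engine uses its own types and an older language standard, so treat every name here as hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <string_view>
#include <unordered_map>

class StringTable {
public:
    using Id = std::uint32_t;

    // Returns the id for `text`, adding it only if it is not stored yet.
    Id Intern(std::string_view text) {
        {
            // Common case: the string already exists. Many readers may hold
            // the shared lock at the same time.
            std::shared_lock<std::shared_mutex> readLock(mutex_);
            const auto found = ids_.find(text);
            if (found != ids_.end()) return found->second;
        }
        // Rare case: take the exclusive lock and check again, since another
        // thread may have added the string between the two locks.
        std::unique_lock<std::shared_mutex> writeLock(mutex_);
        const auto found = ids_.find(text);
        if (found != ids_.end()) return found->second;

        const Id id = static_cast<Id>(strings_.size());
        strings_.emplace_back(text);
        // The map key is a view of the stored string, so the characters
        // exist only once. std::deque never moves existing elements on
        // push_back, so the view stays valid.
        ids_.emplace(std::string_view(strings_.back()), id);
        return id;
    }

    // The returned reference stays valid because strings are never removed.
    const std::string& Get(Id id) const {
        std::shared_lock<std::shared_mutex> readLock(mutex_);
        return strings_[id];
    }

    std::size_t Size() const {
        std::shared_lock<std::shared_mutex> readLock(mutex_);
        return strings_.size();
    }

private:
    mutable std::shared_mutex mutex_;
    std::deque<std::string> strings_;
    std::unordered_map<std::string_view, Id> ids_;
};
```

Game objects can then store a 4 byte Id instead of a full string and compare ids instead of comparing characters.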
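Splitting with views can be sketched with the standard C++17 std::string_view as below. The engine has its own view type, so the function here is only an illustration and its name is made up.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Splits `text` on `delimiter` without copying any characters. Every part is
// a view into the original text, which must outlive the returned views.
std::vector<std::string_view> Split(std::string_view text, char delimiter) {
    std::vector<std::string_view> parts;
    while (true) {
        const std::size_t position = text.find(delimiter);
        if (position == std::string_view::npos) {
            parts.push_back(text);  // last (or only) part
            break;
        }
        parts.push_back(text.substr(0, position));
        text.remove_prefix(position + 1);  // shrink the view, no reallocation of text
    }
    return parts;
}
```

Calling Split("root.father.mother", '.') gives three views: "root", "father" and "mother". The vector of parts still allocates, but none of the text itself is copied.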

9_String_Benchmark.png

[Functions to benchmark, notice one uses another string the other uses a view otherwise they are identical]
10_Benchmark_Results.png

[Timing output of this benchmark in nanoseconds]

As you can see, making a string view is much much much faster than making a full copy, despite how similar the code of the two looks. And in most use cases a view is all we need, so we’ve proliferated it throughout our code base a lot.

The other case I mentioned is converting from a string to the number it is trying to represent, which is neither a monumentally huge problem nor an unsolved one. Various coding languages and standard libraries have functions to convert from a string to an integer and back again. However they often do so in a non optimal manner: scanning text while taking into account different text locales, or, when converting to text, making full strings that as we’ve just seen are comparatively slow and have complicated instructions under the hood.

C++17 introduced the idea of a set of minimal overhead functions to do these conversions that would do no dynamic memory usage and be as simple as possible in their conversion for plain ASCII text. Now we don’t use C++17 yet (soon though I hope) due to some third party dependencies that have not updated and that we’ve not replaced. But nothing in this C++17 code required any features that C++14 does not provide, so I took a round at implementing it (with solid inspiration from various C++17 implementations) and also tied it into some of our custom numeric types.

Of course I also unit tested its various success and failure cases:
11_From_Chars.png

[Extract from unit test code handling the string conversion odd cases and how they should fail]

Aand thankfully the thing everyone wants to see pops up for me when running these tests, making me happy that my code isn't total trash (just somewhat trash in some places).
12_Unit_Test_Pass.png


This proved to be much better at crunching numbers from our text files compared to what we had used before, not as big a win as other startup optimisations since too much in serial is more of a blocker than some allocation and parsing slowness, but getting it out of the way made other issues with our text parsing easier to notice since it was now not bloated up by our general string usage here.
I also wrapped it into a more convenient interface of taking in/returning one of our aforementioned string views since the actual to/from_chars implementation is more of a low level building block and most of the time when consuming it we don’t care why it failed just that it gave us a number or not as an optional value.
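The "view in, optional number out" wrapper can be sketched with the standard C++17 facilities (std::from_chars, std::string_view, std::optional). The post describes a custom C++14 implementation, so this is only an approximation of the interface, and ParseInt is a made-up name.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Converts the whole of `text` to an int, or returns nothing on any failure.
// std::from_chars does not allocate, ignores locales, and accepts neither
// leading whitespace nor a leading '+'.
std::optional<int> ParseInt(std::string_view text) {
    int value = 0;
    const char* const first = text.data();
    const char* const last = text.data() + text.size();
    const std::from_chars_result result = std::from_chars(first, last, value);

    // Fail on invalid input or overflow, and also when only a prefix of the
    // text was a number (for example "42abc").
    if (result.ec != std::errc{} || result.ptr != last) return std::nullopt;
    return value;
}
```

The caller only has to check whether the optional holds a value, which matches the "we don't care why it failed" use case described above.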

And that is all for this week folks! Next week I will be digging into the script system as a whole and how that works to let our Content Designers and modding community make all the cool stuff they do which fills our game up with fun things!
 

Ajadaz

Another great entry in this series. I don't work on a lot of C++ projects, but I find them fascinating and really interesting. C++ is a big beast of a language and if you want to find performance bottlenecks you have to understand exactly when a variable is doing an expensive copy, when you should use references or move semantics instead.
 

blackninja9939

the game is already optimized i'd rather see genuine progress with new content rather than pointless optimization patches
Tell that to comments on other socials complaining that the game still isn’t fast enough.

Feel like you’re not posting this in good faith, but I’ll indulge you anyway to use this as a teaching moment.

We are also working on new content and features, which as I stated at the start will not be displayed in this series over summer. We’ve not spent a year just optimising the game, it is but a part of what we spend our time as programmers on.

Every new feature and piece of content adds a cost to performance, so we are always going to be optimising the game so that we try to stay the same or ideally improve on performance every patch regardless of how much we’re adding new features. Cause I guarantee you’d be a lot more upset if we release royal court with much worse performance and bugs than before than waiting a bit more to see a dev diary on it.
 

blackninja9939

Another great entry in this series. I don't work on a lot of C++ projects, but I find them fascinating and really interesting. C++ is a big beast of a language and if you want to find performance bottlenecks you have to understand exactly when a variable is doing an expensive copy, when you should use references or move semantics instead.
There are a few things there to keep track of yeah, there are a few good rules of thumb but as with all good rules also enough exceptions and special idioms to make it annoying!
 

Tiax

One thing I'm always surprised about when dealing with the script files is how verbose they are - they often include hundreds of lines of code to do something seemingly simple. (As an example, the events file for low-fervor heresy outbreaks manages to spend over a thousand lines of code on some really simple stuff). The amount of duplicated code in these files seems like it would have nasty implications for maintainability, but does it also have an impact on load times and script execution times?
 

fodazd

I remember that I did optimizations on some non-game related simulation code a looong time ago, and performance behaves in really strange ways a lot of the time. Sometimes, recalculating a value every time is a lot faster than having it cached somewhere, and sometimes it's the opposite. Sometimes, adding more threads makes it significantly *slower* than it used to be. I am definitely no expert when it comes to performance.

One thing that I remember actually speeding things up was this: instead of checking whether it was time to calculate a future event in each and every "round" of the simulation, there was a collection of future events sorted by the time they were supposed to occur. Once we had that, we only ever needed to check for the smallest time in that sorted collection, giving us effectively a factor n speedup on the number of known future events. I don't know to what degree that would be possible or effective in CK3 of course.
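For readers who have not seen the pattern, the sorted collection described here is commonly built as a priority queue. The sketch below is an assumption about how such a schedule could look and is not taken from any particular simulation; all names are made up.

```cpp
#include <queue>
#include <vector>

struct FutureEvent {
    int day;  // when the event is due
    int id;   // which event it is
};

// Orders the queue so that the event with the smallest day is on top.
struct LaterFirst {
    bool operator()(const FutureEvent& a, const FutureEvent& b) const {
        return a.day > b.day;
    }
};

class EventSchedule {
public:
    void Schedule(int day, int id) { queue_.push({day, id}); }

    // Each round only looks at the earliest event. When nothing is due,
    // the cost is one comparison, however many events are waiting.
    std::vector<int> PopDue(int today) {
        std::vector<int> due;
        while (!queue_.empty() && queue_.top().day <= today) {
            due.push_back(queue_.top().id);
            queue_.pop();
        }
        return due;
    }

private:
    std::priority_queue<FutureEvent, std::vector<FutureEvent>, LaterFirst> queue_;
};
```

Inserting an event costs a logarithmic number of steps, which is the price paid for the cheap per-round check.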

Back then, I also heard that working with strings at all is inherently very slow, and that I should therefore avoid it whenever possible. But it seems like there actually *are* some ways to make it fast. :)
 

Silversweeeper

Ichi no Hito
58 Badges
Aug 24, 2012
3.626
2.343
We are also working on new content and features, which as I stated at the start will not be displayed in this series over summer. We’ve not spent a year just optimising the game, it is but a part of what we spend our time as programmers on.

Every new feature and piece of content adds a cost to performance, so we are always going to be optimising the game so that we try to stay the same or ideally improve on performance every patch, regardless of how many new features we’re adding. Cause I guarantee you’d be a lot more upset if we released Royal Court with much worse performance and more bugs than before than you would be at waiting a bit longer to see a dev diary on it.

Considering some other PDS games have had issues due to being poorly optimised and/or performance improvements not keeping pace with new content, it's good that you're being proactive about it, and I'd imagine the Content Designers appreciate any performance that's freed up and that can be used for stuff they want to add.

Also, based on what has been mentioned at various times about different roles, I don't believe "new content" (at least in the sense it's presumably used) is on the Programmers' table (at least outside of creating necessary functionality) regardless of whether you'd be capable of it or not, so whatever you'd have been working on if you hadn't been working on optimisation would presumably not have contributed (directly) to "new content"...
 

blackninja9939

Experienced Programmer - Crusader Kings 3
77 Badges
Aug 28, 2013
2.366
6.056
One thing I'm always surprised about when dealing with the script files is how verbose they are - they often include hundreds of lines of code to do something seemingly simple. (As an example, the events file for low-fervor heresy outbreaks manages to spend over a thousand lines of code on some really simple stuff). The amount of duplicated code in these files seems like it would have nasty implications for maintainability, but does it also have an impact on load times and script execution times?
Really it's a problem of scale for the script. No single event by itself is a big issue; it's a slow event happening very often that is the problem in terms of performance. For example, a faith update could be slow, but there is a very finite number of faiths. A character update being slow or triggered often is a much larger problem. The dynamic coats of arms are an example there, as so many things could tie into them and they were slow already.

For load times it's also the scale of more and more script and files. No individual event is much of an issue since we load it all in parallel these days; it only becomes one when the sheer number of events outdoes the number of threads we've got to do the work.

As part of the startup optimisations I mentioned in the first post of the series, I also did some optimisations to scripted triggers and effects, which largely brought the event-specific load time down. Scripted effects and triggers function effectively like macros right now: they expand in place and literally paste all their contents. For 1.5 I've made them work a bit more like a function template, in that they are expanded for their arguments once and then everything else just holds onto a reference to that one instance to operate.

That cut down on the reading time of all events (and things using scripted effects and triggers in general, but events were the largest), as now we are not generating them in place every time we read one; we just note down the arguments and can then generate the unique versions once after loading everything. But events are still a decent part of the database loading since there are just so many of them everywhere.
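
To make the macro versus function template comparison a bit more concrete, here is a very simplified sketch of the "expand once per unique argument set, then share a reference" idea. Everything in it is hypothetical for illustration (`EffectLibrary`, `ExpandedEffect`, the single `$ARG$` marker, the effect name); it is not the game's actual code or script syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical result of expanding a scripted effect for one argument value.
struct ExpandedEffect {
    std::string body;  // template text with the argument substituted in
};

class EffectLibrary {
public:
    // Registers a template body; "$ARG$" marks where the single argument goes.
    void define(const std::string& name, const std::string& body) {
        templates_[name] = body;
    }

    // Returns the shared instance for (name, arg). The text substitution only
    // runs the first time a given pair is requested; later callers just get
    // another reference to the same object.
    std::shared_ptr<const ExpandedEffect> instantiate(const std::string& name,
                                                      const std::string& arg) {
        const auto key = std::make_pair(name, arg);
        const auto found = cache_.find(key);
        if (found != cache_.end()) {
            return found->second;
        }

        std::string body = templates_.at(name);  // throws if the name is unknown
        const std::string marker = "$ARG$";
        for (std::size_t pos = body.find(marker); pos != std::string::npos;
             pos = body.find(marker, pos + arg.size())) {
            body.replace(pos, marker.size(), arg);
        }

        ++expansions_;
        std::shared_ptr<const ExpandedEffect> made =
            std::make_shared<ExpandedEffect>(ExpandedEffect{std::move(body)});
        cache_.emplace(key, made);
        return made;
    }

    std::size_t expansion_count() const { return expansions_; }

private:
    std::map<std::string, std::string> templates_;
    std::map<std::pair<std::string, std::string>,
             std::shared_ptr<const ExpandedEffect>> cache_;
    std::size_t expansions_ = 0;
};
```

With the macro approach, a hundred events calling the same effect with the same argument would each get their own pasted copy. Here they would all hold the same pointer and the substitution would run once.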

I remember that I did optimizations on some non-game related simulation code a looong time ago, and performance behaves in really strange ways a lot of the time. Sometimes, recalculating a value every time is a lot faster than having it cached somewhere, and sometimes it's the opposite. Sometimes, adding more threads makes it significantly *slower* than it used to be. I am definitely no expert when it comes to performance.

One thing that I remember actually speeding things up was that instead of checking whether it was time to calculate a future event in each and every "round" of the simulation, there was a collection of future events sorted by the time they were supposed to occur. Once we had that, we only ever needed to check the smallest time in that sorted collection, giving us effectively a factor-n speedup in the number of known future events. I don't know to what degree that would be possible or effective in CK3, of course.

Back then, I also heard that working with strings at all is inherently very slow, and that I should therefore avoid it whenever possible. But it seems like there actually *are* some ways to make it fast. :)
Adding more threads and getting slower is always fun. That is what a lot of the startup optimisations eventually ran into: threads blocking each other because they were trying to lock the same mutually exclusive resources. Also, for a small enough set of work the overhead of creating and using threads can sometimes remove the benefit entirely.

Calculating future rounds is pretty much impossible to do effectively due to the sheer number of player and AI interactions that can change the whole game state. We could calculate more moves ahead, but we'd be discarding them so often and storing a lot of extra data. It'd be nice if we could work out a way there, and I doubt it's fully impossible, but I doubt we have a small enough scope of rules to do that.

Don't get me wrong, strings are still definitely very slow. It's just something that is unavoidable with text processing for input files and displaying text, so you try to get every little bit you can out of the system!

Considering some other PDS games have had issues due to being poorly optimised and/or performance improvements not keeping pace with new content, it's good that you're being proactive about it, and I'd imagine the Content Designers appreciate any performance that's freed up and that can be used for stuff they want to add.

Also, based on what has been mentioned at various times about different roles, I don't believe "new content" (at least in the sense it's presumably used) is on the Programmers' table (at least outside of creating necessary functionality) regardless of whether you'd be capable of it or not, so whatever you'd have been working on if you hadn't been working on optimisation would presumably not have contributed (directly) to "new content"...
I was using content in the loose sense of just "new stuff" added to the game, as opposed to specifically us adding events or anything, as yeah, we don't do that, you are correct!
Though when editing these databases and making new ones I was editing plenty of text files and doing some scripting myself, since there is no real need to delay that as a separate task for a Content Designer unless it's gonna be too big or complex. I also know how to script mechanically, even if my flavour writing skills have gotten a bit (read: very) rusty from back when I used to be a Content Designer.
 

Dormouse

Recruit
48 Badges
Feb 18, 2017
5
4
This update was great! Thank you so much. =)

I really appreciate the peek under the hood and the explanations of how things work and how you are going about optimizing the code and boosting performance.

Thank you for all the work you folk do. Can't wait to see the results of it.
 

Sebastián Starck

Recruit
39 Badges
May 25, 2015
6
2
I'm loving those in-depth posts, even if I'm a pleb who doesn't work with the C family!

Do you measure performance for different specs? I've personally experienced a big difference in in-game time calculation between my notebook and my PC.

Not related to the post per se, but, how did you move from Content Designer to Programmer?
Do CD folks script all the content they add?
 

blackninja9939

Experienced Programmer - Crusader Kings 3
77 Badges
Aug 28, 2013
2.366
6.056
Great context! A small question though: why is "+5" illegal as a number? Shouldn't it return 5?
You can blame the C++ standards committee for that one. I’d guess it’s because positive is the default, so specifying it for a value is pointless, especially as you’d then need to handle the sign correctly for both signed and unsigned integer types. For the latter it would be weird to allow a + when the value can’t be negative anyway.
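
As a small illustration, assuming the parsing function in question is `std::from_chars` (which, for integers, accepts a leading minus sign but not a leading plus), this is roughly how it behaves. `parse_int` and `parse_int_allow_plus` are hypothetical helper names for this sketch, and the lenient wrapper is just one way a caller could opt in to accepting "+5".

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the whole of `text` as a base-10 int.
// Returns std::nullopt if parsing fails or if any characters are left over.
std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;  // includes "+5": the plus sign is not recognised
    }
    return value;
}

// Hypothetical lenient wrapper: strips one leading '+' before parsing.
// The extra check stops "+-5" from being accepted as -5.
std::optional<int> parse_int_allow_plus(std::string_view text) {
    if (text.size() > 1 && text.front() == '+' && text[1] != '-') {
        text.remove_prefix(1);
    }
    return parse_int(text);
}
```

So `parse_int("5")` and `parse_int("-5")` succeed, `parse_int("+5")` comes back empty, and anyone who wants "+5" to mean 5 has to handle the sign themselves, as in the wrapper.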

I'm loving those in-depth posts, even if I'm a pleb who doesn't work with the C family!

Do you measure performance for different specs? I've personally experienced a big difference in in-game time calculation between my notebook and my PC.

Not related to the post per se, but, how did you move from Content Designer to Programmer?
Do CD folks script all the content they add?
We do run benchmarks on different setups, yes. Gotta try to make sure our min and recommended specs stay true, of course!

CDs do script all the content they add, yeah. When I was a Content Designer, before I moved to CK3, I always found myself enjoying the technical aspect of scripting a lot, so I learned how to code in C++ as I knew that’s what we used. I then took my shot at implementing pieces of new scripting functionality and even some minor features, all with the support of other programmers and my tech lead at the time, which was invaluable.
Eventually I found myself enjoying coding more than my Content Design job, so I worked with my manager to change roles. Then after a while I moved to CK3, and here I am now!
 

Meneth

Crusader Kings 3 Programmer
128 Badges
Feb 9, 2011
10.056
5.358
www.paradoxwikis.com
This is the first time defragmenting has ever made sense as to why it speeds up your computer.
Defragmenting specifically is only useful for HDDs. For SSDs it is recommended that you don't defragment, as all that will do is reduce the lifetime of the SSD.
The minimum fragment size is such that for an SSD it doesn't matter at all if the fragments are contiguous; it'll take the same time whether a file is 1 piece or 1000.

For a HDD, on the other hand, it can make a huge difference, as there's a read head physically moving across the tracks on the disk. If a file is one big piece, that's all just one long contiguous read without the read head having to jump around. If it is in multiple pieces, the read head needs to physically move from the end of one fragment to the start of the next. Heavy fragmentation can therefore result in significantly slower reads.

It definitely has a bit in common with cache misses, since as Matt mentions, when we grab a piece of memory we also get the surrounding memory. But the scale involved is very different. A cache line is generally 64 bytes, while most file systems won't even allow a fragment that small.
 