Anatomy of a Game: Changing the Gamestate


Meneth

Crusader Kings 3 Programmer
Good afternoon everyone! My name is Magne Skjæran, and I’m a senior programmer on Crusader Kings 3. I’m here to bring you the second entry in the Anatomy of a Game series. The first entry, by Matthew, covered our Startup and Loading.
Today’s topic will be how we change the gamestate. This is core to the whole simulation of the game, since if nothing changes there’s no real game to play.

I’ll be covering three main topics here. First I’ll talk about the command system, which is how all interaction with the game happens. Then about how we determine what to change vs. how we actually change it. Then finally, about what out of syncs are and how they occur.

What I cover here will all be based on CK3, but a lot of it applies to our other games as well. But there will be differences here and there; sometimes big ones! So don’t take any of this as gospel for our other games.

Command System

The core of our game is a simulation. It runs on its own even if no agent (the player or AI) makes any changes to it. But a simulation you can’t influence is just a toy rather than an actual game. This is where commands come in.

A command is a set of data on how to change the gamestate. A simple example would be a command to “queue movement of this unit to that province”, or “select this event option”.
All interaction happening via command also makes it easy for us to find everything the player can influence, which makes a variety of bugs easier to debug, and makes it easier for us to reason about how the game works.
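
To make "a command is a set of data" concrete, here is a minimal C++ sketch. It is not CK3's actual code: `GameState`, `Command`, `CommandType`, and `execute` are made-up names, and the gamestate is reduced to two small maps purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical, heavily simplified gamestate.
struct GameState {
    std::unordered_map<std::uint32_t, std::uint32_t> unit_province;  // unit id -> current province
    std::unordered_map<std::uint32_t, std::uint32_t> queued_move;    // unit id -> destination
};

enum class CommandType { MoveUnit, DisbandUnit };

// A command is plain data: easy to log, validate, and send over the network.
struct Command {
    CommandType type;
    std::uint32_t unit;
    std::uint32_t target_province;  // only meaningful for MoveUnit
};

// The single place where a command is turned into a gamestate change.
// Returns false if the command is not valid (here: the unit does not exist).
bool execute(GameState& state, const Command& command) {
    if (state.unit_province.count(command.unit) == 0) return false;
    switch (command.type) {
        case CommandType::MoveUnit:
            state.queued_move[command.unit] = command.target_province;
            return true;
        case CommandType::DisbandUnit:
            state.queued_move.erase(command.unit);
            state.unit_province.erase(command.unit);
            return true;
    }
    return false;
}
```

Because every change funnels through one function like `execute`, there is one obvious place to put logging or a breakpoint when tracking down what an agent did.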

Disband_Army_Command.png

[A command to disband an army]

The command system also forms the basis of multiplayer. Anything a player does is communicated to the other players’ machines by sending a command over the network. Forcing all interaction into the system therefore makes multiplayer Just Work™ in the vast majority of cases without us having to write any MP-specific code. When a programmer implements a new system, it is rare to have to think much at all about multiplayer (while the designer probably needs to give it some thought to make sure the feature is fun both in SP and MP).

The player’s interaction happens via the interface, unsurprisingly. The interface is a separate module from the actual game logic; it covers things like what to show, and how they’re interacted with. The game logic can’t see the existence of the interface at all in the code, which avoids a whole class of bugs where logic in some way depends on the interface, an issue that would occasionally happen in our older games. The interface is only able to read the gamestate, and this is enforced by the code systems we have. Commands are the only way for the interface to affect the gamestate.

Posting_Commands.png

[The code to send a command from the UI]

Since the gamestate cannot see the existence of the interface, it is hard for the logic to communicate with the interface. Naturally, this can pose a problem. Imagine something happens to the player and we want to send a notification about it, for instance if the player goes up a prestige level. Sure, the interface could store the player’s prestige level and then generate the notification when it changes, but this ends up duplicating a ton of state between the logic and the interface. So instead we have a system similar to commands for sending information from the logic to the interface, which we call “messages”. Like commands, these are specific pieces of information that the interface is to act upon in some manner. They get handled in the interface, so when the logic sends the message “player increased their prestige level”, the interface then takes care of actually showing that notification to the player.
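
The message idea can be sketched roughly as follows. This is an illustration under assumptions, not the real system: `MessageQueue`, `PrestigeLevelMessage`, `gain_prestige`, `build_notifications`, and the "1000 prestige per level" rule are all invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message sent from game logic to the interface.
struct PrestigeLevelMessage {
    int character_id;
    int new_level;
};

// The logic only knows about this queue, never about the interface itself.
class MessageQueue {
public:
    void post(PrestigeLevelMessage message) { pending_.push_back(message); }
    // Hands over all pending messages and leaves the queue empty.
    std::vector<PrestigeLevelMessage> drain() { return std::exchange(pending_, {}); }

private:
    std::vector<PrestigeLevelMessage> pending_;
};

// Game logic side. The level threshold (1000 per level) is an assumption.
void gain_prestige(int character_id, int& prestige, int amount, MessageQueue& out) {
    const int old_level = prestige / 1000;
    prestige += amount;
    const int new_level = prestige / 1000;
    if (new_level > old_level) out.post({character_id, new_level});
}

// Interface side: turns messages into notification text. It stores no copy
// of the prestige value, so no state is duplicated between the two modules.
std::vector<std::string> build_notifications(MessageQueue& queue) {
    std::vector<std::string> lines;
    for (const auto& message : queue.drain()) {
        lines.push_back("Character " + std::to_string(message.character_id) +
                        " reached prestige level " + std::to_string(message.new_level));
    }
    return lines;
}
```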

Now, that’s enough about the player. What about the AI? The AI plays by essentially the same rules. Anything that’s not happening via the simulation itself is done by command for the AI as well. Periodically, the AI considers the various actions it can take, and for each it decides to do it’ll send a command. Usually this is the exact same command a player would’ve sent; the player and AI will both use the same command for “move this unit to this province” for instance.
The AI and the player using the same system makes it easier for us to ensure the two play by the same rules. Even more importantly, it ensures that the same attempted action gives the same result, avoiding subtle bugs due to differences between how the AI and the player interact with the game.

AI_Posting_Command.png

[The AI sending a command]

There’s not that much more to cover about how the AI interacts with the game without going into far more detail of the AI systems themselves, which could easily be a dev diary of its own, so I’ll move on now.

Evaluation and Execution

All changes to the gamestate can be considered to have two main parts: deciding what to do (evaluation) and actually doing it (execution). In a large number of cases, the thing that takes the most time is to figure out what to do, not to actually do it. For instance, choosing which event to fire out of hundreds available in a yearly pulse takes longer than applying the event we decide upon.

Generally speaking, we can only execute one thing at a time (otherwise we get out of syncs; more on that later). We can however evaluate multiple things at the same time by using threads. So instead of each character individually in a row deciding what events to fire, we can consider 8 or more (depending on how many CPU threads the player has) characters’ events simultaneously. Each character then adds the events they’re going to fire to a queue. This part has to be synchronized as we can’t add two things to a list at the same time, but since the vast majority of the time is spent on the evaluation rather than adding to the list, we save huge amounts of time by distributing the work. Later we can then go through the queue and fire each event, removing any that are no longer applicable for whatever reason (maybe a character involved died?) along the way.
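
A small C++ sketch of that pattern, under stated assumptions: `Character`, `QueuedEvent`, `pick_event`, and the event effects are invented, and sorting the queue by character id before execution is one possible way to keep the execution order independent of thread timing, not necessarily what CK3 does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

struct Character { int id; int stress; bool alive; };
struct QueuedEvent { int character_id; int event_id; };

// Evaluation: read-only, so many characters can be evaluated at once.
// The rule itself is a made-up stand-in for expensive event selection.
int pick_event(const Character& c) { return c.stress > 50 ? 2 : 1; }

std::vector<QueuedEvent> evaluate_all(const std::vector<Character>& characters,
                                      unsigned thread_count) {
    if (thread_count == 0) thread_count = 1;
    std::vector<QueuedEvent> queue;
    std::mutex queue_mutex;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < characters.size(); i += thread_count) {
                const int event_id = pick_event(characters[i]);  // slow part, no lock held
                std::lock_guard<std::mutex> lock(queue_mutex);   // fast part, synchronized
                queue.push_back({characters[i].id, event_id});
            }
        });
    }
    for (auto& worker : workers) worker.join();
    // Assumption: fix the order so it does not depend on which thread ran first.
    std::sort(queue.begin(), queue.end(), [](const QueuedEvent& a, const QueuedEvent& b) {
        return a.character_id < b.character_id;
    });
    return queue;
}

// Execution: strictly one event at a time, skipping any that no longer apply.
int execute_all(std::vector<Character>& characters, const std::vector<QueuedEvent>& queue) {
    int fired = 0;
    for (const auto& event : queue) {
        auto it = std::find_if(characters.begin(), characters.end(),
                               [&](const Character& c) { return c.id == event.character_id; });
        if (it == characters.end() || !it->alive) continue;
        it->stress += (event.event_id == 2) ? -20 : 5;
        ++fired;
    }
    return fired;
}
```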

This split between evaluation and execution is one of the cornerstones of how we do threading on CK3. The gamestate is split up into various “managers” that are each responsible for one part of the game. For example there’s a Secrets Manager, an Event Manager, a Character Manager, and a Title Manager. The main part of how we progress the game a single day is split into two parts: the pre-update and the main update. In the pre-update, each manager does its own evaluations and makes notes of things to do later. No visible gamestate is allowed to change, so each manager can safely look at things they don’t manage (E.G., the title manager is allowed to look at the holder of a title, even though it doesn’t own characters). Managers can only change things that are invisible to the rest of the game (like that event queue mentioned earlier).
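
One way to picture the "don’t modify any visible state" rule in C++ is to hand the pre-update a `const` view of the gamestate. The sketch below assumes this approach; `GameState`, `TitleManager`, and the vacant-title rule are hypothetical and only illustrate the two-phase shape.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct GameState {
    std::map<int, int> title_holder;      // title id -> character id
    std::map<int, bool> character_alive;  // character id -> alive?
};

class TitleManager {
public:
    // Pre-update: may read any part of the gamestate (it is const here),
    // but may only write the manager's own private notes.
    void pre_update(const GameState& state) {
        vacant_titles_.clear();
        for (const auto& entry : state.title_holder) {
            const auto holder = state.character_alive.find(entry.second);
            if (holder == state.character_alive.end() || !holder->second) {
                vacant_titles_.push_back(entry.first);
            }
        }
    }

    // Main update: applies the notes to the visible gamestate.
    void update(GameState& state) {
        for (int title : vacant_titles_) state.title_holder.erase(title);
        vacant_titles_.clear();
    }

    std::size_t pending() const { return vacant_titles_.size(); }

private:
    std::vector<int> vacant_titles_;  // invisible to the rest of the game
};
```

Since `pre_update` cannot change anything another manager can see, several managers' pre-updates could run on different threads without affecting each other's results.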

Pre_Update_Managers.png

[Time spent in the various pre-update managers]

The split makes it very easy for us to thread things, as there’s only one rule to follow (don’t modify any visible state). The threading on our older games came with far more rules to obey (only look at your own data, don’t look at this thing, don’t modify this other thing, etc.), meaning that for experienced and new programmers alike it was easy to make mistakes. With only a single rule mistakes are harder to make, easier to catch, and easier to fix. As a result we’re more productive, and CK3 is our most threaded game to date.

The AI works very similarly. It’s run after the main update rather than before it, but works on the same principle. The AI is not allowed to change anything except certain pieces of purely AI-internal state, and instead just sends commands. The AI is split up into a variety of sub-tasks, grouped together by how frequently they run. E.G., an individual AI will check whether it should change its laws and whether it should leave a faction in the same task, as these happen at the same frequency. Each such grouping of tasks can happen simultaneously with any other grouping of tasks. The granularity of this means that the threading of the AI is very effective (known as “good load-balancing”) as one thread is unlikely to finish its work significantly earlier than another thread (which would leave it idling).
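
As a rough illustration of grouping tasks by frequency, the following sketch buckets tasks by an interval in days and reports which buckets are due on a given day. The task names, the intervals, and the helper functions are all assumptions made for the example; each due bucket would then be a unit of work that can be handed to a thread.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical AI sub-task with a made-up run interval.
struct AiTask {
    std::string name;
    int interval_days;
};

// Tasks that share an interval end up in the same group.
std::map<int, std::vector<std::string>> group_by_interval(const std::vector<AiTask>& tasks) {
    std::map<int, std::vector<std::string>> groups;
    for (const auto& task : tasks) groups[task.interval_days].push_back(task.name);
    return groups;
}

// Returns the intervals of all groups that should run on the given day.
std::vector<int> due_groups(const std::map<int, std::vector<std::string>>& groups, int day) {
    std::vector<int> due;
    for (const auto& group : groups) {
        if (group.first > 0 && day % group.first == 0) due.push_back(group.first);
    }
    return due;
}
```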

AI_Threaded_Work.png

[The AI updating a set of tasks]

As mentioned earlier, the use of the command system means that the effects of the AI are nicely isolated from its decision-making process. This makes it easier to iterate upon, easier to reason about, and easier to optimize.
Now, let's move on to the final topic of today: out of syncs.

Out of Sync

If you play multiplayer in any of our games you’re aware of a particularly dreaded set of words: “game is out of sync”. When this happens you’re unable to continue playing, and depending on the game have to either rehost or resync. But what is an out of sync (OOS), beyond us programmers having a laugh at your expense?

OOS_Message.png

[CK3 going out of sync]

To explain what an OOS is, I first need to explain how multiplayer itself works. In most games out there, the core of how multiplayer works is that the server (or a player’s machine acting as the server, if it is peer to peer) will tell all the clients the state of everything in the game. Where everything is, where it’s moving, how much health it has left, etc. Left out is usually only things that are static (what the map looks like in many games for instance). Competitive games often also leave out things that the client would have no way of knowing (like the position of another player on the other side of the map) to combat wallhack cheats and the like.
This is generally a very sensible model, but it breaks down if there’s too much gamestate to send over the network several times a second. In a first person shooter with 10, 20, maybe even 100 players all this info can be stored in a few kB, but CK3 for comparison usually has around 20 thousand characters, never mind everything else. The full gamestate of CK3 takes around 30 to 100 MB to store uncompressed, and even with compression you’ll easily pass 10-20 MB once you’re far enough in. Clearly, this is not something we can send over the network repeatedly.

So what do we do instead? We use an architecture known as “lockstep multiplayer”. This is common for strategy games. How this works is that instead of telling clients the state of everything (or a large subset of everything), we instead first provide them the initial state (in the form of a save), and then each client runs their own simulation. We send commands for player and AI interactions; everything else each client will calculate on their own. As a result far less info is sent over the network, since we only need to inform the clients of things that deviate from the natural flow of the simulation.
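
A toy version of lockstep in C++ might look like the sketch below. Everything in it is an assumption for illustration: the `Simulation` type, the "everyone earns 1 gold per day" rule, and the FNV-style checksum are stand-ins, not the game's real simulation or checksumming.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical command: on a given tick, change one character's gold.
struct Command { int tick; int character; int gold_delta; };

struct Simulation {
    std::vector<std::int64_t> gold;  // per-character gold, loaded from the shared save
    int tick = 0;

    void step(const std::vector<Command>& commands) {
        // Natural flow of the simulation: computed locally, never sent over the network.
        for (auto& g : gold) g += 1;
        // Deviations from the natural flow: the only thing that is sent.
        for (const auto& c : commands) {
            if (c.tick == tick) gold.at(c.character) += c.gold_delta;
        }
        ++tick;
    }

    // FNV-style mixing over the state, used to compare clients cheaply.
    std::uint64_t checksum() const {
        std::uint64_t h = 14695981039346656037ULL;
        for (auto g : gold) {
            h ^= static_cast<std::uint64_t>(g);
            h *= 1099511628211ULL;
        }
        return h;
    }
};
```

Two `Simulation` objects that start from the same `gold` values and receive the same commands end every tick with equal checksums; if one of them misses a command, the checksums stop matching.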

But here’s the problem: this means we have to ensure every single client simulates the game the exact same way. Because if anything differs, no matter how small, that tiny change will eventually Butterfly Effect its way to causing drastic differences between what’s happening on each machine. So while one player just got declared war on by some Vikings, on another client this wouldn’t be happening at all.

When anything differs, that’s an out of sync. At this point, major breakage is inevitable, and so we tell the players and force a rehost. This isn’t a great experience for anyone, so it is something we work hard on avoiding.

So how do out of syncs happen in the first place? It generally comes down to a lack of determinism. Determinism is when the same input always leads to the same result. As long as that’s the case, out of syncs are impossible (except if some input is lost or corrupted due to, say, network issues). But determinism isn’t easy.
It is simple enough if your game is single-threaded, but then it’ll also be slow. Any threading can introduce non-deterministic behavior if you’re not careful. The most common way is due to order issues. Let's say you’ve got the number X. It has a value of 10. Thread A wants to add 2 to it. Thread B wants to multiply it by 2. If Thread A happens to run first, the end result will be (10 + 2) * 2 = 24. But if Thread B runs first, it will be (10 * 2) + 2 = 22. So if for any reason threads run in a different order on two machines (maybe one CPU core was busy with something else for a split second), an out of sync will occur.
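
The arithmetic above can be reproduced without any threads at all, which makes the ordering dependence easy to see. In this sketch, `PendingOp` and both functions are hypothetical; sorting by a stable key (`source_id`) before applying is shown as one common way to make the result independent of arrival order, offered as an assumption rather than a description of CK3.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class OpKind { Add, Multiply };

// An operation some thread wants to apply to X, tagged with who requested it.
struct PendingOp {
    int source_id;
    OpKind kind;
    int operand;
};

// Applies operations in whatever order they happened to arrive.
int apply_in_given_order(int x, const std::vector<PendingOp>& ops) {
    for (const auto& op : ops) {
        x = (op.kind == OpKind::Add) ? x + op.operand : x * op.operand;
    }
    return x;
}

// Sorts by a key that is identical on every machine, then applies.
int apply_in_stable_order(int x, std::vector<PendingOp> ops) {
    std::stable_sort(ops.begin(), ops.end(), [](const PendingOp& a, const PendingOp& b) {
        return a.source_id < b.source_id;
    });
    return apply_in_given_order(x, ops);
}
```

With X = 10, "add 2 then multiply by 2" gives 24 and the reverse order gives 22 from `apply_in_given_order`, while `apply_in_stable_order` returns 24 for both arrival orders.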

This is a big reason why we usually only multi-thread evaluation. If nothing is changed, then order doesn’t matter. We sometimes thread things that change visible state too, but that’s much rarer and we’re far more careful to ensure that ordering doesn’t matter.

Another cause of out of syncs, one that was far more common in our older games, was the interface influencing the gamestate in some manner. To take a simple example, imagine we have some value we only rarely update because it is really time consuming to update. But when the player looks at it, we want it to be fully up to date. It might be tempting to force it to update when the player opens the interface, but oops… now you’ve introduced an out of sync.
The way we’ve structured CK3 makes it far more difficult to make this mistake, as it’s much harder to modify the gamestate from the interface. We’d instead send a command to refresh the value, and/or maybe do the actual math for the new value just in the interface and leave the gamestate untouched.
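
The "do the math just in the interface" option can be sketched like this. The names (`GameState`, `monthly_refresh`, `income_for_tooltip`) and the income rule are invented; the point is only that the interface function takes a `const` reference and so cannot touch the cached value.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

struct GameState {
    std::vector<int> holding_taxes;
    int cached_income = 0;  // refreshed rarely by game logic, identical on all machines
};

// Game logic: the only code that writes the cache.
void monthly_refresh(GameState& state) {
    state.cached_income =
        std::accumulate(state.holding_taxes.begin(), state.holding_taxes.end(), 0);
}

// Interface: recomputes a fresh number for display only.
// The const reference means this cannot modify the gamestate.
int income_for_tooltip(const GameState& state) {
    return std::accumulate(state.holding_taxes.begin(), state.holding_taxes.end(), 0);
}
```

Opening the tooltip on one machine then shows an up-to-date number while `cached_income` stays the same everywhere.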

Similarly, it’s easy to introduce issues due to bits of game logic depending on if a character is the local player or not. E.G., we want to update the player’s predicted income daily rather than monthly to ensure the player’s info is up to date. The naive implementation here would mean that on each client the client’s character gets updated daily, but the other players get updated monthly. The game would thus be out of sync, as the player characters would have different cached incomes.
In CK3 we avoid this by just checking that they’re a player rather than the person playing on this machine. Furthermore, we’ve made it deliberately harder to check “is this the local player” than to just check “is this any player”. We still need the former quite a bit (primarily for sending notifications), but it involves the programmer basically going “yes, I’m sure I know what I’m doing here”:

Local_Player_Access.png

[A notification being sent to the local player]

Note the “ALLOW_GET_LOCAL_PLAYER_IN_SCOPE” here; that’s our way of making sure we only check who the local player is if we really need to. Otherwise, we’d easily end up with something only getting changed on a player character for the client actually playing that character.
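
How `ALLOW_GET_LOCAL_PLAYER_IN_SCOPE` is implemented is not shown here, so the following is only a sketch of the general idea of making the risky query more awkward than the safe one. It uses an explicit tag type instead of a macro; `AllowLocalPlayerAccess`, `Players`, and `updates_income_daily` are all hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Tag type the caller must spell out to ask "who is the local player?".
struct AllowLocalPlayerAccess {
    explicit AllowLocalPlayerAccess() = default;
};

class Players {
public:
    Players(std::set<int> all_players, int local_player)
        : all_players_(std::move(all_players)), local_player_(local_player) {}

    // Safe for game logic: gives the same answer on every machine.
    bool is_player(int character_id) const { return all_players_.count(character_id) != 0; }

    // Differs per machine, so the caller has to opt in explicitly.
    int local_player(AllowLocalPlayerAccess) const { return local_player_; }

private:
    std::set<int> all_players_;
    int local_player_;
};

// Game logic: daily income refresh for every player character, not just ours.
bool updates_income_daily(const Players& players, int character_id) {
    return players.is_player(character_id);
}
```

A notification would then ask for `players.local_player(AllowLocalPlayerAccess{})`, which stands out in code review, while logic such as `updates_income_daily` never needs it.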

So that’s the long and short of what out of syncs are, why they happen, and some of what we do to avoid them.
And with that, I’m done. I hope you found this post about how our gamestate works interesting!

I am on vacation today but Matthew (@blackninja9939) will be here to answer any of your questions about this topic as well! And I may check in too!
 
When I read "command system" I had the faint hope that you would finally fix the issue that prevents us from activating and deactivating debug_mode during a running game (as in console commands), like in the old PDS games.

But no, we still need mods for that. *sigh*
 
Those things are entirely unrelated, and this is a tech post: we're not going to be listing anything really new, we're giving explanation and info on how the game works.

We've explained the reason behind the debug_mode toggle a bunch before. It is a bit annoying, but seeing as it's easy to enable via a startup flag and not something users need by default, it is not a high priority to rework our checksumming system to make it work in game compared to doing literally any other feature or fix for the game.
 
Like I said, I thought "command system" was about console commands initially, that's why I wrote this to express my disappointment when I realized it's not. That's the only reason why I mentioned it here.

I know the reasoning behind it, and I know it's easy to "activate", just not to "re-activate" once you've disabled it ingame; then you gotta restart the game (if you don't want to play in constant debug_mode with pink debug text everywhere).

But again, I wasn't trying to start a discussion, just to say: "oh I thought command system was about console commands, shame it's not" that's it.
 
When anything differs, that’s an out of sync. At this point, major breakage is inevitable, and so we tell the players and force a rehost. This isn’t a great experience for anyone, so it is something we work hard on avoiding.

Is there a reason you rehost rather than just pause and have the host send everyone a full game state to resync? I guess if things are going off the rails, you'd expect that even after a resync, they'd quickly diverge again?
 
Who is God? What does it mean when God is OOS?
Just the name of a player.


Is there a reason you rehost rather than just pause and have the host send everyone a full game state to resync? I guess if things are going off the rails, you'd expect that even after a resync, they'd quickly diverge again?
The main reason is that it is a feature that has to be built, tested, and maintained.
With OOSes being something we try to minimize, that can be a difficult value proposition, which is why only some of our games have it.
 
Another great read!

As modders, are there things we should be particularly aware of regarding desyncs? Either pitfalls to avoid, or good practice to abide by?

Or can't we cause desyncs anyway because the failsafes code-side prevent us from doing so in the first place?
 
If a mod's causing a desync, it's a bug in the game, not the mod.

If we had any known pitfalls we'd patch them rather than alert you to them. The exception would be an alert if such a patch is a ways off.

There's definitely a non-zero chance there's things mods could do that'd cause a desync that isn't currently done in vanilla (or is much rarer). If a mod is causing a lot of desyncs, that's something we'd love bug reports on.
 
Again, I can't understand most of this, but it sure is interesting.
 
I really love all these dev diaries. I have been coding for a long time, and one of my personal fascinations is the whole game engine architecture: all of the tricks to gain more performance, the way multithreading is done. It's also great seeing some code snippets in these posts, because they are very clean with their variable names, it's good to look at :)
 
re: AI and interface
I've been curious, how does AI handle scripted guis? If, say, I add an sgui button to the county view, does AI "see" a button for every county they own? And would it be bad for performance?

I'd really like to see a DD on AI, that sounds interesting.
 
Honestly a really nice read for someone studying CS hoping to get into gamedev proper (not just modding) at some point. Really hope you guys continue the series for a while as there's so much I wanna ask about it's hard to pick any specific topic ^^
 
re: AI and interface
I've been curious, how does AI handle scripted guis? If, say, I add an sgui button to the county view, does AI "see" a button for every county they own? And would it be bad for performance?

I'd really like to see a DD on AI, that sounds interesting.
The AI doesn't know about scripted GUIs unless you make an AI will do section for it.
In which case, it is about equivalent to a decision. Note that what's available is just the AI character; not info from the interface. It's pretty limited.

(This is based on a cursory look at the code; the scripted GUI system is something I have very little experience with so I might be wrong here)
 
These posts are pure gold. When I read them and see that beautiful C++ and things like the command system, I wish I had been given the opportunity in life to work on projects like these as my main job instead of just doing C++ as a hobby in my spare time.
 
There's definitely a non-zero chance there's things mods could do that'd cause a desync that isn't currently done in vanilla (or is much rarer). If a mod is causing a lot of desyncs, that's something we'd love bug reports on.
This is very good to know! Is there any particular kind of information that would be useful to provide in a modded-game OOS bug report, or do you just want the oos folder logs directly? I imagine that going through what a given desyncing mod actually does would be a herculean task, but would you still want the full mod files (as they were at the time of the desync) included in a bug report?

I recently got some oos logs submitted from my players and identified that the only distinction in the game state was that one province had a supply_limit_mult modifier of 0.25 and the other did not, but there's no information as to what is causing it. As far as I can tell the only static modifier that could achieve this on its own is the vanilla coastal_province_modifier modifier, but that is not utilized anywhere in script - neither in vanilla nor in the mod itself.

I imagine relevant info for a modder to provide is stuff like the fact that this particular province is actually lacking pixels on the map (Its ID was allocated to a region for cartography work but it was never actually used) and so the question of whether it is coastal is actually poorly defined in this particular mod.

Once I know what to include, I'll make a proper bug report.
 
Chances are that we would need to reproduce it internally in order to deal with it.
So any info on how to reproduce it would be helpful.
If the saves only have minor differences there's a chance that can be useful, but most of the time we need to reproduce OOSes internally to deal with them.

We'd want the mod files, yes.

That very specific modifier difference does sound pretty intriguing, and might be enough of a starting point for us to investigate.
 
Thanks for the post. Soo interesting! : )
Please do keep posting!)

"We send commands for player and AI interactions"
I wonder why AI interactions are sent over. With so many AI players this seems like a lot of data to transfer.

Given I have a particular game state
And I launch two game instances with the same state
When I execute a particular set of player-commands on each game instance
And next evaluation happens
Will the generated AI commands on one instance differ from those on the other?

In a single player game, I know the game is never the same even if I perform the same set of actions.
I assume there is some randomness involved to spice the game up.
But if a "randomness key" could serve as an additional startup/day input, and all random evaluations are run against that key, one can get the same result.
Or so I think : )
 
Thank you for the informative post. As a programmer I appreciate your candor.

Using commands is something I remember from a game framework I made nearly 25 years ago when writing code for a Diplomacy (the game) applet.
 