Hi guys! Today's Diary is going to be a bit of a short one as I am away at a conference (it has free breakfast! Two most magic words!)
Last week we celebrated HOI4’s 3 year anniversary and released 1.7 ‘Hydra’ along with Radio Pack and Axis Armor. I hope you enjoyed them
After the weekend we looked at our telemetry data after 1.7 released and noticed that the amount of multiplayer out of syncs were more common than before. This indicates that we introduced a new OOS problem. While HOI4 has resync and hotjoin its still pretty annoying when you out of sync so we are currently investigating this for a small hotfix patch (1.7.1). Out of syncs can be really tricky to find and nail down so no definite ETA yet on when the patch is ready as we are still hunting. But we acquired a solid lead on the problem just yesterday, and we're currently working out a good solution.
Technical section (warning!): What is an Out of Sync?
For those interested what an out of sync is I figured I’d dig into it a bit. Feel free to skip this if you are a... normal human being, I guess ;D
An Out of Sync (OOS) happens when the host and clients in a multiplayer game start acting differently. This could for example be something like a battle ending in the favor of Germany on one of the computers and in the favor of Soviets on the other. Usually though its nothing that big as the state difference is usually spotted earlier at say one of those units having a 1% higher organization than the other or the like. Once it happens everyone’s experience will very quickly start diverging, so we stop and alert the players. At this point the host can click a ‘Resync’ button to bring the game back into sync. Resyncing will reset the state of the game, send the current host state over as a savegame and have everyone load that, and then things can resume.
So what can cause an OOS? This is where it gets tricky, and its pretty much always a new reason when a problem appears. Some good candidates are multiplayer between different platforms because underlying code libraries can behave differently in some cases. Other common reasons are multithreading. We thread a lot of our code, yet to stay in sync we must assure that events still happen in the same order on all machines in the end if they affect the world. There can also be issues like touching illegal memory or the like that can alter the game state in unplanned ways (or crash… but those are easy to spot and fix!).
Finding and fixing an OOS can be a long process simply because they are often quite rare occurances and it usually takes many steps and iteration to home in on exactly what it is. To find them we run multiplayer tests with QA with special settings that spit out giant log files (which makes everything horribly slow generally) and once and OOS happens we compare log files and savegames to see what differs. This will usually give us an area to start looking at. Lets say for example that you have a unit’s org being different. This could be due to many reasons - battle damage, weather, bad supply etc. So we add more logging to the relevant code areas and do another test. Hopefully this will tell us which of our guesses was right, and we repeat again with more logs and more details for this area. Of course the most fun-to-find OOS errors disappears when you add logging and framerate slows ;P
Some of this can be done automatically over night as well if the problem is unrelated to players, but this is often not the case.
Once found and fixed this is usually the stage we make an open beta patch to verify that it is indeed fixed, or if we deem it sure that we found the problem we will go straight to regular patch. Speaking of beta patches, thanks for the help testing 1.7! we got something like 30k game sessions of testing on the weekend before the final build which was a great help!
Hopefully this little look into some of the technicalities behind working on HOI4 was interesting. If you got questions feel free to ask away!
The part of the team that isn’t trying to solve this OOS has now moved fully on to 1.8 ‘Husky’ and next expansion work, but its early days and its going to be a while until we have things to show off. So this will be the last dev diary in a while as we go into radio silence (and soon glorious socialist swedish summer vacation!) until we have things that are ready to show off.
See you on the other side! And keep an eye on the forum for announcement about the 1.7.1 hotfix when that is ready.
Last week we celebrated HOI4’s 3 year anniversary and released 1.7 ‘Hydra’ along with Radio Pack and Axis Armor. I hope you enjoyed them
After the weekend we looked at our telemetry data after 1.7 released and noticed that the amount of multiplayer out of syncs were more common than before. This indicates that we introduced a new OOS problem. While HOI4 has resync and hotjoin its still pretty annoying when you out of sync so we are currently investigating this for a small hotfix patch (1.7.1). Out of syncs can be really tricky to find and nail down so no definite ETA yet on when the patch is ready as we are still hunting. But we acquired a solid lead on the problem just yesterday, and we're currently working out a good solution.
Technical section (warning!): What is an Out of Sync?
For those interested what an out of sync is I figured I’d dig into it a bit. Feel free to skip this if you are a... normal human being, I guess ;D
An Out of Sync (OOS) happens when the host and clients in a multiplayer game start acting differently. This could for example be something like a battle ending in the favor of Germany on one of the computers and in the favor of Soviets on the other. Usually though its nothing that big as the state difference is usually spotted earlier at say one of those units having a 1% higher organization than the other or the like. Once it happens everyone’s experience will very quickly start diverging, so we stop and alert the players. At this point the host can click a ‘Resync’ button to bring the game back into sync. Resyncing will reset the state of the game, send the current host state over as a savegame and have everyone load that, and then things can resume.
So what can cause an OOS? This is where it gets tricky, and its pretty much always a new reason when a problem appears. Some good candidates are multiplayer between different platforms because underlying code libraries can behave differently in some cases. Other common reasons are multithreading. We thread a lot of our code, yet to stay in sync we must assure that events still happen in the same order on all machines in the end if they affect the world. There can also be issues like touching illegal memory or the like that can alter the game state in unplanned ways (or crash… but those are easy to spot and fix!).
Finding and fixing an OOS can be a long process simply because they are often quite rare occurances and it usually takes many steps and iteration to home in on exactly what it is. To find them we run multiplayer tests with QA with special settings that spit out giant log files (which makes everything horribly slow generally) and once and OOS happens we compare log files and savegames to see what differs. This will usually give us an area to start looking at. Lets say for example that you have a unit’s org being different. This could be due to many reasons - battle damage, weather, bad supply etc. So we add more logging to the relevant code areas and do another test. Hopefully this will tell us which of our guesses was right, and we repeat again with more logs and more details for this area. Of course the most fun-to-find OOS errors disappears when you add logging and framerate slows ;P
Some of this can be done automatically over night as well if the problem is unrelated to players, but this is often not the case.
Once found and fixed this is usually the stage we make an open beta patch to verify that it is indeed fixed, or if we deem it sure that we found the problem we will go straight to regular patch. Speaking of beta patches, thanks for the help testing 1.7! we got something like 30k game sessions of testing on the weekend before the final build which was a great help!
Hopefully this little look into some of the technicalities behind working on HOI4 was interesting. If you got questions feel free to ask away!
The part of the team that isn’t trying to solve this OOS has now moved fully on to 1.8 ‘Husky’ and next expansion work, but its early days and its going to be a while until we have things to show off. So this will be the last dev diary in a while as we go into radio silence (and soon glorious socialist swedish summer vacation!) until we have things that are ready to show off.
See you on the other side! And keep an eye on the forum for announcement about the 1.7.1 hotfix when that is ready.