
Stellaris Dev Diary #181 : Threading and Loading Times

Hello everyone, this is The French Paradox speaking!

On behalf of the whole Stellaris team, we hope you've had a good summer vacation, with current circumstances and all!

We're all back to work, although not at the office yet. It is going to be a very exciting autumn and winter with a lot of interesting news! We are incredibly excited to be able to share the news with you over the coming weeks and months!

Today I open the first look at the upcoming 2.8 release with some of the technical stuff that we programmers have been working on over summer. The rest of the team will reveal more about the upcoming content and features in the following diaries.

Without further ado, let's talk about threads!

Threads? What threads?

There is a running joke that says fans are always wondering which one will come first: Victoria III or a PDS game using more than one thread.

image (26).png

Don't lie, I know that's how some of you think our big decision meetings go

I’m afraid I’ll have to dispel the myth (again): all PDS games in production today use threads, from EU4 to CK3. Even Stellaris! To better explain the meme and where it comes from, we have to go through a little history. I’m told you guys like history.

For a long time, the software industry relied on “Moore’s Law”, the observation that the number of transistors on a chip doubles roughly every two years, which in practice meant that a CPU built two years from now would be roughly twice as fast as one built today.
This was especially true in the 90s, when CPUs went from 50 MHz to 1 GHz in the span of a decade. The trend continued until 2005, when we reached up to 3.8 GHz. And then the clock speed stopped growing. In the 15 years since, the frequency of CPUs has stayed roughly the same.
As it turns out, the laws of physics make it quite inefficient to increase speeds beyond 3-4 GHz. So instead manufacturers went in another direction and started “splitting” their CPUs into several cores and hardware threads. This is why today you’ll look at how many cores your CPU has and won’t spend much time checking the frequency. Moore’s Law is still valid, but, to put it in strategy terms, the CPU industry reached a soft cap while trying to play tall so they changed the meta and started playing wide.

This shift profoundly changed the software industry, as writing code that will run faster on a CPU with a higher speed is trivial: most code will naturally do just that. But making use of threads and cores is another story. Programs do not magically “split” their work in 2, 4 or 8 to be able to run on several cores simultaneously; it’s up to us programmers to design around that.

Threading nowhere faster

Which brings us back to our games and a concern we keep reading on the forums: “is the game using threads?”. The answer is yes, of course! In fact, we use them so much that we had a critical issue a few releases back where the game would not start on machines with 2 cores or fewer.

But I suspect the real question is: “are you making efficient use of threads?”. Then the answer is “it depends”. As I mentioned previously, making efficient use of more cores is a much more complex issue than making use of more clock cycles. In our case, there are two main challenges to overcome when distributing work among threads: sequencing and ordering.

Sequencing issues occur when 2 computations running simultaneously need to access the same data. For example let’s say we are computing the production of 2 pops: a Prikki-Ti and a Blorg. They both access the current energy stockpile, add their energy production to it and write the value back. Depending on the sequence, they could both read the initial value (say 100), add their production (say 12 and 3, the Blorg was having a bad day) and write back. Ideally we want to end up with 115 (100 + 12 + 3). But potentially both would read 100, then compute and overwrite each other ending up with 112 or 103.
The simple way around it is to introduce locks: the Prikki-Ti would “lock” the energy value until it’s done with its computation and has written the new value back, then the Blorg would take its turn and add its own. While this solves the problem, it introduces a greater one: the actions are now sequential again, and the benefit of doing them on concurrent threads has been lost. Worse, due to the cost of locking, unlocking and synchronizing, the whole thing will likely take longer than if we simply computed both on the same thread in the first place.
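To make the lost update concrete, here is a minimal Python sketch. It is not engine code: the `Stockpile` class and both function names are made up for illustration, and only the numbers come from the example above. The first function replays the bad interleaving by hand so the result is reproducible; the second wraps the read-modify-write in a lock.

```python
import threading


class Stockpile:
    """Hypothetical shared energy stockpile (illustrative, not engine code)."""

    def __init__(self, energy):
        self.energy = energy
        self.lock = threading.Lock()


def lost_update(initial, first, second):
    """Replay the bad interleaving by hand: both pops read before either writes."""
    stockpile = Stockpile(initial)
    read_a = stockpile.energy            # Prikki-Ti reads the stockpile
    read_b = stockpile.energy            # Blorg reads the same stale value
    stockpile.energy = read_a + first    # Prikki-Ti writes its result
    stockpile.energy = read_b + second   # Blorg overwrites it
    return stockpile.energy


def locked_update(initial, productions):
    """Each pop holds the lock for its whole read-modify-write."""
    stockpile = Stockpile(initial)

    def produce(amount):
        with stockpile.lock:
            stockpile.energy = stockpile.energy + amount

    threads = [threading.Thread(target=produce, args=(p,)) for p in productions]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stockpile.energy


print(lost_update(100, 12, 3))      # 103: the Prikki-Ti's 12 energy is lost
print(locked_update(100, [12, 3]))  # 115
```

The locked version is correct, but the two additions can only ever run one after the other, which is exactly the cost described above.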

The second issue is ordering, or “order dependency”: in some cases, changing the order of operations changes the outcome. For example let’s say our previous Prikki-Ti and Blorg decide to resolve a dispute in a friendly manner. We know the combat system will process both combatants, but since there are potentially hundreds of combat actions happening, we don’t know which one will happen first. And potentially on 2 different machines the order will differ. For example on the server the Prikki-Ti action will happen first, while on the client the Blorg will act first.

OOS.png

#BlorgShotFirst

On the server the Prikki-Ti action is resolved first, killing the Blorg. The Blorg action that comes after (possibly on another thread) is discarded as dead Blorgs can’t shoot (it’s a scientific fact). The client however distributed the computation in another way (maybe it has more cores than the server) and in its world the Blorg dispatched the Prikki-Ti first, which in turn couldn’t fight back. Then both players get the dreaded “Player is Out of Sync” popup as their realities have diverged.
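The divergence is easy to reproduce in a few lines. The sketch below is purely illustrative (the function name, the hit point values and the damage numbers are assumptions, not the actual combat system): it applies each attack as soon as it is processed, so the survivor depends only on which action happens to come first.

```python
def resolve_immediately(actions, hit_points):
    """Apply each attack as soon as it is processed; dead combatants cannot act."""
    hp = dict(hit_points)  # work on a copy so every run starts from the same state
    for attacker, target, damage in actions:
        if hp[attacker] <= 0:
            continue  # dead Blorgs can't shoot
        hp[target] -= damage
    return sorted(name for name, value in hp.items() if value > 0)


hit_points = {"Prikki-Ti": 10, "Blorg": 10}
prikki_shoots = ("Prikki-Ti", "Blorg", 10)
blorg_shoots = ("Blorg", "Prikki-Ti", 10)

server = resolve_immediately([prikki_shoots, blorg_shoots], hit_points)
client = resolve_immediately([blorg_shoots, prikki_shoots], hit_points)
print(server)  # ['Prikki-Ti']
print(client)  # ['Blorg']
```

Same inputs, same rules, two different game states: that mismatch is what triggers the out-of-sync popup.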

There are, of course, ways to solve the problem, but they usually require redoing the design in a way that satisfies both constraints. For example in our first case each thread could store the production output of each pop to add to each empire, and then those could be consolidated at the end. In the same fashion our 2 duelists problem could be solved by recording damage immediately, but applying the effects in another phase to eliminate the need for a deterministic order.
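As a rough sketch of the second redesign (record damage first, apply effects in a later phase), here is one way it could look in Python. The function name and the data layout are assumptions made for this example. Note that it also changes the rules slightly: both shots land, because "alive" is judged on the state at the start of the tick.

```python
def resolve_in_two_phases(actions, hit_points):
    """Record all damage first, then apply it, so processing order is irrelevant."""
    # Phase 1: only read the start-of-tick state and accumulate pending damage.
    pending = {name: 0 for name in hit_points}
    for attacker, target, damage in actions:
        if hit_points[attacker] > 0:  # alive when the tick started
            pending[target] += damage
    # Phase 2: apply the recorded damage to every combatant at once.
    return {name: hit_points[name] - pending[name] for name in hit_points}


hit_points = {"Prikki-Ti": 10, "Blorg": 10}
prikki_shoots = ("Prikki-Ti", "Blorg", 10)
blorg_shoots = ("Blorg", "Prikki-Ti", 10)

server = resolve_in_two_phases([prikki_shoots, blorg_shoots], hit_points)
client = resolve_in_two_phases([blorg_shoots, prikki_shoots], hit_points)
print(server)            # {'Prikki-Ti': 0, 'Blorg': 0}
print(server == client)  # True
```

Phase 1 never writes to the shared hit points, so in a threaded version each worker could fill its own pending table and the tables could be summed before phase 2.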

As you can imagine, it is much easier to design something with threading in mind rather than retrofitting an existing system for it. If you don’t believe me just look at how much time is spent retrofitting your fleets, I’ll wait.

The good news

This is all nice and good, but what’s in it for you in the next patch, concretely? Well you will be happy to hear that I used some time to apply this to one of the oldest bits of our engine: the files and assets loading system.

For the longest time we have used a piece of 3rd party software to handle this. While it saved us a lot of trouble, it has also turned out to be quite bad at threading, to the point that it was sometimes slower with more cores than with fewer, most notably due to the locking issues I mentioned before.
Reworking this, in conjunction with a few other optimizations, has enabled us to drastically reduce the startup time of the game.
I could spend another thousand words explaining why, but I think this video will speak better:


This comparison was done on my home PC, which uses a venerable i7 2600K and an SSD drive. Both were “hot” startups (the game had been launched recently), but in my experiments I found that even on a “cold” start it makes a serious difference.

To achieve the best speedup, you will need to use the new beta DirectX11 rendering engine. Yes, you read correctly: the next patch will also offer an open beta which replaces the old DX9 renderer with a more recent DX11 version that was initially made by our friends at Tantalus for the console edition of Stellaris. While visually identical, using DX11 to render graphics enables a whole range of multi-threading optimizations that are hard or impossible to achieve with DX9. Playing with the old renderer will still net you a nice speedup on startup, and the splash screen step should still be much faster, but you’re unlikely to see the progress bar “jump” as it does with DX11 when the game loads the models and textures.

Some of those optimizations have also been applied to newer versions of Clausewitz, and will be part of CK3 on release. Imperator should also benefit from them. It might be possible to also apply them to EU4 and HoI4, but so far my experiments with EU4 haven’t shown a huge speedup like they did for Stellaris and CK3.

If you want to read more technical details about the optimizations that were applied to speed up Stellaris, you can check out the article I recently published on my blog.

And with that I will leave you for now. This will likely be my last dev diary on Stellaris, as next month I will be moving teams to lead the HoI4 programmers. You can consider those optimizations my farewell gift.
This may have been a short time for me on Stellaris but don’t worry: even if I go, Jeff will still be there for you!
 
Great to see dev diaries back, I really missed them! I'd love to hear more about various behind the scenes and under the hood work you guys are doing when you don't have any teasers for what's actually going to be in the next update or DLC. Even if it's not something which you feel is very important or interesting, many of us are dying for engagement.
 
Well, after so much time in the Stellaris forum I will only say that I will wait for this to be released before I start praying that the bug patch after it doesn't ruin things :(
 
Fine, based on the video, Stellaris will start roughly half a real-time-minute faster. Maybe even a whole real-time-minute in some cases ?

I promise that I will be more impressed if the game itself would run faster during its actual play-time so that I wouldn't waste a zillion of real-time-HOURS due to ingame-performance-issues.

While your point about the game running slowly is valid, this isn't the entire update, and I seriously doubt they would optimise threading solely for loading times. Not to mention, they specifically mentioned they were going to be working on other areas of performance.


Have I missed something, or (although 3 whole months have already passed again) is this all for now? So, let me guess that the 3 whole months were actually used for something else ... like new content ... for a new DLC and its update?

As for this point... No. They've been on their summer holidays. They probably really needed it after putting up with us for the past three quarters of a year since their previous one. The past three months they've been relaxing and getting over the trauma of dealing with the fanatical purifiers (us lot of xenocidal monsters) banging on the office doors all year.
 
Will there be any change to the multiplayer load times? Usually more than 5 minutes are wasted waiting for the file transfer, while other games do it in less than a minute.
 
First up: Thanks for dealing with the loading times. At least for me, this really was an issue that was bugging me.
Second, I'd like to make a humble request that you maybe do a release that beefs up the stuff that already is in the game, instead of adding vast new content. At least for me 2.7 feels like the buggiest version ever and I've never thought so often "why did they add this feature without giving me the option to do X". Megacorps for example: Why can't they release non-megacorp subsidiaries? (btw: They don't expand as intended atm, either) Why don't they have a CB "Enforce market access". Federations: Why are your vassals allowed to vote differently when in the same federation? Why is there no way to sway other empires to leave their federation and join yours? Why is there no similar stance to "assimilation" for the genetic ascendency? That's micro management hell! Stuff like that. At the moment Stellaris feels very wide, but very thin.
 
Hello everyone, this is The French Paradox speaking!
...

Hello, I have just read both the forum post and your blog entry and am quite intrigued by the performance enhancement that you have managed to achieve in Stellaris. You have stated that EU4's load times were negligibly affected, but I have to ask whether it is possible for you to run the modified PhysFS program with the MEIOU and Taxes mod, or possibly make the tools public so that we ourselves can test it! The main burden that the mod adds is CPU load, so I am hoping that it may have a similar effect to your work on Stellaris.

Cheers
 
You need to observe a specific AI nation for the skull tooltips to show.

AI (and performance for that matter) are always things we keep in mind. The next patch will contain some tweaks on that topic, as will the ones after that. It's a recurring topic that we work on over time; it's unlikely that there will be the _one_ patch that solves everything at once.

I'm sorry to be the bearer of bad news, but the performance problem won't be solved with incremental improvements: you either attack the problem at its core and solve it, or go around it by changing the design of the game. So it looks like, from what you're writing here, the performance thread will reach 140 pages or more in the next 5 years (it now sits at 71 pages). Thanks but no thanks, can't be bothered...

As far as the DD and the basics of threading are concerned, here's a free tip for you guys: have you considered having a local planetary buffer/stockpile? That way you don't have to lock on the global empire stockpile for each pop, and you can parallelize all planets on n threads, with no lock contention. That stockpile should hold all types of empire/game resources, including influence, amenities, police points, etc. You then consolidate all planetary stockpiles serially into the global empire stockpiles at the end of your processing, and you can even apply rules on how this is done, including adding depth to the game with resource transfer capacity, logistics and such, if you wish. This way you remove most pressure from the pop count and move it one layer up into the colony count. And guess what, you can repeat this with sectors - oh my god, what an elegant solution this is - doing hierarchical multithreading.
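For readers who want to see the shape of that idea, here is a minimal Python sketch of local stockpiles consolidated at the end. Everything in it (the `PLANETS` data, the function names, the resource names) is hypothetical and chosen for illustration; it says nothing about how the game actually stores pops or resources. Python threads also will not speed up this kind of arithmetic, so read it for the structure only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each planet holds a list of per-pop resource outputs.
PLANETS = {
    "Homeworld": [{"energy": 12}, {"energy": 3, "minerals": 4}],
    "Colony": [{"minerals": 6}, {"energy": 5}],
}


def planet_output(pops):
    """Sum one planet's pops into a local stockpile; no shared state is touched."""
    local = Counter()
    for pop in pops:
        local.update(pop)  # Counter.update adds the values per resource
    return local


def empire_output(planets, workers=4):
    """Process planets in parallel, then merge their stockpiles serially."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, whatever order the workers finish in.
        local_stockpiles = list(pool.map(planet_output, planets.values()))
    total = Counter()
    for local in local_stockpiles:  # single-threaded consolidation, no locks needed
        total.update(local)
    return total


print(empire_output(PLANETS))  # energy 20, minerals 10
```

Merging in a fixed order matters more than it looks: with floating-point values, summing in a different order can give slightly different totals, which is one way two machines could drift out of sync. The same pattern could be repeated one level up, with sectors merging planets before the empire merges sectors.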

Planetary stockpiles would also be a great place to start for an espionage DLC, because you need to have local stockpiles to do covert operations and sabotage.

Having pops lock global resources/values is a moronic design and coding decision, but the team that did that was originally working on a different and simplified specification and not the *thing* that came to be known as the megacorp population system. So of course I mean no offense to them.

The way we deal with it right now is to just *stop* your code from running, and instead do it in a controlled manner, along with several other harsh design decisions - i.e. we moved past you guys - in a manner of speaking.

The leading mods for that are:


and


sitting at 26k and 19k subscribers respectively. With about 12k players right now in game I'd assume about 50% are using them - but I could be wrong. I assure you, without modding this game would have been dead years ago. I use over 40 mods just to have a playable vanilla experience with very little bugs and a semblance of balance and performance. I dare PDX to make vanilla playable and enjoyable with zero mods.

As it stands, I never saw any of your customers complaining about loading times and even if the game took double the amount of time to load right now, without the beta patch I wouldn't care at all. What most of your players care about is how slow it plays near mid game/end game - and forum admins had to even isolate all those posts to a megathread, cause it was all over the place.

I hope PDX devs will use this "extra loading time speedup" to fix the actual issues and bugs people complain about and not just use it as a vehicle to dump more DLCs on us to increase loading times again - it won't work like this.

[edit: spelling]
 
Good to see a dev diary and good to see the perf improvement. From a dev perspective I totally understand that.

BUT

This is not a community priority, and on these points:

  1. Pop growth, (automatic) migration, resettlement simulator lategame, pop micromanagement hell and gene modding problems, having to use an edict to fix pop micromanagement which is locked behind Galactic community and 3 resolutions.
  2. AI being as stupid as ever while Starnet AI, being developed by 1 modder with less tools than Paradox, made an AI fully capable of managing its economy with 0 economy cheats.
  3. Automation as in planet and habitat auto-build. It's completely broken and just as stupid as the AI. This feature will kill your economy instead of making lategame bearable. There are no working tools to have the AI reasonably automate the tedious micromanagement for you.
  4. Crisis AI has been broken since 2.2, since the Crisis is too stupid to actually conquer the galaxy. Even worse, the 2.6 patch did focus on Crisis, but Paradox didn't care about fixing what's wrong. Instead they just changed how Crisis targets a player. No one asked for this - instead players asked Paradox to fix the Crisis AI. But Paradox didn't care.
  5. The game was never designed with colonizing 50, 75, 100+ planets and habitats in mind. Paradox had to add a x25 Crisis multiplier because of this. (Of course it doesn't help at all because Crisis is broken, see 4.) Quest rewards and science from stations were never adjusted for 2.2 science costs, and both are close to useless in most instances.
  6. Balance is completely absent. As a quick example: Spiritualists have been no match for Materialists since 2.2. The very recent update was a huge slap in the face for Spiritualists and made them much worse, since most Edicts are now permanent, rendering edict cost and edict duration bonuses from the Spiritualist Ethic useless. (Makes you facepalm immediately if you think about it, but apparently Paradox did it anyway)

We still have nothing and it's a major problem. Some of these are not too crazy to fix... But we don't even have a "WIP" on this or that. No short-term vision on the game or the balance. This is a pity.
 
As it stands, I never saw any of your customers complaining about loading times and even if the game took double the amount of time to load right now, without the beta patch I wouldn't care at all. What most of your players care about is how slow it plays near mid game/end game - and forum admins had to even isolate all those posts to a megathread, cause it was all over the place.

The complaints about loading times aren't as dire as the game performance ones, but I've definitely seen them; they often crop up in the tech support forum.

It was also quite annoying for multiplayer games because everyone had to show up 10+ minutes early just to make sure everyone had their game loaded up.
 
OK, so my main fleet is running towards the other end of my space to merge with the single transporter there. OK, that is still dumb but it's not randomly dumb. I abandoned this game months ago and finally have an answer.
I feel like transport handling is probably the weakest bit of the fleet AI today. I have a few beta changes that improve objective selection (avoiding sending fleets on long trips), but it's a bit too early to present them and they may not entirely address that exact issue. That's the trick with AI: while it is reasonably feasible to make it behave better in one given test case, you always run the risk of making it worse in all the others.

I was wondering if you could make an auto threading script that applies to mods
I'm afraid that's not how it works. If I could make a script that magically transforms some game logic into something thread friendly, I could probably put a bunch of programmers out of a job ;)

It would be cool to see higher interaction between the dev team and modders, giving them access to performance metrics. Some dos and don'ts
After playing with both EU4 and Stellaris loading benchmarks, I can give you one right now: try to avoid having lots of small files (like EU4 does with one text file per province). Windows is just terrible at loading those efficiently. My first idea was to "compile" the game text files into one big archive after the first load to speed up subsequent startups, but it would have taken some time to implement safely and wouldn't have benefited Stellaris as much as the other changes I made.
I still keep it in mind for later, but as you may guess there are always more things we could do than time to actually do them.
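For modders curious about what the "one big archive" idea might look like, here is a small sketch using Python's standard `zipfile` module. It is only an illustration of the general technique, not what the engine does: the function names, the `provinces` folder and the file contents are all invented for the example.

```python
import tempfile
import zipfile
from pathlib import Path


def build_archive(source_dir, archive_path):
    """Pack every .txt file under source_dir into a single uncompressed zip."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as archive:
        for path in sorted(source_dir.rglob("*.txt")):
            archive.write(path, path.relative_to(source_dir).as_posix())


def load_archive(archive_path):
    """Read all bundled files back from one file instead of many small ones."""
    with zipfile.ZipFile(archive_path) as archive:
        return {name: archive.read(name).decode("utf-8") for name in archive.namelist()}


def demo():
    """Create three tiny hypothetical province files, bundle them, and reload them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "provinces"
        root.mkdir()
        for i in range(3):
            (root / f"province_{i}.txt").write_bytes(f"owner = TAG{i}\n".encode("utf-8"))
        archive_path = Path(tmp) / "provinces.zip"
        build_archive(root, archive_path)
        return load_archive(archive_path)


files = demo()
print(sorted(files))  # ['province_0.txt', 'province_1.txt', 'province_2.txt']
```

A real implementation would also have to notice when a source file or a mod changes and rebuild the archive, which is presumably part of why doing it safely takes time.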
 
While your point about the game running slowly is valid, this isn't the entire update, and I seriously doubt they would optimise threading solely for loading times.
But those are basically the results of this DD. Anything else is speculation. An improvement of -15, -30 or even -45 seconds (of loading time) would be nice and dandy if the game had to deal with several situations like that (like in Total War games, when the game switches between the campaign map and battles), but in Stellaris it's nearly useless once you're in your session.

Not to mention, they specifically mentioned they were going to be working on other areas of performance.
Sorry, but unlike results, intentions don't impress me anymore.
 
I'm happy to see the dev diaries returning!

Honestly, the improved loading times don't really excite me. The game takes ~1 minute to launch and around 10-15 seconds to load a save, so not something I ever even thought needed to be worked on. Seems long when I type this out but in reality it doesn't seem unreasonable considering that once you're in the game, there's no more loading. I won't refuse the improvements, but man do I feel like the actual game performance is what needs improving.

But to be more positive, I am anxiously looking forward to whatever DLC and other features are going to be worked on next. As someone who prefers the exploration and discovery aspects of Stellaris more than anything else, I really hope to see another Ancient Relics/Distant Stars type of addition to the game.
 
first we developers restart the game a _lot_ during development (think something like 20+ times a day) so that's a huge productivity improvement for us.
Nice.

So can some math whiz calculate now how fast patches will come out, since there's been a lot of time saved now that restarting the game takes 30 secs instead of more than a minute?

I am sure someone can figure it out!
 
guys this is lovely and all, but comparing loading times to the game's actual problems is like comparing a brown dwarf with a supermassive quasar. seriously, prioritize.
 