I just found a lesser, but still awful, performance-wasting modding habit. Because the game is missing a native on_action event "on_load_game" (like "on_start_game"), it is established practice to use an MTTH or daily trigger for this "fire_only_once" event.
 
As it turns out L3 doesn't mean much. It's all about RAM latency and cpu frequency.

i3-10100, 3.6 GHz core speed, 3.3 GHz ring bus, JEDEC dual channel 2133 15-15-15-36
6 MiB of L3
Run 1: 6 minutes 39 seconds
Run 2: 6 minutes 37 seconds
Run 3: 6 minutes 37 seconds

i7-10700K, 3.6 GHz core speed, 3.3 GHz ring bus, JEDEC dual channel 2133 15-15-15-36
16 MiB of L3
Run 1: 6 minutes 35 seconds
Run 2: 6 minutes 42 seconds
Run 3: 6 minutes 32 seconds

i7-10700K, stock, dual channel 4000 16-15-15-36

Run 1: 4 minutes 25 seconds

i7-10700K, 5.1-5.2 GHz core speed (varied), 4.4 GHz ring bus, dual channel 4000 16-15-15-36

Run 1: 4 minutes 10 seconds
Run 2: 4 minutes 13 seconds

Go figure!
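
For anyone who wants the averages rather than the raw run times, here is a small Python sketch that converts the runs above to seconds and compares each configuration with the i3 baseline. The configuration labels are shorthand made up for this sketch, and note that the "stock" run changes both memory settings and CPU clocks, so the ratio does not isolate RAM alone.

```python
from statistics import mean

# Run times copied from the post above, converted to seconds.
runs = {
    "i3-10100, 2133 JEDEC": [6 * 60 + 39, 6 * 60 + 37, 6 * 60 + 37],
    "i7-10700K 3.6 GHz, 2133 JEDEC": [6 * 60 + 35, 6 * 60 + 42, 6 * 60 + 32],
    "i7-10700K stock, 4000 CL16": [4 * 60 + 25],
    "i7-10700K OC, 4000 CL16": [4 * 60 + 10, 4 * 60 + 13],
}

# Average each configuration, then express it relative to the i3 baseline.
averages = {name: mean(times) for name, times in runs.items()}
baseline = averages["i3-10100, 2133 JEDEC"]

for name, avg in averages.items():
    print(f"{name}: {avg:.1f} s average, {baseline / avg:.2f}x vs. baseline")
```

The two 2133 configurations land within about a second of each other despite the difference in L3 size, which is the point the post makes.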
 
Oh man, you are such a lifesaver; this data is extremely valuable to me. I had been thinking the CPU was the only way to improve the game's performance, and now it turns out that RAM speed is another factor. I am going to plan a build with better RAM and a better CPU just to play Stellaris. Once again, I appreciate your effort so much. I will probably wait for Zen 3 and see how it performs; if it has better single-core performance than Intel's 10th gen, I might as well go for Zen 3 with a better RAM kit. Oh god, Stellaris, why are you so laggy...
 
Specifically, it's memory latency that seems to make the difference. I didn't notice a difference between 2933 and 4400 when latency was normalized. I did notice a pretty huge difference between 60 ns and 40 ns of memory latency, though (a 15% speed increase in late game), but single-core performance still seems to be the biggest factor, unsurprisingly. Zen 3's doubled L3 probably won't matter much for Stellaris, but the single-core perf may make a big difference. Whether it will be faster than 10th gen in games... I have some doubts, but we will see.

Chiplets are still bleeding-edge tech, and now Intel is also introducing the die-to-die signaling roadblock into their CPUs, starting with Rocket Lake and for the foreseeable future. Monolithic chips like 10th gen may still have the edge in certain areas for a long time, until one company figures out how to address the obscene overhead of sending a signal through a wire, into a solder point, into another wire where there's probably a repeater (even more overhead!), then... yep, another solder point, before it reaches the final wire on the destination chip to continue its journey. A monolithic design just sends the signal through a copper wire and that's it.
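
As a rough back-of-the-envelope check on those latency numbers, the sketch below uses a deliberately simple model, run time = compute time + (number of cache misses × memory latency), and reads "15% speed increase" as the slow run taking 1.15 times as long as the fast one. Both the model and that reading are assumptions made for illustration, not anything measured in the game.

```python
def memory_stall_fraction(slow_latency_ns, fast_latency_ns, speedup):
    """Fraction of run time spent waiting on memory at the slower latency.

    Toy model (an assumption): time = compute + misses * latency, with every
    miss fully exposed. 'speedup' is time_slow / time_fast.
    """
    # From speedup = (C + m*Ls) / (C + m*Lf), solve for C/m (compute per miss).
    compute_per_miss = (slow_latency_ns - speedup * fast_latency_ns) / (speedup - 1.0)
    return slow_latency_ns / (compute_per_miss + slow_latency_ns)


fraction = memory_stall_fraction(60.0, 40.0, 1.15)
print(f"Implied time stalled on memory at 60 ns: {fraction:.0%}")
```

Under these assumptions the quoted figures would imply that somewhat over a third of late-game tick time is spent waiting on RAM, which fits the observation that core speed still matters most.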

I mean, the 10900K is a monolithic chip, but the ring bus is so huge that it still has to use repeaters, which adds noticeable latency even through a single wire on a die. So you can imagine.

Or we can just keep waiting for Stellaris 2.
 

Attachment: 2 - 10900K Core-to-Core.png

Oh god, Stellaris, why are you so laggy

Bad engine, bad optimisation or both. Don't spend crazy amounts of money on kit with Stellaris's performance in mind, you'd pay so much for so little gain. I've got an £8k gaming rig, and struggle to push Stellaris past 130fps @ 1440p. (Which goes down to 1fps at the turn of each month late game.) Conversely, I can play the likes of Doom Eternal and Death Stranding at nearly 200fps.
 
Yeah, you don't need to spend crazy amounts to pretty much hit the limit on how well the game will run anyway. A Z490-A PRO or ASUS PRIME Z490-P + Scythe Fuma 2 + 10600K + a solid kit of Samsung B-die RAM (this seems to be the deal of the month in that regard...), with both CPU and RAM manually overclocked, gives you the most performance you can hope to get from this game today for only around $600, not including a GPU; almost any GPU will do if you optimize settings around it. Spending more than this won't make late game go by significantly quicker unless you get into exotic-cooling overclocking territory, and I don't think anyone is desperate enough for a faster late game to play Stellaris on a liquid nitrogen loop.

£8k is well beyond the sweet spot.
 
Yeah, I can... disappointedly say... that over the course of the past two months the performance has not changed for the better. Dunno what I expected, really, but I'm disappointed nonetheless. In fact, I had the feeling that the game's performance got even worse.
It's just so incredibly unfun to play the game, because everything runs so slowly, and once you near the endgame even the goddamn autosaves make the game stop responding for a few seconds. So I sat there, steamrolling through the galaxy, and even after an hour of playtime I had barely progressed more than a couple of in-game years on constant max speed.
I've turned the game off now, because I just couldn't deal with it anymore. It was my first game ever that lasted until the end-game crisis started, but I just cba to deal with it, because the game runs so slowly that it would have taken several hours of real time to relocate my navy to the action and then fight it out. Not to mention that it's also incredibly unfun watching the battles unfold, only to have all the ships stop for 0.5 seconds every two seconds.
No matter how good the game could actually be, it's all destroyed by the limitation to using just one CPU core.
 
I don't understand why the game doesn't use cores more effectively. Basically, most of the calculations can be parallelized. It should not be a problem to process each planet or empire on a different thread. There is no gain in performance between an R5 and an R9; it is a shame!
The main problem is that the game mechanics are too interconnected. The job system, for instance, searches for a job across all the planets, so you cannot process each planet individually. I think optimization is possible and not that hard, but it means that the next 6 months would be spent on optimization (and possibly the AI, which is also just... terrible) with no big DLC (a small DLC that only adds events, for instance, could be great and would keep the content creators occupied).
 
If you didn't read it:

Large fleets compound the pop problem. If you wish for the years to pass faster, scrap your fleets. This holds true even if they are immobile in orbit of your starbases. I've added some suggestions and solutions.

 
From my systems programming knowledge, parallelising often hits limits due to lock contention. Linux, for example, has a number of per-CPU counters so that adding up the (rough) total can be lockless. The info about raw memory latency suggests that time spent accessing uncacheable locations may have become the bottleneck. The classic example is when data is stored as objects, and methods change or query the object but require a lock to access it. In theory this works wonderfully in parallel, but in practice acquiring and releasing locks becomes the bottleneck.
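
To make the per-CPU counter idea concrete, here is a minimal Python sketch contrasting one shared counter behind a lock with one slot per worker that is only summed at the end. The names and sizes are invented for illustration. Python's GIL means this shows the structure of the two approaches rather than any real speed difference, and it is not how Linux or the game engine implements anything.

```python
import threading

NUM_WORKERS = 4
INCREMENTS = 10_000

# Variant 1: a single shared counter; every increment takes the same lock.
shared_total = 0
shared_lock = threading.Lock()

def locked_worker():
    global shared_total
    for _ in range(INCREMENTS):
        with shared_lock:          # all workers contend on this one lock
            shared_total += 1

# Variant 2: one slot per worker; no lock on the hot path.
per_worker = [0] * NUM_WORKERS

def sharded_worker(slot):
    count = 0
    for _ in range(INCREMENTS):
        count += 1                 # purely local work
    per_worker[slot] = count       # single write to a slot only this worker owns

def run(target, with_slot):
    threads = [
        threading.Thread(target=target, args=((i,) if with_slot else ()))
        for i in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run(locked_worker, with_slot=False)
run(sharded_worker, with_slot=True)

# The reader adds up the slots after the workers finish; no lock is needed.
print(shared_total, sum(per_worker))
```

The second variant only works because each slot has exactly one writer, which is the same property the post says the game's interconnected mechanics make hard to achieve.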

The fundamental thing is that in Paradox games everything is a modifier on something. That leads to a whole tree of values being accessed for every "simple" value looked up, and they can be changed by empire-wide edicts or galactic resolutions.
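
A hypothetical sketch of what such a lookup might look like: each scope (pop, planet, empire, galaxy) can carry modifiers, and resolving one value means walking up the whole chain. The `Scope` class, the additive stacking rule, and the numbers are all assumptions made for illustration; the actual engine's data layout is not public in this thread.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """One node in a hypothetical modifier chain (pop -> planet -> empire -> galaxy)."""
    name: str
    parent: Optional["Scope"] = None
    modifiers: dict = field(default_factory=dict)  # key -> additive bonus, e.g. 0.10 = +10%

    def lookup(self, key, base):
        """Resolve one value by visiting every ancestor scope."""
        total_bonus = 0.0
        hops = 0
        node = self
        while node is not None:
            total_bonus += node.modifiers.get(key, 0.0)
            hops += 1              # each hop is another object fetched from memory
            node = node.parent
        return base * (1.0 + total_bonus), hops


galaxy = Scope("galaxy", modifiers={"minerals": 0.05})   # e.g. a galactic resolution
empire = Scope("empire", galaxy, {"minerals": 0.10})     # e.g. an empire-wide edict
planet = Scope("planet", empire, {"minerals": 0.20})
pop = Scope("pop", planet)

value, hops = pop.lookup("minerals", base=4.0)
print(f"{value:.2f} minerals after visiting {hops} scopes")
```

Even in this toy version a single "simple" value touches four separate objects, which is the kind of pointer-chasing access pattern that is sensitive to memory latency.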

What you're asking for likely requires a careful redesign of the data structures with dataflow processing in mind. Improving the whole approach to parallelisation is easy to say when thinking conceptually in a simplistic way, but it is actually very expensive and very difficult to implement in practice.
 
As a matter of interest, why do you need more than 60 fps in Stellaris, which is not an FPS/TPS or real-time action game? I am asking so I can understand your point of view.

If no display update is required at the end of the month, because you're zoomed out or looking at a quiet system waiting for the new resource numbers, why is slowing to 1 fps a problem? (That's what an efficient graphics driver ought to do: the screen displays an unchanging frame buffer with no cursor movements.) Are you watching rapidly moving objects? Because I am not, just the numbers at the top of the screen.

Honestly, I can't see much benefit from 3D on the galaxy map; effectively it's a 2D map, but it's a market acceptance thing: gamers expect and think they need 3D objects. Even changing the view for eye candy doesn't really turn Stellaris into a genuine 3D game, as you just select and click on objects and points, then watch the ships go in straight lines to stationary fixed orbits on flat planes. It actually irks me that the game is so inefficient graphically, wasting energy. Games with actual 3D world modelling allow a 30 fps setting for weaker systems, and I'd be very happy to use that in Stellaris, as it's simply not a rapid-reaction twitch game with movement responding to constant inputs, but a game where you queue up orders and then watch what happens. For instance, in older Total War games commands get laggy at HD resolutions, making it difficult to give orders, but that was only a gameplay problem at about 1/10th of the FPS you are seeing. I would prefer more "friction" when changing from galaxy to system view, and fighting out a more realistic, non-arcade-style battle. As it is, you can pre-order a flanking manoeuvre from the neighbouring system when you have multiple fleets engaging, though I confess I've at times had debacles through mistiming the fleet coordination.

Frankly, if it's to watch big kick-ass battles close up, I find the fleet conflicts so lacking in battle strategy compared to Total War that I often miss them or watch zoomed out, monitoring the battle summary popup. For me Stellaris is less immersive in 3D than Elite, the game for the 8-bit MOS 6502 which wowed with 3D wireframe graphics in the early '80s, giving a system where you appeared to control your ship through space (just without accurate physics; the ships were controlled more like planes, via rolls and banked turns).

This is a grand strategy game, where it's the decisions, not fine motor control and rapid reactions, that make play successful or not.
From what I've heard, simple mods in the past helped a lot with late-game lag, simply by turning pop migration into a monthly decision.
 
Honestly, anything more than 10 fps is tolerable for Stellaris. I run it on a six-year-old computer; it wasn't cheap when it was new, but it is six years out of date and wasn't originally built as a gaming PC. I am content with the performance, as it is able to remain at 20-30 fps up until the late game on a highly outdated computer. The AI being borked is a much bigger concern than performance.
 
Super high FPS isn't "needed" since I only play singleplayer. It is, however, still a valid performance metric and an indication of how the game engine/programming stacks up. I expect much better numbers from a 4-year-old game. The slowdown to 1 fps in the late game is a problem, as the entire game doesn't respond for some seconds. I always need to pause it, otherwise I'll miss the first several days of the next month. This really becomes a nightmare, as it also prevents checking and managing colonies, stations and fleets. If the UIs for those still operated during the pause, it would be at least tolerable, but it's the whole smash that locks up.

It's been discussed for months now why the performance degrades as it does, be it the engine itself, bad coding, or perhaps bad planning (not testing or thinking through the consequences of the shift to pops in 2.0). It's simply not good enough. It's turning people off from finishing their games, and if it goes on like this, it'll get to the point where people won't bother turning on the game, full stop. More fixes to AI and performance absolutely need to be the devs' top two priorities when they return from vacation.
 
I'm not sure what people are arguing about: FPS does affect simulation speed in Stellaris. More FPS means faster game ticks. Whether it should or not is a different can of worms entirely.
 
Any game that models pops and colonies and plays in real time is a dataflow/processing beast and requires careful design.

If Stellaris were turn-based, it would still have high turn-processing times, but that would be a different issue on a different scale. And your next turn's processing could be partially done while you are playing your current turn, so there's a lot of room there!

What seems to have happened is that the careful design was OK until they opened Pandora's box and exploded population counts, and now they don't have any cheap cards left in their hands, except scaling the game back.

Any system that does not abstract population into numbers is doomed to have this problem. They need a colony with 1000 pops to take the same amount of time to process as a colony with 1 pop. The idea of going through all the pops on each tick is a dead end. This game is not an FPS or RPG; Stellaris by design leads you into exponential or fast pop growth. At best they can make the game calculations static and only compute changes. This is a more or less complete engine redesign, but it would use all the dataflow bandwidth for changes alone. You would still get lag spikes when the entire galaxy or a large empire makes a nation-wide change, but your steady state would be smooth as butter regardless of pop count.
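
The two ideas in that paragraph, storing pops as counts per group and only recomputing when something changes, can be sketched in a few lines of Python. The `Colony` class, the group key of (species, job), and the output numbers are all hypothetical; this is an illustration of the technique, not a claim about how Stellaris or Victoria 2 store their data.

```python
from collections import Counter


class Colony:
    """Hypothetical colony that stores pop counts per group instead of individual pops."""

    def __init__(self):
        self.groups = Counter()    # (species, job) -> number of pops
        self._cached_output = 0.0
        self._dirty = True         # set whenever the groups change
        self.recomputes = 0        # counts how often the full sum actually ran

    def add_pops(self, species, job, count):
        self.groups[(species, job)] += count
        self._dirty = True

    def output(self, output_per_pop):
        # Cost scales with the number of groups, not the number of pops,
        # and the sum only runs again after a change.
        if self._dirty:
            self._cached_output = sum(
                count * output_per_pop[job]
                for (_species, job), count in self.groups.items()
            )
            self._dirty = False
            self.recomputes += 1
        return self._cached_output


colony = Colony()
colony.add_pops("human", "miner", 1000)
colony.add_pops("human", "farmer", 500)
per_pop = {"miner": 4.0, "farmer": 6.0}

# Thirty daily ticks with no changes: the sum is computed once and reused.
for _ in range(30):
    steady_output = colony.output(per_pop)

colony.add_pops("human", "miner", 1)   # one change triggers one recompute
changed_output = colony.output(per_pop)
print(steady_output, changed_output, colony.recomputes)
```

The trade-off is the one the post describes: steady-state ticks become cheap, while a change that touches every colony at once still forces a burst of recomputation.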

At best, barring any redesign, they can hide the deficiency by making pop growth slower and the galaxy more lethal.

In short, Victoria 2's population groups were a much better design choice.
 
Generally in Stellaris I see 60 fps locked to vsync. I tried out Doom and it decided to lock to an adaptive 1/2 vsync rate, i.e. 30 fps. However, I am not going to conclude that Doom is less efficient than Stellaris based on that. I did go in close and used the 3D view of my ships flying about in late game to see if I noticed any lag or issues... no. It was 2015 and I had won already. But I was not playing on a large galaxy with lots of empires owning big fleets and large planets, so I must have stayed below the problem threshold.

As for the comments on late-game problems, it appears to me that the whole game architecture would be flawed, as what the game actually displays is only a local part of the galaxy and the figures for one colony or empire. Conceptually, the display should be a view onto a modelled system with mostly static data that updates slowly, plus some fast-flying ships. Seriously, 1,000 planets with 1,000 pops each is ONLY a million; their desire to migrate and their choice of destination could be entirely decoupled from the real-time display, since you are not watching them make decisions in game (you could concentrate on the small number of unemployed pops, as they presumably stand in for millions rather than individuals). I do know the game is implemented in a highly configurable game engine. If they are tying the simulation of the galaxy to graphics frames, then it's very definitely a fundamental problem, but possibly one limited to the capabilities of Paradox's game engine; there's just no reason to check most of that stuff every second, never mind at 60 Hz. The idea seems so crazy to me that I cannot think how anyone could implement the game like that. It'd be like Doom or CoD calculating all the movements of every NPC even though they weren't present on the battle map.
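
One common way to decouple simulation from rendering is a fixed-timestep loop with an accumulator: frames can be as fast or slow as the hardware allows, while the simulation only advances in fixed ticks. The sketch below is a generic illustration of that pattern with made-up numbers (integer milliseconds to keep the arithmetic exact); it makes no claim about how Paradox's engine is actually structured.

```python
def run_loop(frame_times_ms, tick_length_ms):
    """Advance a simulation in fixed ticks, independent of the frame rate.

    frame_times_ms: how long each rendered frame took, in milliseconds.
    Returns (simulation ticks executed, frames rendered).
    """
    accumulator = 0
    ticks = 0
    frames = 0
    for frame_time in frame_times_ms:
        accumulator += frame_time
        # Run as many fixed-size ticks as the elapsed time allows.
        while accumulator >= tick_length_ms:
            ticks += 1             # a hypothetical simulate_one_day() would go here
            accumulator -= tick_length_ms
        frames += 1                # a hypothetical render() would go here
    return ticks, frames


# Two seconds of wall-clock time at 100 fps and at 20 fps, one tick per 100 ms.
fast = run_loop([10] * 200, tick_length_ms=100)
slow = run_loop([50] * 40, tick_length_ms=100)
print(fast, slow)
```

Both runs execute the same number of simulation ticks even though one renders five times as many frames, which is the behaviour the post is arguing for.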

I read somewhere that a mod which changed the pop migration task to run once per month was a fix pre-2.7; maybe you can find an update. There are events in the game with an MTTH, so it's plausible that the pop recalculation can simply be slowed to a sane rate.
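
To give a feel for how much work moving a check from daily to monthly removes, here is a tiny Python count. The 30-day month, the 500 planets, and the idea that the check runs once per planet are all assumptions picked for the example; the mod mentioned above may work quite differently.

```python
DAYS_PER_MONTH = 30  # assumption for this sketch


def count_migration_checks(days, planets, monthly):
    """Count how many per-planet migration checks run over a span of days."""
    checks = 0
    for day in range(1, days + 1):
        if monthly and day % DAYS_PER_MONTH != 1:
            continue               # monthly mode: only run on the first day of a month
        checks += planets
    return checks


daily_checks = count_migration_checks(360, planets=500, monthly=False)
monthly_checks = count_migration_checks(360, planets=500, monthly=True)
print(daily_checks, monthly_checks, daily_checks // monthly_checks)
```

The flip side is that all of the remaining work lands on the same day, so a monthly schedule reduces total load but can make the start-of-month hitch more noticeable unless the checks are spread out.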
 
The Ryzen mobile chips could shed some light on the effects of chiplet vs. monolithic die latency, since they use a monolithic die. Anandtech compares the latency of a Ryzen 3950X with that of a 4900H, and the difference is substantial. Comparing a Ryzen 3100 with a 3300X would also be interesting: the 3100 splits its four cores across two CCXs, while the 3300X keeps them in one.

Relevant links below.


 