The problem is that you run calculations for pops finding new jobs every single day.

Pops don't even produce except once a month! It's also silly that pops can start a job and then the next day have a month's worth of production. Seriously, you could eliminate over 96% of the calculations used by job searches by running them once a month instead of every day, and as far as I can tell, nothing would change.
Just run an ad for Indeed and watch the lag stop because Indeed is finding them jobs.
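A quick sketch of where the "over 96%" figure comes from, assuming the proposal is to run the job search once per month instead of once per day and that an in-game month has 30 days. The function names and the pop count are made up for illustration; nothing here reflects the game's actual code.

```python
DAYS_PER_MONTH = 30  # assumption: 30-day in-game months


def job_search_runs(pop_count: int, months: int, daily: bool) -> int:
    """Count job-search evaluations over a span of in-game months."""
    runs_per_month = DAYS_PER_MONTH if daily else 1
    return pop_count * months * runs_per_month


def fraction_saved(pop_count: int = 2000, months: int = 12) -> float:
    """Share of evaluations removed by switching from daily to monthly."""
    daily = job_search_runs(pop_count, months, daily=True)
    monthly = job_search_runs(pop_count, months, daily=False)
    return 1 - monthly / daily


if __name__ == "__main__":
    print(f"{fraction_saved():.1%} of job-search runs eliminated")
```

The saving is 29/30 no matter how many pops or months you plug in, which is why the percentage holds for any galaxy size.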
 
When your PC locks up for 20 minutes as it generates 2k pops, it's doing something similar: generating 2000 more rows, writing them out, and populating them serially (likely by comparing them against your existing species, governing ethics, etc. via built-in filters and checks, one at a time, before it records any specific data point for the pop in question).
I'm almost minded at this point to accept a compromise along the lines of the game "simulating" for the AI what their pops and planets produce, relative to their number of pops, species traits and game difficulty, with no AI ethic shifts or checks except in one-off instances when they take a perk like Cybernetics or Psi, if it means preserving good overall performance to finish a game in exchange.
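To make the compromise concrete, here is a minimal sketch of what such an aggregate "simulation" could look like: one multiplication per AI empire per month instead of a per-pop job and ethics pass. Every name and number below (`AiEmpire`, `trait_multiplier`, `difficulty_multiplier`, `base_per_pop`) is hypothetical and chosen only for illustration; it is not how the game models anything.

```python
from dataclasses import dataclass


@dataclass
class AiEmpire:
    pops: int                     # total pop count, no per-pop state
    trait_multiplier: float       # hypothetical: averaged species trait bonus
    difficulty_multiplier: float  # hypothetical: game difficulty bonus


def monthly_output(empire: AiEmpire, base_per_pop: float = 1.0) -> float:
    """Estimate monthly production in O(1) per empire instead of O(pops)."""
    return (empire.pops * base_per_pop
            * empire.trait_multiplier * empire.difficulty_multiplier)


if __name__ == "__main__":
    empire = AiEmpire(pops=200, trait_multiplier=1.1, difficulty_multiplier=1.5)
    print(f"estimated monthly output: {monthly_output(empire):.1f}")
```

The cost of this shortcut is exactly what the post concedes: individual pops stop mattering for the AI, so ethic shifts and job changes would have to be handled as rare one-off events.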
 
And what about multiplayer? No, this won't work unless it's done for all players.
 
I intend to.
Bad news: that computer is no longer my main computer, so my record-setting one-day-per-second at game start will have nothing I can compare it to in the late game.

Good news: my new computer is a bit beefier and, holy hell, I can do an entire month in 6 seconds now! At some point I'll be able to test this computer in the late game and will report back.
 
Has anyone had any performance improvements since the last patch?
Or does anyone have any suggestions, short of overclocking my processor, on how to complete a long game without the major performance issues?
 
Minimize the number of pops. Make the population as homogeneous as possible once the pop count does rise; synth ascension is helpful for that.
 
So basically stop playing the game. ;)
 
Since I play on 1000 stars, 5x habitable, and don't destroy habitable worlds, it does get tough. It is much better though. Between the engine improvements and a couple of in-house mods, I don't see substantial loss of performance until 6000+ pops. I got all the way to 2350 in my last game --- double what I was able to tolerate pre-2.6.
 
And now Paradox is gone for the Summer leaving this broken half alive abomination still in the dark. God I love this company.
 
Has anyone had any performance improvements since the last patch?
Or does anyone have any suggestions, short of overclocking my processor, on how to complete a long game without the major performance issues?

So for the game I'm playing this weekend I'm doing 3 things:

1. Habitability: Added "Fewer Habitable planets" (that's habitability /4), and set it to 0.125 on an 800-star galaxy. Lots of space to grab and lots of resources out there in the stars!
2. Zenith of Fallen Empires: The plan is to reach a point where you have a few planets with 75 pops / hedonists / Utopian Abundance; the planetary slots will be filled with automated buildings, so no jobs and lots of resources.
3. "Gigastructural Engineering & More": Get most resources from megastructures that don't use population, and use a few ecumenopoli and ringworlds for the rest.

This style involves killing most of the galaxy after you conquer it, by releasing sectors and wiping the pops with a colossus once you have built yourself up. So far, I've made it to the point of almost conquering the entire galaxy. It's a nice experience, because in the mid/late game you have only 4-5 colonies! And because you are playing with 800 stars, there's plenty of space for many kinds of megastructures (check mods and options to build multiples of them).
 
With Gigastructures, the birch world can hold an infinite number of pops, perfect for that sort of thing.
 
No, the approach is to use automated buildings and just unlock the building slots on as few planets as possible, so that's 75 pops per planet, plus 2 or 3 for building safety. On a birch world with a very high pop count the game still lags horribly (tested). I may depopulate ringworlds for the same reason. In fact, if there were a way to unlock building slots, I'd have just 2-3 pops per planet, but neither the game nor the mods offer such a choice that would be compatible with the AI.
 
I notice no one has actually properly tested the impact of memory in Stellaris. We can expect memory latency to increase across the board as Intel cuts costs by adopting a chiplet I/O next generation, putting them more in line with AMD in that department, so I decided to do what I could with my 10th gen CPU to figure out how much memory latency, and perhaps bandwidth, factors into Stellaris performance, and maybe get some idea of how much degradation we can expect on future Intel CPUs, if any. Of course, my CPU is a monolithic die with an integrated memory controller, so the only way I can enforce memory latency comparable to Ryzen or the upcoming Intel 11th gen is by completely destroying my memory timings.

I went ahead and started up a 1000-star game with 5x habitable worlds, low tech costs, max empires, all advanced starts, and let fast_forward run for a while. When I got tired of waiting, we had ended up almost 200 years into the game. No end-game crisis yet, but there were millions of fleet power on the board and total galaxy population was pushing just over 50k pops, which is of course insane. Safe to say that despite this only being about 184 years into the game, this represents a very intense late-game scenario that most people probably won't reach in their average game unless they make a habit of using those stupid galaxy settings I punched in.

I then did 9 test runs, spread across 3 different RAM configurations, stopwatching how long it takes a single year to pass under very controlled conditions, with everything kept the same, from background programs to where my camera is and how zoomed in it is on the galaxy map in observer mode. GPU usage was minimal, and CPU clock speed was consistently 4.15 GHz on average across all tests. Below are the details of my system, RAM configurations, and the results:

Intel Core i3-10100
NVIDIA GeForce GTX 960 4 GB
16 GB DDR4 dual channel (2x8 GB) Samsung B-die


Daily Config: 4400 18-18-18-42 420tRFC CR2

Memory latency 41.9ns
Read 57000 MB/s
Write 64000 MB/s
Copy 51000 MB/s

Tight 2933 Config: 2933 11-11-11-28 234tRFC CR2

Memory latency 42.1ns
Read 44000 MB/s
Write 44000 MB/s
Copy 39000 MB/s

Craptastic 2933 Config: 2933 28-28-28-64 770tRFC CR2

Memory latency ~63.4ns (roughly equivalent to Zen+ or Matisse)
Read 38000 MB/s
Write 42000 MB/s
Copy 32000 MB/s
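As a rough cross-check on the three configurations above, the CAS portion of the latency can be worked out from the data rate and the first timing number. This is a minimal sketch using the standard conversion (DDR memory transfers twice per clock, so the clock in MHz is half the MT/s figure); the function name is made up here, and the result covers only the CAS component, not the full round-trip latency a benchmark reports.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """CAS latency in nanoseconds for a DDR module.

    DDR transfers twice per clock, so clock (MHz) = data rate (MT/s) / 2.
    """
    clock_mhz = data_rate_mts / 2
    return cas_cycles / clock_mhz * 1000


# (data rate in MT/s, CAS cycles) taken from the three configs in the post
configs = {
    "Daily 4400 CL18": (4400, 18),
    "Tight 2933 CL11": (2933, 11),
    "Craptastic 2933 CL28": (2933, 28),
}

if __name__ == "__main__":
    for name, (rate, cl) in configs.items():
        print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

The daily and tight configs land within about 1 ns of each other on CAS alone, which fits their nearly identical measured latency, while the loose config roughly doubles it. The measured gap is larger than the CAS gap, presumably because the other timings (tRCD, tRP, tRFC) were loosened as well.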


Test results: one year on Fastest speed, starting when the game is resumed with the spacebar and ending on day 2 of the following year (to make up for missing most of the start-of-year lag in the initial year, since I saved the game after it):


Daily Config - high bandwidth/low latency

test #1 - 5 minutes 34 seconds
test #2 - 5 minutes 35 seconds
test #3 - 5 minutes 35 seconds


Tight 2933 Config - low bandwidth/low latency

test #1 - 5 minutes 35 seconds
test #2 - 5 minutes 39 seconds
test #3 - 5 minutes 36 seconds


Craptastic 2933 config - low bandwidth/high latency

test #1 6 minutes 14 seconds
test #2 6 minutes 15 seconds
test #3 6 minutes 12 seconds


Conclusion: Stellaris is sensitive to memory latency, and introducing a large amount of it through timings that are much looser than you'd ever actually run introduces substantial performance degradation. Memory bandwidth matters much less, if at all, past a certain point, as there was no significant performance difference between my daily 4400 and tight 2933 configurations in spite of a large difference in read/write speeds; DDR5 is not likely to offer any gains in Stellaris, and 10th gen Intel MIGHT be better than 11th gen Intel, although that is a big assumption that will have to wait until I actually have an 11th gen chip in my system to test.
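To put numbers on that conclusion, here is a small sketch that averages the nine stopwatch times from the post and expresses each configuration as a slowdown relative to the daily config. Only the timings come from the post; the helper name `slowdown` is made up for illustration.

```python
from statistics import mean

# Stopwatch times from the post, converted to seconds.
runs = {
    "Daily 4400": [334, 335, 335],
    "Tight 2933": [335, 339, 336],
    "Craptastic 2933": [374, 375, 372],
}


def slowdown(name: str, baseline: str = "Daily 4400") -> float:
    """Relative increase in mean time-per-year versus the baseline config."""
    return mean(runs[name]) / mean(runs[baseline]) - 1


if __name__ == "__main__":
    for name, times in runs.items():
        print(f"{name}: mean {mean(times):.1f} s, {slowdown(name):+.1%} vs daily")
```

The tight 2933 config comes out about 0.6% slower than daily, which is within stopwatch noise, while the loose config is roughly 11.7% slower. With only three runs per config this is a rough indication, not a statistically rigorous result.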

However, this ultimately may not be a good indicator of the Ryzen performance difference vs Skylake. AMD has designed Zen with higher memory latency in mind, and it copes with this limitation with features like a larger cache subsystem, a wider reorder buffer, and a massively larger uOP cache than Skylake's, among other things. Worse memory latency still hurts, but I suspect it hurts a lot less for AMD than it does for Intel (which makes it puzzling that Intel's upcoming 11th gen includes a chiplet I/O but favors Sunny Cove's relatively anemic L2/L3 cache configuration compared to Zen 3, when they should have gone with Willow Cove's. They either managed not to incur a large latency penalty with their desktop EMIB/Foveros chiplet implementation, or they simply don't care if AMD destroys them in games with Zen 3.)

Anyway, if anyone wants to run their own tests on this save, I have included it in the attached file. It is vanilla, and only Utopia and Horizon Signal are needed to run the save, but I conducted my tests with all DLC active... I would be interested to see some results with something like a Ryzen 3 3300X.
 

Attachments

  • godwhy.sav
    7,9 MB · Views: 0
My major performance complaint isn't the overall performance, but the specific performance when I open a colony view window.

Somewhere between 2375 and 2425, when I open one of my colony views (AI colony views do not have this problem) the game starts taking a second or more to respond to mouse clicks. Is this normal, or are there settings I should change to get it back to behaving like a sensible UI?
This is exactly the problem I run into, specifically when paused. It seems like there shouldn't be any massive additional processing going on in the colony screen while paused compared to when the game is running, nor in the colony screen vs. everywhere else in the game while it is paused. Ship design, fleet management, diplomacy, and planning fleet movements work just fine while paused; it is only the colony screen that refuses to work at speed.
 
@Riince

The memory testing is somewhat flawed. You can't check the impact, given that the game is CPU/single-core bound on processing pops.

Only once that bottleneck is resolved will you be able to measure memory characteristics with enough sensitivity to produce a proper result.

If they optimize pop/modifier calculations further across many cores, things will speed up almost multiplicatively, as using more cores would also mean using more cache. Testing the same data set would even require fewer memory operations, just because of that. So it's like 1+1 = 4.

Multichannel capability also plays a role, but it's even more difficult to test such things.
 
So to improve performance, fewer pops is the key?

So to complete a game: fewer colonized planets, fewer pops, a smaller galaxy, and maybe I can finish a game to the end without extreme lag?
In my current game the days almost seem like they are ticking normally, but the fleets and movement lag/glitch; it's like they made the days look normal even though the whole thing is skipping.

It also sounds like one of the major issues is that the AI spams stupid orbital habitats, and fills those with pops?
Anyone know of any decent Mods that help improve performance by disabling Pop Ethos / Orbital Habitats?

Is it specifically the Ethos / Pop traits that cause the issue or is it basically everything to do with Pops in general, Jobs, food, happiness?
Maybe they need to streamline pops to not be such individuals but general based on the planet.
 
@Riince

The memory testing is somewhat flawed. You can't check the impact, given that the game is CPU/single-core bound on processing pops.

Only once that bottleneck is resolved will you be able to measure memory characteristics with enough sensitivity to produce a proper result.

If they optimize pop/modifier calculations further across many cores, things will speed up almost multiplicatively, as using more cores would also mean using more cache. Testing the same data set would even require fewer memory operations, just because of that. So it's like 1+1 = 4.

Multichannel capability also plays a role, but it's even more difficult to test such things.

I disagree. With something like Stellaris, once the cache is full of looping instructions due to the sheer scale of the game's simulation, every single cache miss requires instructions that were previously dumped into memory to be fetched again for a later instruction, maybe only a few hundred or a few thousand cycles later. This is especially prevalent in a game like Stellaris, where so many things depend on each other. Every time this happens, the game simply has to wait x nanoseconds (in my case, 41) before any work can be performed on the main game thread; the game is essentially stopped during this time. It doesn't seem like long to wait, but as my results show it adds up, and it feels like a long time to a processor that goes through the stages in its instruction pipeline on the order of 500 to 1000 picoseconds. My point is this: regardless of how many cores the game effectively uses, assuming a normalized cache hit rate, memory latency will not improve. It may even degrade... Of course, this is more than offset by the gains of parallel processing; if the game could fully utilize 20 threads it would run like an absolute dream. Even though it doesn't, I would venture to guess the i9-10900K would still outperform my CPU measurably just by virtue of having over 3x as much LLC, even if clocked down to the same speed as my i3 and run with the same memory. The small latency penalty introduced by the substantially larger ring bus may offset this benefit a bit, but I still expect 20 MiB of LLC to represent a net gain in spite of it.
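A back-of-the-envelope sketch of the "it adds up" argument, using the figures from the test post (41.9 ns at about 4.15 GHz, and the tight vs. loose 2933 results). It assumes a deliberately crude model in which every miss to DRAM stalls the main thread for the full latency with no overlap or prefetching, so treat the output as an order-of-magnitude estimate only; both function names are made up for illustration.

```python
def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Core clock cycles that elapse during one DRAM round trip."""
    return latency_ns * clock_ghz  # ns * (cycles per ns)


def implied_serial_misses(extra_seconds: float, extra_latency_ns: float) -> float:
    """Misses needed to explain a slowdown if each one stalls for the full latency."""
    return extra_seconds / (extra_latency_ns * 1e-9)


# Daily config: 41.9 ns measured latency at roughly 4.15 GHz.
cycles = stall_cycles(41.9, 4.15)

# Tight vs. loose 2933: mean time per game year rose from ~336.67 s to ~373.67 s
# while measured latency rose from 42.1 ns to 63.4 ns.
misses = implied_serial_misses(373.67 - 336.67, 63.4 - 42.1)

if __name__ == "__main__":
    print(f"~{cycles:.0f} cycles lost per miss")
    print(f"~{misses:.2e} fully exposed misses per game year under this model")
```

Under these assumptions each trip to memory costs on the order of 170 clock cycles, and the slowdown in the loose config corresponds to well over a billion exposed misses per in-game year.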

However, there is a situation where what you say isn't entirely wrong, and that would be on a Ryzen CPU, but not because of the DRAM. Because a dirty cache line being evicted and read into the other core has to go through the IF and be dumped into system memory first, trying to access cache on a different core complex is actually slower than system memory. AMD knows this, of course, so those 16 MB slices are exclusive to that particular complex of 4 cores and are not shared with cores on a different complex; you should see a performance uplift and far fewer cache misses if you could evenly split the game's threads across different CCXs... This runs contrary to Intel's Smart Cache, where any one core can utilize the L3 slice of any and all other cores present on the CPU. If you applied the same parallelization to an Intel CPU, you would only see the uplift of multi-threading, as well as more hits from each core's individual L1 D/I-cache and L2, which are exclusive... but also significantly smaller. I believe this plays no small part in why AMD sees so much higher numbers than Intel in multithreaded workloads, a thing that seems paradoxical at first when you see AMD has lower single-core performance. The architectures scale very differently.

I got off on a tangent... my bad. I'd still be interested in 3300X/3600/3700X etc. performance results with my save... I believe the 3300X is the best-performing CPU AMD has for Stellaris, strictly speaking, because core-to-core latency is a non-issue: it is the same as Intel's, around 20ns for every core on the CPU. Others might see a tiny loss in performance due to inter-CCX latency, which can be as high as 68-70ns, even though intra-CCX core-to-core latency is the same on anything higher than a 3300X; Stellaris isn't going to consider physical CPU topology when it is assigning a logical processor priority to its many operating threads. At least... I don't think it will? I could be wrong. I'm also not considering the possible benefits of secondary game threads being able to operate in a different slice of LLC and what benefit that might have for the primary game thread. Maybe I am wrong... the 3950X could easily be the best mainstream AMD CPU for Stellaris. If anyone reading this happens to own that monster of a CPU, please give it a try.
 