I think your GPU could also be a bottleneck - I used a GTX 980 in my old PC and often saw 100% utilization. Of course this depends heavily on your graphics settings, especially the resolution, but a 960 is considerably weaker than a 980 and has much less VRAM.
Additionally, you mentioned the cache problems, which may indeed be interesting to look into. I think it would be quite interesting to benchmark Stellaris with a 3900XT and a 10900K (each at max overclock, and also at the same clock speed to show the difference) together with a 2080 Ti and different kinds of RAM settings. Then you could see whether the latency etc. is that much of a problem. Sadly these kinds of tests are rarely done for games like Stellaris, but I have seen benchmarks for many games which basically say that RAM beyond 3600MHz CL16 does not matter (±1% performance).
I am fairly optimistic, though, that we will see quite an improvement when DDR5 RAM is finally released and Intel and AMD are at 7nm and 5nm respectively.
 
I think your GPU could also be a bottleneck - I used a GTX 980 in my old PC and often saw 100% utilization. Of course this depends heavily on your graphics settings, especially the resolution, but a 960 is considerably weaker than a 980 and has much less VRAM.
Additionally, you mentioned the cache problems, which may indeed be interesting to look into. I think it would be quite interesting to benchmark Stellaris with a 3900XT and a 10900K (each at max overclock, and also at the same clock speed to show the difference) together with a 2080 Ti and different kinds of RAM settings. Then you could see whether the latency etc. is that much of a problem. Sadly these kinds of tests are rarely done for games like Stellaris, but I have seen benchmarks for many games which basically say that RAM beyond 3600MHz CL16 does not matter (±1% performance).
I am fairly optimistic, though, that we will see quite an improvement when DDR5 RAM is finally released and Intel and AMD are at 7nm and 5nm respectively.

I eliminated this possibility with low resolution and low settings. My GPU ran at around 20-30% utilization at most throughout the test, usually much lower.

You aren't going to find Stellaris performance benchmarks anywhere. Even if you do, they will be testing FPS, which is not a useful metric for how this game runs. I have not seen any mainstream techtuber or hardware review site use the rate of months in a late-game save on Fastest speed, or really anything but FPS in any game, minus the occasional inclusion of Civ turn times. This metric is difficult to control and only tangentially related to FPS (if your FPS is too low, it potentially CAN limit your ticks), but that is also one of the reasons I'm so interested in testing it. Hopefully some standard gets set for testing Stellaris performance at some point, but since old saves are constantly invalidated by new patches, that's not very likely. PDX would have to make a specific game scenario for benchmarking that you can fire up from the menu regardless of game version, one that never changes.
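
To make the "rate of months" metric concrete, here is a minimal, purely illustrative helper (plain C++; the numbers are made-up examples - you would plug in your own stopwatch reading and observed in-game months):

// rate_of_months.cpp - toy calculator for the "rate of months" metric.
// The inputs below are made-up examples, not measurements.
#include <iostream>

int main() {
    const double ingame_months = 12.0;  // e.g. one observed in-game year
    const double wall_seconds = 335.0;  // e.g. 5 minutes 35 seconds
    const double months_per_minute = ingame_months / (wall_seconds / 60.0);
    std::cout << months_per_minute << " in-game months per real minute\n";
}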

Again, I see no reason for DDR5 to offer a performance uplift; memory access times do not really improve with new generations. More often they even take a slight hit. It will have a lot more bandwidth for sure, but that won't help Stellaris, at least not over high-clocked DDR4. We will see gains from future CPUs, but they will come from architecture improvements, not DDR5. Granted, the more you improve CPU throughput, the more memory bandwidth you need to feed the cores, so DDR4 would probably encounter bandwidth issues even in a single-core Stellaris if you could somehow run it on DDR4 with a CPU released 10 years from now that was built for DDR6... but that's neither here nor there, really. Basically, DDR4 is fast enough for Stellaris, to the point where a 50% IPC gain over Skylake on the upcoming Alder Lake Golden Cove cores will offer at most 50% more Stellaris performance. Nothing additional will be found from the newer DRAM it ships with support for, I'm afraid.
 
@Riince2 You may be right; sure, it depends on cache layout/architecture, but at this stage we're so low-level that it doesn't make much sense thinking about it, let alone writing code around it as a developer. You'd code for the most general case, and I bet PDX coders don't even think about these things - not enough time or energy to worry about it!

Another thing we forget about multicore is that most chips are memory-starved. Sure, you write your software to make use of 4-8 threads, but can the hardware pump all of that data in and out? Most probably not. Hence why multithreading is recommended for low-memory, computationally heavy algorithms that also parallelize naturally. This thing the devs mention about one modifier affecting everything - it's evil. They need to deal with modifiers in a structured and hierarchical way, where the ones affecting everything are resolved and held at the beginning and shared as constants to the cores, as in the sketch below.
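
A minimal sketch of what I mean, with entirely hypothetical types (this is obviously not Paradox's actual code): resolve the empire-wide modifiers once per tick, then let every pop read the cached result as a constant.

#include <iostream>
#include <vector>

// Hypothetical stand-in for the game's modifier stack.
struct ModifierSet { double output_mult = 1.0; };

// Expensive in the real game: would walk policies, traditions, events, etc.
ModifierSet resolve_empire_modifiers() { return ModifierSet{1.25}; }

int main() {
    const int planets = 200;
    const int pops_per_planet = 100;

    // Resolved ONCE per empire per tick, not re-derived for every pop.
    const ModifierSet empire = resolve_empire_modifiers();

    double total_output = 0.0;
    for (int p = 0; p < planets; ++p) {
        // Planet-level modifiers stack on top of the cached empire result.
        ModifierSet planet = empire;
        planet.output_mult *= 1.05; // e.g. a planet designation bonus
        for (int i = 0; i < pops_per_planet; ++i)
            total_output += 1.0 * planet.output_mult; // pops read constants only
    }
    std::cout << "total output: " << total_output << "\n";
}

Because the resolved sets are immutable during the pop pass, they could also be handed to worker threads without any locking.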

So, not to go off on a tangent here as well, it's evident that the problem of calculating pops MUST be simplified. It's impossible to solve it for commodity and heterogeneous motherboards, RAM, and CPUs. Abandon all hope and run to the mountains - and we haven't even spoken about console hardware here!
 
I think your GPU could also be a bottleneck - I used a GTX 980 in my old PC and often saw 100% utilization. Of course this depends heavily on your graphics settings, especially the resolution, but a 960 is considerably weaker than a 980 and has much less VRAM.
Additionally, you mentioned the cache problems, which may indeed be interesting to look into. I think it would be quite interesting to benchmark Stellaris with a 3900XT and a 10900K (each at max overclock, and also at the same clock speed to show the difference) together with a 2080 Ti and different kinds of RAM settings. Then you could see whether the latency etc. is that much of a problem. Sadly these kinds of tests are rarely done for games like Stellaris, but I have seen benchmarks for many games which basically say that RAM beyond 3600MHz CL16 does not matter (±1% performance).
I am fairly optimistic, though, that we will see quite an improvement when DDR5 RAM is finally released and Intel and AMD are at 7nm and 5nm respectively.

I use a 980M and have no problems with framerate whatsoever, even at 1000 stars. The issue is always many pops and the passage/calculation of days.

I also can't wait for DDR5, because historically we're always constrained by memory throughput, and I would like to see some tests with the new Ryzen chips!
 
Conclusion: Stellaris is sensitive to memory latency, and introducing a large amount of it through timings much looser than you'd ever actually run causes substantial performance degradation. Memory bandwidth matters much less, if at all, past a certain point, as there was no significant performance difference between my daily 4400 and tight 2933 configurations in spite of a large difference in read/write speeds. DDR5 is not likely to offer any gains in Stellaris, and 10th gen Intel MIGHT be better than 11th gen Intel, although that is a big assumption that will have to wait until I actually have an 11th gen chip in my system to test.

Many thanks for doing this. If you'd asked me to guess whether Stellaris is more likely to be latency limited than bandwidth limited, I would have guessed latency limited, but it's good to know for sure.

Side note: it's possible that a particularly large L3 cache will be highly beneficial for the current Stellaris. This is just a guess, though, and unfortunately not so easy to test.


However, this ultimately may not be a good indicator of the Ryzen vs Skylake performance difference. AMD has designed Zen with higher memory latency in mind, and it copes with this limitation through features like a larger cache subsystem, a wider reorder buffer, and a massively larger uOP cache than Skylake's, among other things. Worse memory latency still hurts, but I suspect it hurts a lot less for AMD than it does for Intel (which makes Intel's upcoming 11th gen's pairing of a chiplet I/O with Sunny Cove's relatively anemic L2/L3 cache configuration - anemic compared to Zen 3's, when they should have gone with Willow Cove's - puzzling. Either they managed not to incur a large latency penalty with their desktop EMIB/Foveros chiplet implementation, or they simply don't care if AMD destroys them in games with Zen 3.)

If you wanted an AMD system specifically for Stellaris, then the almost-released "desktop Renoir" processors might work out better than similarly priced alternatives. They're basically a CPU and GPU (and I/O) combined on one chip, and since Stellaris isn't particularly GPU intensive, this should be good enough. They go in the standard AM4 desktop socket, and since the memory controller is on the same die as the CPU, the memory latency is lower than with chiplets. Makes for nice, simple, low-power, relatively cheap desktop systems with a lot of brute CPU power. Plenty of upgrade paths too, e.g. adding a dedicated graphics card later if necessary.


PS Agree with what others have said: If Stellaris was properly multi-threaded it would almost certainly become more memory bandwidth limited than memory latency limited. But that's not something we need to worry about right now.
 
Many thanks for doing this. If you'd asked me to guess whether Stellaris is more likely to be latency limited than bandwidth limited, I would have guessed latency limited, but it's good to know for sure.

Side note: it's possible that a particularly large L3 cache will be highly beneficial for the current Stellaris. This is just a guess, though, and unfortunately not so easy to test.




If you wanted an AMD system specifically for Stellaris, then the almost-released "desktop Renoir" processors might work out better than similarly priced alternatives. They're basically a CPU and GPU (and I/O) combined on one chip, and since Stellaris isn't particularly GPU intensive, this should be good enough. They go in the standard AM4 desktop socket, and since the memory controller is on the same die as the CPU, the memory latency is lower than with chiplets. Makes for nice, simple, low-power, relatively cheap desktop systems with a lot of brute CPU power. Plenty of upgrade paths too, e.g. adding a dedicated graphics card later if necessary.


PS Agree with what others have said: If Stellaris was properly multi-threaded it would almost certainly become more memory bandwidth limited than memory latency limited. But that's not something we need to worry about right now.

I actually did see the Renoir performance, but 49ns latency on one of the most insanely tuned kits of RAM I've ever seen wasn't especially compelling. Comet Lake can see latencies as low as 29-32ns with that kit, with a tweak of the ring bus. Still, it's a pretty substantial gain over chiplet.
 
Many thanks for doing this. If you'd asked me to guess whether Stellaris is more likely to be latency limited than bandwidth limited, I would have guessed latency limited, but it's good to know for sure.

Side note: it's possible that a particularly large L3 cache will be highly beneficial for the current Stellaris. This is just a guess, though, and unfortunately not so easy to test.




If you wanted an AMD system specifically for Stellaris, then the almost-released "desktop Renoir" processors might work out better than similarly priced alternatives. They're basically a CPU and GPU (and I/O) combined on one chip, and since Stellaris isn't particularly GPU intensive, this should be good enough. They go in the standard AM4 desktop socket, and since the memory controller is on the same die as the CPU, the memory latency is lower than with chiplets. Makes for nice, simple, low-power, relatively cheap desktop systems with a lot of brute CPU power. Plenty of upgrade paths too, e.g. adding a dedicated graphics card later if necessary.


PS Agree with what others have said: If Stellaris was properly multi-threaded it would almost certainly become more memory bandwidth limited than memory latency limited. But that's not something we need to worry about right now.
I've got the 4900HS with a 2060 GPU and the game runs great on it.
 
Side note: it's possible that a particularly large L3 cache will be highly beneficial for the current Stellaris. This is just a guess, though, and unfortunately not so easy to test.

It's not too hard, actually; all I would really have to do is get my hands on a 10900K, disable all but 4 cores, and lock the frequency to the same as my i3's. L3 is shared across all cores on Intel, even when cores are disabled. 20MB vs 6MB would be immediately obvious if the amount of it mattered, but I don't have a 10900K and don't really plan to get one... and Rocket Lake with 16MB isn't going to be useful for a comparison to my current CPU because it's a different architecture...

But if someone here does have a 10900K, or even a 10700K/10600K, they could disable all but 4 cores in the BIOS, lock their frequency to 4.1GHz (or even 3.6GHz) either via the multiplier or the max CPU state in Windows power options, give the save a spin timing it the same way I outlined, then tell me what settings their memory ran at by checking CPU-Z. I could match their test methodology while keeping almost everything equal except for L3, and we could get an answer to this question pretty quickly. For what it's worth, I also suspect the amount of looping instructions that can be kept in cache has a significant effect, and may be a primary determining factor of just how late into the game you can make it before performance really starts to get noticeably worse (i.e. how long before the cache becomes completely full of stuff that needs to be accessed many times every single tick).

I would also say to keep your ring ratio/cache ratio at 3300MHz to match me, but eh, our ring buses are entirely different physical sizes (4/6 cores for my SKU vs 10 for all 10th gen 'K' SKUs), so I think it's a pointless variable to try to control, and it probably won't have much impact on the outcome. Just knowing how much memory latency you have from the AIDA64 benchmark would be enough. I could try to compensate for the different ring bus size with slightly looser timings, but even that isn't really necessary: if having much more L3 isn't outweighing the small penalty of having a larger ring bus, it's just not a very big factor in performance anyway.
 
Thank you for your work, Paradox. I haven't played Stellaris in quite a long time, and this is the first time I was able to reach 2450 with... annoying lag, but not game-breaking. Although soon it will get there.

You have come a long way, but in the end the inevitable reaper of performance is your population system. I've had to remove habitats (we all know what the AI does with them), and ultimately that simply postpones the inevitable.

I run a 1000 star galaxy with 0.25x habitable worlds, 0.5x primitives, and 18 nations. I've always been a fan of getting as close to 20 players as possible; it adds depth.

And so here is a suggestion, one that most likely somebody already came up with, and that perhaps wouldn't require a massive overhaul of the system.
Given that the check on every single pop IS what eventually causes the problem, it isn't what each pop does so much as the number of pops:

Limit growth of pops by -X%

Increase the base production of each pop by +X%

Only have pop unemployment be checked every month; it doesn't have to be every day (see the sketch after this list).

Resource output would remain the same, while the number of pops is cut.
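
Purely to illustrate the monthly-check item above (a hypothetical tick loop, not the game's actual structure), the amortization would look something like this in C++:

#include <cstdio>
#include <vector>

struct Planet { int unemployed = 0; };

// Expensive in the real game: reassigns pops to jobs, handles demotions, etc.
void rebalance_jobs(Planet& planet) { planet.unemployed = 0; }

int main() {
    std::vector<Planet> planets(500);
    for (int day = 1; day <= 360; ++day) {
        // ...cheap daily work (production, fleet movement) would go here...
        if (day % 30 == 0) { // amortized: the expensive check runs 12x a year
            for (Planet& p : planets) rebalance_jobs(p);
            std::printf("rebalanced jobs on day %d\n", day);
        }
    }
}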


Thank you again for everything and have a wonderful weekend!
 
I didn't know about that. Sadly, it would simply bring the performance reaper sooner than it already arrives. And I can't go any lower on the habitable worlds slider, lol...
or can I?

It actually puts a pretty hard cap on habitats. In a 1000 star galaxy, don't expect to see more than around 160 total before no country can make any more. There are some specific situations that could push it slightly higher, but it does hit a limit, usually long before 160. On smaller galaxies that limit will be much lower...
 
I want to start by saying I'm not a programmer, but I do have an IT background. I was reading this morning about CUDA on NVIDIA GPUs and started wondering if the pop calculations that seem to slow everyone down could be run on the CUDA cores for a speed-up. Another thing I wondered is whether the galaxy could be divided into a number of "sectors" equal to the number of threads available on the CPU, with each sector assigned to a core to be calculated (I'm not using the word sector in the same way the game uses it). Also, I created an AI game and ran it until 50 years from the end; does anyone know how to automate running the rest of that game to the end and logging the amount of time it takes, which would create a sort of benchmark?
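
To illustrate the sector-per-thread idea, here is a minimal sketch in plain C++ (not CUDA) with hypothetical data. The hard part in practice, as others note in this thread, is that real pop calculations share state such as markets and modifiers, which this toy example deliberately avoids:

#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

struct Pop { double output = 1.0; };

int main() {
    std::vector<Pop> pops(50000); // arbitrary large pop count for the example

    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 4;                   // fallback if the count is unknown
    std::vector<double> partial(n, 0.0); // one result slot per "sector"

    std::vector<std::thread> workers;
    const std::size_t chunk = pops.size() / n;
    for (unsigned t = 0; t < n; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = (t + 1 == n) ? pops.size() : begin + chunk;
        workers.emplace_back([&pops, &partial, t, begin, end] {
            double sum = 0.0;
            for (std::size_t i = begin; i < end; ++i) sum += pops[i].output;
            partial[t] = sum; // each thread writes only its own slot
        });
    }
    for (std::thread& w : workers) w.join();

    std::cout << std::accumulate(partial.begin(), partial.end(), 0.0)
              << " total output\n";
}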
 
I notice no one has actually properly tested the impact of memory in Stellaris. In the wake of what we can expect to be increases in memory latency across the board, with Intel cutting costs by adopting a chiplet I/O next generation and putting them more in line with AMD in that department, I decided to do what I could with my 10th gen CPU to figure out how much memory latency, and perhaps bandwidth, factor into Stellaris performance, and maybe get some idea of how much degradation we can expect on future Intel CPUs, if any. Of course, my CPU is a monolithic die with an integrated memory controller, so the only way I can enforce memory latency comparable to Ryzen or the upcoming Intel 11th gen is by completely destroying my memory timings.

I went ahead and started a 1000 star game with 5x habitable worlds, low tech costs, max empires, and all advanced starts, and let fast_forward run for a while. When I got tired of waiting, we had ended up almost 200 years into the game. No endgame crisis yet, but there were millions of fleet power on the board, and the total galaxy population was pushing just over 50k pops, which is of course insane. Safe to say that despite this being only 184 years in, it represents a very intense late-game scenario that most people probably won't reach in their average game unless they make a habit of using the stupid galaxy settings I punched in.

I then did 9 test runs spread across 3 different RAM configurations, stopwatching how long it takes a single year to pass under very controlled conditions, with everything kept the same, from background programs to where my camera sits and how far it is zoomed in on the galaxy map in observer mode. GPU usage was minimal, and CPU clock speed was consistently 4.15GHz on average across all tests. Below are the details on my system, RAM configurations, and the results:

Intel Core i3-10100
NVIDIA GeForce GTX 960 4GB
16GB DDR4 dual channel (2x8GB) Samsung B-die


Daily Config: 4400 18-18-18-42 420tRFC CR2

Memory latency 41.9ns
Read 57000 MB/s
Write 64000 MB/s
Copy 51000 MB/s

Tight 2933 Config: 2933 11-11-11-28 234tRFC CR2

Memory latency 42.1ns
Read 44000 MB/s
Write 44000 MB/s
Copy 39000 MB/s

Craptastic 2933 Config: 2933 28-28-28-64 770tRFC CR2

Memory latency ~63.4ns (roughly equivalent to Zen+ or Matisse)
Read 38000 MB/s
Write 42000 MB/s
Copy 32000 MB/s


Test results, one year on Fastest, starting on resuming the game with the spacebar and ending on Day 2 of the following year (to account for missing most of the start-of-year lag for the initial year, since I saved the game just after it):


Daily Config - high bandwidth/low latency

test #1 - 5 minutes 34 seconds
test #2 - 5 minutes 35 seconds
test #3 - 5 minutes 35 seconds


Tight 2933 Config - low bandwidth/low latency

test #1 - 5 minutes 35 seconds
test #2 - 5 minutes 39 seconds
test #3 - 5 minutes 36 seconds


Craptastic 2933 config - low bandwidth/high latency

test #1 - 6 minutes 14 seconds
test #2 - 6 minutes 15 seconds
test #3 - 6 minutes 12 seconds


Conclusion: Stellaris is sensitive to memory latency, and introducing a large amount of it through timings much looser than you'd ever actually run causes substantial performance degradation. Memory bandwidth matters much less, if at all, past a certain point, as there was no significant performance difference between my daily 4400 and tight 2933 configurations in spite of a large difference in read/write speeds. DDR5 is not likely to offer any gains in Stellaris, and 10th gen Intel MIGHT be better than 11th gen Intel, although that is a big assumption that will have to wait until I actually have an 11th gen chip in my system to test.

However, this ultimately may not be a good indicator of the Ryzen vs Skylake performance difference. AMD has designed Zen with higher memory latency in mind, and it copes with this limitation through features like a larger cache subsystem, a wider reorder buffer, and a massively larger uOP cache than Skylake's, among other things. Worse memory latency still hurts, but I suspect it hurts a lot less for AMD than it does for Intel (which makes Intel's upcoming 11th gen's pairing of a chiplet I/O with Sunny Cove's relatively anemic L2/L3 cache configuration - anemic compared to Zen 3's, when they should have gone with Willow Cove's - puzzling. Either they managed not to incur a large latency penalty with their desktop EMIB/Foveros chiplet implementation, or they simply don't care if AMD destroys them in games with Zen 3.)

Anyway, if anyone wants to run their own tests on this save, I have included it in the attached file. It is vanilla; only Utopia and Horizon Signal are needed to run the save, but I conducted my tests with all DLC active... I would be interested to see some results from something like a Ryzen 3 3300X.

I'm not so sure that slowing your memory down like that makes things more like Ryzen, since Zen+ has a way larger cache than Comet Lake. I have a 2700X that I'm trying to find the best way to benchmark. If I get some results I'll let you know what I find.
 
So the guy named @AndrewT did an amazing job talking with me about my problem with the game. Big thumbs up for him.
Unfortunately the problem was not resolved, and he asked me to join this megathread with all my findings.
Below are all my messages relevant to the problem from the conversation with him, and the files I attached.

The game is slow in fastest mode. It takes about 7-9s to finish one month. It's laggy too; I have 30-40 unstable fps.
The game started doing that after the 2.6 update. In 2.5.1 everything works perfectly fine.
My PC is more than enough to handle this game:
RTX 2070, 32 gigs of RAM, 8th gen i7, SSD. My system is up to date, and most of my drivers too.

I bought the Federations DLC and wanted to come back to playing Stellaris, but it's unplayable. With the need to roll back to 2.5.1, I can't install any mods (because only a handful keep their old versions available), and I can't play the DLC I bought.

I let the game play for a little bit, and sometimes it works normally (kind of normally - only 65 fps, and still a little laggy) for a couple of seconds when nothing is moving and the camera is not moving either. But that was in a game with everything set to minimum, even AI.

Changing the graphical settings doesn't change anything.

If I revert to 2.5.1, the game runs perfectly fine. It's smooth and runs fast at full speed (Speed 5).

I downloaded CK2 and it works even better than Stellaris 2.5.1.

It does affect 200 star galaxies, even with 0 AI.

With time stopped, while not moving around the map, I get 240 fps, but it's a bit laggy.
If I move around the map it drops to 160 and it's very laggy.
On normal speed it drops further to 110-140, still laggy.
On fast it's 80-100 fps, laggy as hell.
And on fastest it's 40-60.
Moving fleets or opening UI like the planet view decreases fps further. If I move my fleet on fastest it sometimes drops to 20 fps.
It was all tested on a Tiny (200 stars) map, with 0 AI and every option in game creation set to minimum.

Fps on slower speeds is good, but it's laggy. When I move around the map it's snappy. When I move my fleets they are lagging too. Opening UIs sometimes takes some time (like 2-3s). Overall, playing the game feels like playing on an old PC. I would understand that if the game worked like this on every version, but as I remember, 2.7 was the Optimization Update, and there it behaves the worst of all.

About slow time passing on normal and fast speed: I don't really know if it's slower. I never played on anything slower than the fastest speed.

Load times are normal. It takes about 20-30s for the game to start (like in the past), so it's all normal here.

My game and Steam are on the same SSD. My Documents folder is on a different SSD, but as far as I know it should not matter, and it does not on 2.5.1.

I freed up 13.4 GB of space on C:, but it didn't matter. The game runs the same.

Logging out of Paradox in the launcher didn't change a thing.

The game is unplayable for me in this state. I don't know how it would behave at mid or end game when the early game runs like a potato. I don't have that much time to test it.

Below are 2 PNG files showing around 60s of CPU and GPU usage while the game was running on fastest speed.
I tested it for around 1 in-game year and it was the same the whole time.

I'm not OCing. My CPU has its clock locked, so it's impossible. My RTX doesn't need an OC.


Thank you for all your help, AndrewT :)
Tagging @Guraan because that's what AndrewT asked me to do (idk if that's how you tag here).
 

Attachments

  • CPU.png (26.7 KB)
  • GPU.png (6.5 KB)
  • setup.log (891.3 KB)
  • system.log (6.4 KB)
  • error.log (267 bytes)
  • game.log (22.6 KB)
  • DxDiag.txt (83.6 KB)
  • pdx_settings.txt (303 bytes)
  • settings.txt (1.4 KB)
So the guy named @AndrewT did an amazing job talking with me about my problem with the game. Big thumbs up for him.
Unfortunately the problem was not resolved, and he asked me to join this megathread with all my findings.
Below are all my messages relevant to the problem from the conversation with him, and the files I attached.

The game is slow in fastest mode. It takes about 7-9s to finish one month. It's laggy too; I have 30-40 unstable fps.
The game started doing that after the 2.6 update. In 2.5.1 everything works perfectly fine.
My PC is more than enough to handle this game:
RTX 2070, 32 gigs of RAM, 8th gen i7, SSD. My system is up to date, and most of my drivers too.

I bought the Federations DLC and wanted to come back to playing Stellaris, but it's unplayable. With the need to roll back to 2.5.1, I can't install any mods (because only a handful keep their old versions available), and I can't play the DLC I bought.

I let the game play for a little bit, and sometimes it works normally (kind of normally - only 65 fps, and still a little laggy) for a couple of seconds when nothing is moving and the camera is not moving either. But that was in a game with everything set to minimum, even AI.

Changing the graphical settings doesn't change anything.

If I revert to 2.5.1, the game runs perfectly fine. It's smooth and runs fast at full speed (Speed 5).

I downloaded CK2 and it works even better than Stellaris 2.5.1.

It does affect 200 star galaxies, even with 0 AI.

With time stopped, while not moving around the map, I get 240 fps, but it's a bit laggy.
If I move around the map it drops to 160 and it's very laggy.
On normal speed it drops further to 110-140, still laggy.
On fast it's 80-100 fps, laggy as hell.
And on fastest it's 40-60.
Moving fleets or opening UI like the planet view decreases fps further. If I move my fleet on fastest it sometimes drops to 20 fps.
It was all tested on a Tiny (200 stars) map, with 0 AI and every option in game creation set to minimum.

Fps on slower speeds is good, but it's laggy. When I move around the map it's snappy. When I move my fleets they are lagging too. Opening UIs sometimes takes some time (like 2-3s). Overall, playing the game feels like playing on an old PC. I would understand that if the game worked like this on every version, but as I remember, 2.7 was the Optimization Update, and there it behaves the worst of all.

About slow time passing on normal and fast speed: I don't really know if it's slower. I never played on anything slower than the fastest speed.

Load times are normal. It takes about 20-30s for the game to start (like in the past), so it's all normal here.

My game and Steam are on the same SSD. My Documents folder is on a different SSD, but as far as I know it should not matter, and it does not on 2.5.1.

I freed up 13.4 GB of space on C:, but it didn't matter. The game runs the same.

Logging out of Paradox in the launcher didn't change a thing.

The game is unplayable for me in this state. I don't know how it would behave at mid or end game when the early game runs like a potato. I don't have that much time to test it.

Below are 2 PNG files showing around 60s of CPU and GPU usage while the game was running on fastest speed.
I tested it for around 1 in-game year and it was the same the whole time.

I'm not OCing. My CPU has its clock locked, so it's impossible. My RTX doesn't need an OC.


Thank you for all your help, AndrewT :)
Tagging @Guraan because that's what AndrewT asked me to do (idk if that's how you tag here).

I am using a 2070 with a Ryzen and I get a stable 60 FPS, no stutters at all. I have vsync enabled, which it doesn't sound like you have, and I am running in full screen. Make sure you're in full screen with vsync enabled and not running in a borderless window.
 
I am using a 2070 with a Ryzen and I get a stable 60 FPS, no stutters at all. I have vsync enabled, which it doesn't sound like you have, and I am running in full screen. Make sure you're in full screen with vsync enabled and not running in a borderless window.

"display_mode"={
value="fullscreen"
version=0
}
"fullscreen_resolution"={
value="1920x1080"
version=0
}
"vsync"={
enabled=yes
version=0
}

I'm running it in fullscreen with vsync enabled, so that's not the case.

The problem is that it started happening with the 2.6 update. 2.5.1 works perfectly fine even now.

What could be so taxing as to bring a stable 120 fps down to 40 fps in one update?
 
I notice no one has actually properly tested the impact of memory in Stellaris. In the wake of what we can expect to be increases in memory latency across the board, with Intel cutting costs by adopting a chiplet I/O next generation and putting them more in line with AMD in that department, I decided to do what I could with my 10th gen CPU to figure out how much memory latency, and perhaps bandwidth, factor into Stellaris performance, and maybe get some idea of how much degradation we can expect on future Intel CPUs, if any. Of course, my CPU is a monolithic die with an integrated memory controller, so the only way I can enforce memory latency comparable to Ryzen or the upcoming Intel 11th gen is by completely destroying my memory timings.

I went ahead and started a 1000 star game with 5x habitable worlds, low tech costs, max empires, and all advanced starts, and let fast_forward run for a while. When I got tired of waiting, we had ended up almost 200 years into the game. No endgame crisis yet, but there were millions of fleet power on the board, and the total galaxy population was pushing just over 50k pops, which is of course insane. Safe to say that despite this being only 184 years in, it represents a very intense late-game scenario that most people probably won't reach in their average game unless they make a habit of using the stupid galaxy settings I punched in.

I then did 9 test runs spread across 3 different RAM configurations, stopwatching how long it takes a single year to pass under very controlled conditions, with everything kept the same, from background programs to where my camera sits and how far it is zoomed in on the galaxy map in observer mode. GPU usage was minimal, and CPU clock speed was consistently 4.15GHz on average across all tests. Below are the details on my system, RAM configurations, and the results:

Intel Core i3-10100
NVIDIA GeForce GTX 960 4GB
16GB DDR4 dual channel (2x8GB) Samsung B-die


Daily Config: 4400 18-18-18-42 420tRFC CR2

Memory latency 41.9ns
Read 57000 MB/s
Write 64000 MB/s
Copy 51000 MB/s

Tight 2933 Config: 2933 11-11-11-28 234tRFC CR2

Memory latency 42.1ns
Read 44000 MB/s
Write 44000 MB/s
Copy 39000 MB/s

Craptastic 2933 Config: 2933 28-28-28-64 770tRFC CR2

Memory latency ~63.4ns (roughly equivalent to Zen+ or Matisse)
Read 38000 MB/s
Write 42000 MB/s
Copy 32000 MB/s


Test results, one year on Fastest, starting on resuming the game with the spacebar and ending on Day 2 of the following year (to account for missing most of the start-of-year lag for the initial year, since I saved the game just after it):


Daily Config - high bandwidth/low latency

test #1 - 5 minutes 34 seconds
test #2 - 5 minutes 35 seconds
test #3 - 5 minutes 35 seconds


Tight 2933 Config - low bandwidth/low latency

test #1 - 5 minutes 35 seconds
test #2 - 5 minutes 39 seconds
test #3 - 5 minutes 36 seconds


Craptastic 2933 config - low bandwidth/high latency

test #1 - 6 minutes 14 seconds
test #2 - 6 minutes 15 seconds
test #3 - 6 minutes 12 seconds


Conclusion: Stellaris is sensitive to memory latency, and introducing a large amount of it through timings much looser than you'd ever actually run causes substantial performance degradation. Memory bandwidth matters much less, if at all, past a certain point, as there was no significant performance difference between my daily 4400 and tight 2933 configurations in spite of a large difference in read/write speeds. DDR5 is not likely to offer any gains in Stellaris, and 10th gen Intel MIGHT be better than 11th gen Intel, although that is a big assumption that will have to wait until I actually have an 11th gen chip in my system to test.

However, this ultimately may not be a good indicator of the Ryzen vs Skylake performance difference. AMD has designed Zen with higher memory latency in mind, and it copes with this limitation through features like a larger cache subsystem, a wider reorder buffer, and a massively larger uOP cache than Skylake's, among other things. Worse memory latency still hurts, but I suspect it hurts a lot less for AMD than it does for Intel (which makes Intel's upcoming 11th gen's pairing of a chiplet I/O with Sunny Cove's relatively anemic L2/L3 cache configuration - anemic compared to Zen 3's, when they should have gone with Willow Cove's - puzzling. Either they managed not to incur a large latency penalty with their desktop EMIB/Foveros chiplet implementation, or they simply don't care if AMD destroys them in games with Zen 3.)

Anyway, if anyone wants to run their own tests on this save, I have included it in the attached file. It is vanilla; only Utopia and Horizon Signal are needed to run the save, but I conducted my tests with all DLC active... I would be interested to see some results from something like a Ryzen 3 3300X.

Hi sir, thanks for your data. I am currently looking for average late-game performance figures because my current computer runs too slow, and this happens to be the only post I could find that includes statistics for the latest generation of CPUs, which is awesome. Would you mind sharing how the late game runs on a 10th gen i3?
 
Hi sir, thanks for your data. I am currently looking for average late-game performance figures because my current computer runs too slow, and this happens to be the only post I could find that includes statistics for the latest generation of CPUs, which is awesome. Would you mind sharing how the late game runs on a 10th gen i3?

I can't say my results are very useful for you, since I was running Stellaris with a Z490 motherboard and a very nice kit of RAM, and I don't recommend you run the same setup, but it does run surprisingly well. Way better than my 4790K (probably due to memory), though you'll probably see a fair bit less performance than my i3 with a B- or H-series board and the same CPU, due to the 2666MHz RAM cap. I can say that someone running an FX-6200 took 11 minutes just to get through 6 months on that save I posted; going through the full year probably would have taken around 4 times longer than mine did.

However, as I've changed my mind about waiting for Rocket Lake, I am going to do some tests with that save, comparing the i3-10100 at 3.6GHz (turbo disabled) with non-overclocked 2133MHz DDR4 against an i7-10700K with 4 cores disabled, locked to 3.6GHz, on the same 2133 memory, just so I can see the difference, if any, between having 6MB and 16MB of L3 when everything else is the same. That should also give a pretty good idea of how much better Zen 3 will perform than Zen 2, thanks to its monstrous 32MiB of L3 vs Zen 2's 16MiB. I also want to see how the 10700K does with all cores active and overclocked with tuned memory, just for fun. I'll post the results tomorrow.

As a last word, anyone waiting for 11th gen Intel to upgrade is probably wasting their time. Having seen several recent leaked benches and been able to make clock-for-clock comparisons to 10th gen, I have found it very disappointing in anything that doesn't feature AVX. It's nowhere near 20%: single-digit gains on average, and the occasional regression (?!) in memory-sensitive workloads such as navigation, which is comparable to pathing calculations in Stellaris. It does worse in them. All in all, it'll probably break even with 10th gen for Stellaris since, as far as I know, Stellaris does not even utilize AVX. Even if it did to some degree, it would default to AVX2, since the AVX-512 clock offset would devastate the execution of every non-AVX instruction in the game. You'd be better off pulling the trigger now on 10th gen, or waiting one month to see what Zen 3 brings, or just going the cheap route with a 3300X or 3600. Intel cannot squeeze any significant gains out of 14nm anymore, even with a new μarch.

Edit: They just delayed my delivery by several days, so it's going to take a little longer than I expected...
 