Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by RealNC, Mar 15, 2021.
The setting has no effect on Vulkan and D3D12 to my knowledge.
Nvidia is investigating.
I am getting different fps in SOTTR depending on whether it is on or off.
Haven't tested properly yet.
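If anyone wants to test it properly, here's a rough sketch of how one might compare an on/off A-B run from per-frame frametime logs (e.g. a PresentMon or CapFrameX CSV export). The helper name and the toy numbers below are made up; it's just the averaging math:

```python
# Hypothetical helper for comparing a threaded-optimization on/off A-B test.
# Input: per-frame frametimes in milliseconds (e.g. exported from PresentMon).
def summarize(frametimes_ms):
    """Return (average FPS, 1% low FPS) for a list of frametimes in ms."""
    n = len(frametimes_ms)
    avg_fps = 1000.0 * n / sum(frametimes_ms)
    # 1% low: average FPS over the slowest 1% of frames (at least one frame).
    worst = sorted(frametimes_ms, reverse=True)[:max(1, n // 100)]
    low_fps = 1000.0 * len(worst) / sum(worst)
    return avg_fps, low_fps

on  = [16.7] * 99 + [33.3]   # toy run with the setting on
off = [20.0] * 99 + [50.0]   # toy run with the setting off
print("on: ", summarize(on))
print("off:", summarize(off))
```

Run each scenario a few times and average; a single pass through SOTTR's benchmark is noisy enough to mask a small CPU-side difference.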
That's eh not entirely accurate either.
1. Ampere is pretty much nothing like GCN. GCN doesn't have much specialized h/w either; it's very general purpose, which is why it's good at compute and rather bad at graphics and perf/watt. There is a degree of convergence in the RDNA and Pascal shading core designs, but Turing and Ampere are a step beyond that.
2. SM-level scheduling is fairly static on all GPUs these days and has been since GCN1/Kepler; this is basically what changed between Fermi and Kepler. "Static" here means no scheduling happens at runtime, since the instruction stream is pre-compiled by the driver to execute in an optimal fashion on the target h/w.
3. The only level where "Ampere is the most GCN-like h/w NV ever made" holds is global-level scheduling (the GigaThread engine), which was enhanced with h/w syspipes in Ampere that are kinda reminiscent of GCN's ACEs/HWSes. "Kinda" because a) not really, since the design is different enough and serves a different purpose (server-grade h/w virtualization: the ability to run up to 7 virtual GPUs on one GA100/GA102 GPU, fewer on smaller ones, one per GPC); and b) there's a distinct lack of disclosure on what it even does outside of CUDA compute. It may be used by the graphics drivers or it may not be; this is unknown.
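To illustrate what "static" means in point 2, here's a toy model, emphatically not an accurate simulation of any real GPU: the compiler bakes a stall count into each instruction, so the issue logic just counts down instead of doing runtime dependency checks. Every opcode and latency below is invented for illustration:

```python
# Toy model of "static" SM-level scheduling (Kepler/GCN-era style):
# the compiler pre-computes how many cycles each instruction must wait
# for its inputs and emits that stall count alongside the opcode, so
# the hardware needs no dependency-checking scoreboard at issue time.

# Hypothetical pre-compiled stream: (instruction, stall cycles before issue)
compiled_stream = [
    ("LOAD  r0",         0),  # no dependency, issue immediately
    ("LOAD  r1",         0),  # independent load, issue next cycle
    ("ADD   r2, r0, r1", 4),  # compiler knows the loads take 4 cycles
    ("STORE r2",         1),  # wait one cycle for the ADD result
]

def run_static(stream):
    """Issue instructions using only the baked-in stall counts."""
    cycle = 0
    for instr, stall in stream:
        cycle += stall          # hardware just counts down, no scoreboard
        print(f"cycle {cycle:2d}: issue {instr}")
        cycle += 1              # one issue slot per cycle
    return cycle

total = run_static(compiled_stream)
print(f"done after {total} cycles")
```

The point is that all the "scheduling" happened in the compiler; at runtime the hardware only obeys the counts, which is why this level can't explain per-API CPU overhead differences.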
Anyway, no amount of global API/driver-side scheduling can explain such differences in CPU load. Also worth noting that this difference seems to be present on all NV GPUs which support D3D12, which means that whatever the reason is, it should be the same on Maxwell, Pascal, Turing and Ampere, which are very different h/w architectures in many respects.
You could probably replicate it on Fermi as well, if you can find a D3D12 game that doesn't peg the GPU at 99% (720p, perhaps?).
HUB posted a new video with some more insights:
HUB are deleting any comments that mention architectural information which discredits their claims about the device scheduling.
Seriously? Wow, if true, that's not cool -- I mean, maybe I'm missing something, but those sound like comments I'd like to read lol.
How can one tell if comments were removed? Don't get me wrong, I'm glad they discovered this behavior and all; I enjoy their testing work. But maybe it would've been better to present their ideas about the "why" with a bit more of a grain of salt in this case until an official response was out.
I'm looking forward to Nvidia's own investigation results.
YouTube throws an error when you go to confirm an edit, and when you refresh, the comment is gone.
I was merely directing users to Beyond3D's vast coverage of the RDNA architecture and pointing out that both vendors use a static one-instruction-per-cycle-per-wave/warp design, among other things XD.
Also user breakdowns of the GeForce cards and how their architecture changed with each CC revision.
Let me guess, still no tests with threaded optimisation on or off.
Also, especially with Turing and Ampere, the whole "nViDiA doESN't hAvE sCHEDuleRs" thing is getting kind of tiring.
They did the same to mine when I merely tried to refer to a $20-40 cooler test after they began adding $40 of value for the Wraith Stealth in their cost-per-frame charts for the Ryzen 3600 in gaming (ridiculous, right?).
I noticed they also kept changing test locations for some games, like Witcher 3, back and forth from more to less CPU-bound whenever it suited them.
It's a channel where the clickbait headline comes first; they'll find a way to obtain the data they want.
If that is true, then they just lost a lot of respect in my eyes.
These are the same guys that crapped on a user for demonstrating that HWU and GN benchmark results using FX processors were worse than they actually should be.
Bully bro's lol.
Question: If enabling threaded optimizations is better, why is it optional to begin with, and why would the default "auto" setting turn it off?
My guess is that "Auto" lets NVIDIA profile games. I cannot fathom that they talk about driver overhead without testing this setting. I might even test it myself, but I have a monster CPU, so I don't know how useful it might be. I might at least be able to show different behavior with it.
Auto is known to be broken in Cemu; the setting can change while the process is running and isn't necessarily set at application launch.
This is really weird. They are outright lying on Twitter too about it.
I'm actually glad they're the only ones speaking about it and not being green enough to sweep this under the rug like other channels paid off by Nvidia do, like DF.
I don't follow youtubers much, so I haven't seen much previous work by this channel. I delved a bit into their backlog, and some videos are clearly clickbait. Case in point, I found one where they claim "DLSS is dead" and in the video they claim that NVidia's (very simple) sharpening shader is better than DLSS...
And they don't mention that we had many sharpening shaders to choose from through ReShade for many years before Nvidia added theirs to the driver.
So according to Hardware Unboxed, there's no need for DLSS. Just enable sharpening in the nvidia control panel. It's better.
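For perspective on how simple that class of filter is compared to a learned upscaler like DLSS: a driver-level sharpener is roughly an unsharp-mask pass per pixel. Here's a toy single-channel version (not NVIDIA's actual shader, just the textbook kernel):

```python
# Toy 3x3 unsharp-mask sharpener on a single-channel image (list of rows).
# Border pixels are left untouched for simplicity.
def sharpen(img, amount=1.0):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Laplacian edge term: center minus the 4-neighbour average.
            edge = img[y][x] - (img[y-1][x] + img[y+1][x] +
                                img[y][x-1] + img[y][x+1]) / 4.0
            out[y][x] = img[y][x] + amount * edge
    return out

flat = [[0.5] * 4 for _ in range(4)]
print(sharpen(flat))  # flat input stays flat: there are no edges to amplify
```

It only amplifies contrast that's already there; it can't reconstruct detail the way a temporal/ML upscaler attempts to, which is why comparing the two head-to-head is apples and oranges.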
AMD discovered sharpening, according to HUB.
And it's better than DLSS.