Nvidia Has a Driver Overhead Problem

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by RealNC, Mar 15, 2021.

  1. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    The setting has no effect on Vulkan or D3D12, to my knowledge.

    Nvidia is investigating.
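    For context, "the setting" is the driver's "Threaded optimization" knob, which lives in the driver settings (DRS) store. Below is a minimal sketch of reading the global value via NVAPI, assuming the public SDK headers and that OGL_THREAD_CONTROL_ID is the ID behind the control panel option (it's what tools like nvidiaProfileInspector expose):

    Code:
    /* Sketch: read the global "Threaded optimization" (OGL_THREAD_CONTROL)
       value via NVAPI's driver settings (DRS) API. Error handling is
       reduced to a bare minimum. */
    #include <stdio.h>
    #include <nvapi.h>
    #include <NvApiDriverSettings.h>

    int main(void)
    {
        NvDRSSessionHandle session;
        NvDRSProfileHandle base;
        NVDRS_SETTING setting = {0};
        setting.version = NVDRS_SETTING_VER;

        if (NvAPI_Initialize() != NVAPI_OK) return 1;
        if (NvAPI_DRS_CreateSession(&session) != NVAPI_OK) return 1;
        NvAPI_DRS_LoadSettings(session);           /* pull the current settings store */
        NvAPI_DRS_GetBaseProfile(session, &base);  /* the global, non-per-game profile */

        if (NvAPI_DRS_GetSetting(session, base, OGL_THREAD_CONTROL_ID, &setting) == NVAPI_OK)
            printf("OGL_THREAD_CONTROL = 0x%x (1 = forced on, 2 = forced off)\n",
                   setting.u32CurrentValue);
        else
            printf("setting not present -> driver default (\"Auto\")\n");

        NvAPI_DRS_DestroySession(session);
        NvAPI_Unload();
        return 0;
    }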
     
    Last edited: Mar 23, 2021
    Smough, Archvile82 and BlindBison like this.
  2. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,132
    Likes Received:
    974
    GPU:
    Inno3D RTX 3090
    I am getting different fps in SOTTR depending on whether it is on or off.
    Haven't tested properly yet.
     
  3. dr_rus

    dr_rus Ancient Guru

    Messages:
    3,941
    Likes Received:
    1,048
    GPU:
    RTX 4090
    That's, eh, not entirely accurate either.

    1. Ampere is pretty much nothing like GCN. GCN doesn't have much specialized h/w either, it's very general purpose - which is why it's good in compute and rather bad in graphics and perf/watt. There is a degree of convergence in the RDNA and Pascal shading core designs, but Turing and Ampere are a step beyond that.
    2. SM level scheduling is fairly static on all GPUs these days and has been like this since GCN1/Kepler - this is basically what changed between Fermi and Kepler. "Static" here means "no scheduling happening, since the instruction stream is pre-compiled by the driver to execute in an optimal fashion on the target h/w".
    3. The only level where "Ampere is the most GCN-like h/w Nv ever made" is global level scheduling (the GigaThread engine), which was enhanced with h/w syspipes in Ampere, and these are kinda reminiscent of GCN's ACEs/HWSes. "Kinda" because a) not really, since the design is different enough and serves different purposes (server grade h/w virtualization - the ability to run up to 7 virtual GPUs on one GA100/GA102 GPU; fewer on smaller ones - one per GPC); b) there's a distinct lack of disclosure on what it even does outside of CUDA compute. It may be used by the graphics drivers or it may not be; this is unknown.

    Anyway, no amount of global API/driver side scheduling can explain such differences in CPU loads. Also worth noting that this difference seems to be present on all NV GPUs which support D3D12 - which means that whatever the reason is, it should be the same on Maxwell, Pascal, Turing and Ampere - very different h/w architectures in many aspects.
     
    BlindBison, enkoo1 and PrMinisterGR like this.
  4. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    Can probably replicate it on Fermi as well, if you find a D3D12 game that doesn't peg the GPU at 99% (720p, perhaps?).
     

  5. Nastya

    Nastya Member Guru

    Messages:
    185
    Likes Received:
    86
    GPU:
    GB 4090 Gaming OC
    HUB posted a new video with some more insights:

     
  6. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    HUB are deleting any comments that mention architectural information discrediting their claims about the device scheduling.
     
  7. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,420
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Seriously? Wow, if true, that's not cool -- I mean, maybe I'm missing something, but those sound like comments I'd like to read lol.

    How can one tell if comments were removed? Don't get me wrong, I'm glad they discovered this behavior and I enjoy their testing work, but it might have been better to present their ideas about the "why" with a bit more of a grain of salt until an official response was out.

    I'm looking forward to Nvidia's own investigation results.
     
    Last edited: Mar 26, 2021
  8. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    YouTube throws an error when you go to confirm an edit, and when you refresh, the comment is gone.

    I was merely directing users to Beyond3D's vast coverage of the RDNA architecture and pointing out, among other things, that both vendors use a static one-instruction-per-cycle-per-wave/warp design XD.

    Also to users' dissections of the GeForce cards and how their architecture changed with each CC revision.
     
    BlindBison likes this.
  9. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,132
    Likes Received:
    974
    GPU:
    Inno3D RTX 3090
    Let me guess, still no tests about threaded optimisations on or off.

    Also, especially with Turing and Ampere, the "nViDiA doESN't hAvE sCHEDuleRs" line is getting kind of tiring.
     
  10. cucaulay malkin

    cucaulay malkin Ancient Guru

    Messages:
    9,236
    Likes Received:
    5,209
    GPU:
    AD102/Navi21
    they did the same to mine when I merely tried to refer to a $20-40 cooler test, back when they began adding $40 of value for the Wraith Stealth in cost-per-frame charts for the Ryzen 3600 in gaming (ridiculous, right?)
    I noticed they also kept changing test locations for some games, like Witcher 3, back and forth between more and less CPU-bound spots whenever they found it convenient.
    it's a channel where the clickbait headline comes first; they'll find a way to obtain the data they want.
     
    Archvile82 likes this.

  11. Maddness

    Maddness Ancient Guru

    Messages:
    2,440
    Likes Received:
    1,739
    GPU:
    3080 Aorus Xtreme
    If that is true, then they just lost a lot of respect in my eyes.
     
  12. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    These are the same guys that crapped on a user for demonstrating that HWU's and GN's benchmark results with FX processors are worse than they should be.

    Bully bros lol.
     
    Cryio likes this.
  13. RealNC

    RealNC Ancient Guru

    Messages:
    5,127
    Likes Received:
    3,396
    GPU:
    4070 Ti Super
    Question: If enabling threaded optimizations is better, why is it optional to begin with, and why would the default "auto" setting turn it off?
     
    BlindBison likes this.
  14. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,132
    Likes Received:
    974
    GPU:
    Inno3D RTX 3090
    My guess is that "Auto" lets NVIDIA profile games. I cannot fathom that they talk about driver overhead without testing this setting. I might even test it myself, but I have a monster CPU, so I don't know how useful that would be. I might at least be able to show different behaviors with it.
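    If "Auto" really resolves per game, the per-application DRS profile is where an override would show up. Here's a sketch of checking one executable, under the same assumptions as the NVAPI snippet earlier in the thread ("game.exe" is a hypothetical name):

    Code:
    /* Sketch: find the driver profile attached to a given executable and
       query its OGL_THREAD_CONTROL value. */
    #include <stdio.h>
    #include <nvapi.h>
    #include <NvApiDriverSettings.h>

    int main(void)
    {
        NvDRSSessionHandle session;
        NvDRSProfileHandle profile;
        NVDRS_APPLICATION app = {0};
        NVDRS_SETTING setting = {0};
        NvAPI_UnicodeString exe = {0};
        const char *name = "game.exe";   /* hypothetical executable name */

        app.version = NVDRS_APPLICATION_VER;
        setting.version = NVDRS_SETTING_VER;
        for (unsigned i = 0; name[i] && i < NVAPI_UNICODE_STRING_MAX - 1; ++i)
            exe[i] = (NvU16)name[i];     /* widen ASCII into the UTF-16 buffer */

        if (NvAPI_Initialize() != NVAPI_OK) return 1;
        NvAPI_DRS_CreateSession(&session);
        NvAPI_DRS_LoadSettings(session);

        if (NvAPI_DRS_FindApplicationByName(session, exe, &profile, &app) == NVAPI_OK &&
            NvAPI_DRS_GetSetting(session, profile, OGL_THREAD_CONTROL_ID, &setting) == NVAPI_OK)
            printf("per-game OGL_THREAD_CONTROL = 0x%x\n", setting.u32CurrentValue);
        else
            printf("no per-game override found\n");

        NvAPI_DRS_DestroySession(session);
        NvAPI_Unload();
        return 0;
    }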
     
  15. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    Auto is known to be broken in Cemu: the setting can change while the process is running and isn't necessarily set at application launch.
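    For what it's worth, on the Linux driver the same knob is the __GL_THREADED_OPTIMIZATIONS environment variable, which the driver picks up when the GL library initializes, so it shouldn't flip mid-run the way Auto apparently can here. A minimal launcher sketch (Linux-only):

    Code:
    /* Sketch: pin the GL threaded-optimization setting for one process on
       Linux by setting __GL_THREADED_OPTIMIZATIONS before exec'ing it. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <program> [args...]\n", argv[0]);
            return 1;
        }
        setenv("__GL_THREADED_OPTIMIZATIONS", "1", 1); /* "1" = on, "0" = off */
        execvp(argv[1], &argv[1]);                     /* replace ourselves with the app */
        perror("execvp");
        return 1;
    }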
     
    Last edited: Mar 27, 2021

  16. terror_adagio

    terror_adagio Member

    Messages:
    42
    Likes Received:
    13
    GPU:
    7950 GTX 512MB
    This is really weird. They are outright lying about it on Twitter too.
     
  17. Undying

    Undying Ancient Guru

    Messages:
    25,507
    Likes Received:
    12,904
    GPU:
    XFX RX6800XT 16GB
    I'm actually glad they are the only ones speaking about it and not being green enough to sweep this under the rug like other channels paid off by Nvidia, such as DF, do.
     
  18. terror_adagio

    terror_adagio Member

    Messages:
    42
    Likes Received:
    13
    GPU:
    7950 GTX 512MB
    [image]
     
    Smough, mirh and PrMinisterGR like this.
  19. RealNC

    RealNC Ancient Guru

    Messages:
    5,127
    Likes Received:
    3,396
    GPU:
    4070 Ti Super
    I don't follow youtubers much, so I haven't seen much previous work by this channel. I delved a bit into their backlog, and some videos are clearly clickbait. Case in point, I found one where they claim "DLSS is dead" and argue that Nvidia's (very simple) sharpening shader is better than DLSS...

    And they don't mention that we had many sharpening shaders to choose from through ReShade for years before Nvidia added theirs to the driver.

    So according to Hardware Unboxed, there's no need for DLSS. Just enable sharpening in the Nvidia Control Panel. It's better.
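    "Very simple" is fair: the core of these sharpening filters is an unsharp mask, out = in + strength * (in - blurred). Here's a sketch in plain C on a grayscale buffer; the 3x3 box blur, the clamping and the 0.3 strength are illustrative choices, not Nvidia's or ReShade's actual filters:

    Code:
    /* Sketch: unsharp-mask sharpening on a grayscale image. A 3x3 box
       blur stands in for the low-pass; real filters use better kernels. */
    #include <stdio.h>

    static float clamp01(float v) { return v < 0 ? 0 : (v > 1 ? 1 : v); }

    void sharpen(const float *in, float *out, int w, int h, float strength)
    {
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                float blur = 0.0f;
                for (int dy = -1; dy <= 1; ++dy) {     /* 3x3 box blur */
                    for (int dx = -1; dx <= 1; ++dx) {
                        int sx = x + dx, sy = y + dy;
                        if (sx < 0) sx = 0; if (sx >= w) sx = w - 1; /* clamp edges */
                        if (sy < 0) sy = 0; if (sy >= h) sy = h - 1;
                        blur += in[sy * w + sx];
                    }
                }
                blur /= 9.0f;
                /* add back the high-frequency detail, scaled by strength */
                out[y * w + x] = clamp01(in[y * w + x] + strength * (in[y * w + x] - blur));
            }
        }
    }

    int main(void)
    {
        float in[16] = { 0, 0,    0, 0,   /* 4x4 test image: one bright pixel */
                         0, 0.5f, 0, 0,
                         0, 0,    0, 0,
                         0, 0,    0, 0 };
        float out[16];
        sharpen(in, out, 4, 4, 0.3f);
        printf("center pixel: %.3f -> %.3f\n", in[5], out[5]);
        return 0;
    }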
     
  20. cucaulay malkin

    cucaulay malkin Ancient Guru

    Messages:
    9,236
    Likes Received:
    5,209
    GPU:
    AD102/Navi21
    7.7.2019
    AMD discovered sharpening, according to HUB
    and it's better than DLSS
    [image]
     
