Nvidia Has a Driver Overhead Problem

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by RealNC, Mar 15, 2021.

  1. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    The setting has no effect on Vulkan or D3D12, to my knowledge.

    Nvidia is investigating.
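    For context, "the setting" is the driver's "Threaded optimization" knob, which lives in the driver settings (DRS) store. Below is a minimal sketch of reading the global value via NVAPI, assuming the public SDK headers and that OGL_THREAD_CONTROL_ID is the ID behind the control panel option (it's what tools like nvidiaProfileInspector expose):

    Code:
    /* Sketch: read the global "Threaded optimization" (OGL_THREAD_CONTROL)
       value via NVAPI's driver settings (DRS) API. Error handling is
       reduced to a bare minimum. */
    #include <stdio.h>
    #include <nvapi.h>
    #include <NvApiDriverSettings.h>

    int main(void)
    {
        NvDRSSessionHandle session;
        NvDRSProfileHandle base;
        NVDRS_SETTING setting = {0};
        setting.version = NVDRS_SETTING_VER;

        if (NvAPI_Initialize() != NVAPI_OK) return 1;
        if (NvAPI_DRS_CreateSession(&session) != NVAPI_OK) return 1;
        NvAPI_DRS_LoadSettings(session);           /* pull the current settings store */
        NvAPI_DRS_GetBaseProfile(session, &base);  /* the global, non-per-game profile */

        if (NvAPI_DRS_GetSetting(session, base, OGL_THREAD_CONTROL_ID, &setting) == NVAPI_OK)
            printf("OGL_THREAD_CONTROL = 0x%x (1 = forced on, 2 = forced off)\n",
                   setting.u32CurrentValue);
        else
            printf("setting not present -> driver default (\"Auto\")\n");

        NvAPI_DRS_DestroySession(session);
        NvAPI_Unload();
        return 0;
    }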
     
    Last edited: Mar 23, 2021
    Smough, Archvile82 and BlindBison like this.
  2. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,132
    Likes Received:
    974
    GPU:
    Inno3D RTX 3090
    I am getting different fps in SOTTR depending on whether it is on or off.
    Haven't tested properly yet.
     
  3. dr_rus

    dr_rus Ancient Guru

    Messages:
    3,941
    Likes Received:
    1,048
    GPU:
    RTX 4090
    That's, eh, not entirely accurate either.

    1. Ampere is pretty much nothing like GCN. GCN doesn't have much specialized h/w either, it's very general purpose - which is why it's good in compute and rather bad in graphics and perf/watt. There is a degree of convergence in the RDNA and Pascal shading core designs, but Turing and Ampere are a step beyond that.
    2. SM level scheduling is fairly static on all GPUs these days and has been like this since GCN1/Kepler - this is basically what changed between Fermi and Kepler. "Static" here means "no scheduling happening, since the instruction stream is pre-compiled by the driver to execute in an optimal fashion on the target h/w".
    3. The only level where "Ampere is the most GCN-like h/w Nv ever made" is global level scheduling (the GigaThread engine), which was enhanced with h/w syspipes in Ampere, and these are kinda reminiscent of GCN's ACEs/HWSes. "Kinda" because a) not really, since the design is different enough and serves different purposes (server grade h/w virtualization - the ability to run up to 7 virtual GPUs on one GA100/GA102 GPU; fewer on smaller ones - one per GPC); b) there's a distinct lack of disclosure on what it even does outside of CUDA compute. It may be used by the graphics drivers or it may not be; this is unknown.

    Anyway, no amount of global API/driver side scheduling can explain such differences in CPU loads. Also worth noting that this difference seems to be present on all NV GPUs which support D3D12 - which means that whatever the reason is, it should be the same on Maxwell, Pascal, Turing and Ampere - very different h/w architectures in many aspects.
     
    BlindBison, enkoo1 and PrMinisterGR like this.
  4. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    Can probably replicate it on Fermi as well, if you find a D3D12 game that doesn't peg the GPU at 99% (720p, perhaps?).
     

  5. Nastya

    Nastya Member Guru

    Messages:
    185
    Likes Received:
    86
    GPU:
    GB 4090 Gaming OC
    HUB posted a new video with some more insights:

     
  6. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    HUB are deleting any comments that mention architectural information discrediting their claims about the device scheduling.
     
  7. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,420
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Seriously? Wow, if true, that's not cool -- I mean, maybe I'm missing something, but those sound like comments I'd like to read lol.

    How can one tell if comments were removed? Don't get me wrong, I'm glad they discovered this behavior and I enjoy their testing work, but it might have been better to present their ideas about the "why" with a bit more of a grain of salt until an official response was out.

    I'm looking forward to Nvidia's own investigation results.
     
    Last edited: Mar 26, 2021
  8. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    YouTube throws an error when you go to confirm an edit, and when you refresh, the comment is gone.

    I was merely directing users to Beyond3D's vast coverage of the RDNA architecture and pointing out, among other things, that both vendors use a static one-instruction-per-cycle-per-wave/warp design XD.

    Also to users' dissections of the GeForce cards and how their architecture changed with each CC revision.
     
    BlindBison likes this.
  9. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,132
    Likes Received:
    974
    GPU:
    Inno3D RTX 3090
    Let me guess, still no tests about threaded optimisations on or off.

    Also, especially with Turing and Ampere, the "nViDiA doESN't hAvE sCHEDuleRs" line is getting kind of tiring.
     
  10. cucaulay malkin

    cucaulay malkin Ancient Guru

    Messages:
    9,236
    Likes Received:
    5,209
    GPU:
    AD102/Navi21
    they did the same to mine when I merely tried to refer to a $20-40 cooler test, back when they began adding $40 of value for the Wraith Stealth in cost-per-frame charts for the Ryzen 3600 in gaming (ridiculous, right?)
    I noticed they also kept changing test locations for some games, like Witcher 3, back and forth between more and less CPU-bound spots whenever they found it convenient.
    it's a channel where the clickbait headline comes first; they'll find a way to obtain the data they want.
     
    Archvile82 likes this.

  11. Maddness

    Maddness Ancient Guru

    Messages:
    2,440
    Likes Received:
    1,739
    GPU:
    3080 Aorus Xtreme
    If that is true, then they just lost a lot of respect in my eyes.
     
  12. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    These are the same guys that crapped on a user for demonstrating that HWU's and GN's benchmark results with FX processors are worse than they should be.

    Bully bros lol.
     
    Cryio likes this.
  13. RealNC

    RealNC Ancient Guru

    Messages:
    5,127
    Likes Received:
    3,396
    GPU:
    4070 Ti Super
    Question: If enabling threaded optimizations is better, why is it optional to begin with, and why would the default "auto" setting turn it off?
     
    BlindBison likes this.
  14. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,132
    Likes Received:
    974
    GPU:
    Inno3D RTX 3090
    My guess is that "Auto" lets NVIDIA profile games. I cannot fathom that they talk about driver overhead without testing this setting. I might even test it myself, but I have a monster CPU, so I don't know how useful that would be. I might at least be able to show different behaviors with it.
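    If "Auto" really resolves per game, the per-application DRS profile is where an override would show up. Here's a sketch of checking one executable, under the same assumptions as the NVAPI snippet earlier in the thread ("game.exe" is a hypothetical name):

    Code:
    /* Sketch: find the driver profile attached to a given executable and
       query its OGL_THREAD_CONTROL value. */
    #include <stdio.h>
    #include <nvapi.h>
    #include <NvApiDriverSettings.h>

    int main(void)
    {
        NvDRSSessionHandle session;
        NvDRSProfileHandle profile;
        NVDRS_APPLICATION app = {0};
        NVDRS_SETTING setting = {0};
        NvAPI_UnicodeString exe = {0};
        const char *name = "game.exe";   /* hypothetical executable name */

        app.version = NVDRS_APPLICATION_VER;
        setting.version = NVDRS_SETTING_VER;
        for (unsigned i = 0; name[i] && i < NVAPI_UNICODE_STRING_MAX - 1; ++i)
            exe[i] = (NvU16)name[i];     /* widen ASCII into the UTF-16 buffer */

        if (NvAPI_Initialize() != NVAPI_OK) return 1;
        NvAPI_DRS_CreateSession(&session);
        NvAPI_DRS_LoadSettings(session);

        if (NvAPI_DRS_FindApplicationByName(session, exe, &profile, &app) == NVAPI_OK &&
            NvAPI_DRS_GetSetting(session, profile, OGL_THREAD_CONTROL_ID, &setting) == NVAPI_OK)
            printf("per-game OGL_THREAD_CONTROL = 0x%x\n", setting.u32CurrentValue);
        else
            printf("no per-game override found\n");

        NvAPI_DRS_DestroySession(session);
        NvAPI_Unload();
        return 0;
    }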
     
  15. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    Auto is known to be broken in Cemu: the setting can change while the process is running and isn't necessarily set at application launch.
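    For what it's worth, on the Linux driver the same knob is the __GL_THREADED_OPTIMIZATIONS environment variable, which the driver picks up when the GL library initializes, so it shouldn't flip mid-run the way Auto apparently can here. A minimal launcher sketch (Linux-only):

    Code:
    /* Sketch: pin the GL threaded-optimization setting for one process on
       Linux by setting __GL_THREADED_OPTIMIZATIONS before exec'ing it. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <program> [args...]\n", argv[0]);
            return 1;
        }
        setenv("__GL_THREADED_OPTIMIZATIONS", "1", 1); /* "1" = on, "0" = off */
        execvp(argv[1], &argv[1]);                     /* replace ourselves with the app */
        perror("execvp");
        return 1;
    }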
     
    Last edited: Mar 27, 2021

  16. terror_adagio

    terror_adagio Member

    Messages:
    42
    Likes Received:
    13
    GPU:
    7950 GTX 512MB
    This is really weird. They are outright lying about it on Twitter too.
     
  17. Undying

    Undying Ancient Guru

    Messages:
    25,507
    Likes Received:
    12,904
    GPU:
    XFX RX6800XT 16GB
    I'm actually glad they are the only ones speaking about it and not being green enough to sweep this under the rug like other channels paid off by Nvidia, such as DF, do.
     
  18. terror_adagio

    terror_adagio Member

    Messages:
    42
    Likes Received:
    13
    GPU:
    7950 GTX 512MB
    [image]
     
    Smough, mirh and PrMinisterGR like this.
  19. RealNC

    RealNC Ancient Guru

    Messages:
    5,127
    Likes Received:
    3,396
    GPU:
    4070 Ti Super
    I don't follow youtubers much, so I haven't seen much previous work by this channel. I delved a bit into their backlog, and some videos are clearly clickbait. Case in point, I found one where they claim "DLSS is dead" and argue that Nvidia's (very simple) sharpening shader is better than DLSS...

    And they don't mention that we had many sharpening shaders to choose from through ReShade for years before Nvidia added theirs to the driver.

    So according to Hardware Unboxed, there's no need for DLSS. Just enable sharpening in the Nvidia Control Panel. It's better.
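    "Very simple" is fair: the core of these sharpening filters is an unsharp mask, out = in + strength * (in - blurred). Here's a sketch in plain C on a grayscale buffer; the 3x3 box blur, the clamping and the 0.3 strength are illustrative choices, not Nvidia's or ReShade's actual filters:

    Code:
    /* Sketch: unsharp-mask sharpening on a grayscale image. A 3x3 box
       blur stands in for the low-pass; real filters use better kernels. */
    #include <stdio.h>

    static float clamp01(float v) { return v < 0 ? 0 : (v > 1 ? 1 : v); }

    void sharpen(const float *in, float *out, int w, int h, float strength)
    {
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                float blur = 0.0f;
                for (int dy = -1; dy <= 1; ++dy) {     /* 3x3 box blur */
                    for (int dx = -1; dx <= 1; ++dx) {
                        int sx = x + dx, sy = y + dy;
                        if (sx < 0) sx = 0; if (sx >= w) sx = w - 1; /* clamp edges */
                        if (sy < 0) sy = 0; if (sy >= h) sy = h - 1;
                        blur += in[sy * w + sx];
                    }
                }
                blur /= 9.0f;
                /* add back the high-frequency detail, scaled by strength */
                out[y * w + x] = clamp01(in[y * w + x] + strength * (in[y * w + x] - blur));
            }
        }
    }

    int main(void)
    {
        float in[16] = { 0, 0,    0, 0,   /* 4x4 test image: one bright pixel */
                         0, 0.5f, 0, 0,
                         0, 0,    0, 0,
                         0, 0,    0, 0 };
        float out[16];
        sharpen(in, out, 4, 4, 0.3f);
        printf("center pixel: %.3f -> %.3f\n", in[5], out[5]);
        return 0;
    }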
     
  20. cucaulay malkin

    cucaulay malkin Ancient Guru

    Messages:
    9,236
    Likes Received:
    5,209
    GPU:
    AD102/Navi21
    7.7.2019
    AMD discovered sharpening, according to HUB
    and it's better than DLSS
    [image]
     
