High DX11 CPU overhead, very low performance.

Discussion in 'Videocards - AMD Radeon Drivers Section' started by PrMinisterGR, May 4, 2015.

  1. SMS PC Lead

    SMS PC Lead Guest

    Messages:
    22
    Likes Received:
    0
    GPU:
    Titan X SLI
    For our mix of DX11 API calls, the API call consumption rate of the AMD driver is the bottleneck.

    In Project Cars the number of draw calls per frame ranges from around 5,000-6,000 with everything at Low up to 12,000-13,000 with everything at Ultra. Depending on the single-threaded performance of your CPU, there is a limit to the number of draw calls that can be consumed and, as I mentioned above, once that limit is exceeded GPU usage starts to drop. On AMD/Windows 10 this threshold is much higher, which is why you can run with higher settings without FPS loss.

    I also mentioned 'gaps' in the GPU timeline caused by not being able to feed the GPU fast enough. These gaps are why increasing resolution (such as to 4K in the Anandtech analysis) makes for a better comparison between GPU vendors. At 4K the GPU is given more work to do, and either the gaps get partly filled by the extra work and become smaller, or the extra work means the GPU is now always running behind the CPU submission rate.
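
    The two regimes described above can be sketched with a toy timeline model. This is only an illustration, not anything taken from the game or a driver: it assumes the CPU needs a fixed time per frame to submit draw calls, the GPU needs a fixed time to execute them, and the slower side sets the frame time. The function name and all millisecond values are hypothetical.

```python
# Toy timeline model, not taken from the game or from any driver.
# Assumption: per frame, the CPU needs cpu_submit_ms to submit all draw calls
# and the GPU needs gpu_work_ms to execute them; the slower side sets the pace.

def frame_timeline(cpu_submit_ms: float, gpu_work_ms: float) -> tuple[float, float]:
    """Return (frame time, GPU idle time per frame), both in milliseconds."""
    frame_ms = max(cpu_submit_ms, gpu_work_ms)
    gpu_idle_ms = frame_ms - gpu_work_ms  # the 'gaps' in the GPU timeline
    return frame_ms, gpu_idle_ms

# Hypothetical numbers chosen only to show the two regimes.
low_res = frame_timeline(cpu_submit_ms=12.0, gpu_work_ms=8.0)
high_res = frame_timeline(cpu_submit_ms=12.0, gpu_work_ms=20.0)

print(low_res)   # CPU-bound: the GPU waits for part of every frame
print(high_res)  # GPU-bound: no gaps, the driver is no longer the limit
```

    In the first case the GPU sits idle for part of each frame, so driver overhead dominates the result. In the second the GPU is the limit, which is why a high-resolution test says more about the GPU itself.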

    So, on my i7-5960X @ 3.0 GHz the NVIDIA (Titan X) driver can consume around 11,000 draw calls with our DX11 API call mix. The same Windows 7 system with a 290X and the AMD driver is CPU-limited at around 7,000 draw calls. On Windows 10, AMD is somewhere around 8,500 draw calls before the limit is reached (I can't be exact since my Windows 10 box runs on a 3.5 GHz 6-core i7).
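
    As a rough illustration of what these limits mean at different settings, here is a minimal sketch. It assumes, purely for illustration, that once a scene exceeds the driver's draw-call limit the GPU is fed in proportion to limit / draw calls; the real fall-off is unlikely to be this simple. The limits are the approximate figures quoted above (the Windows 10 one was measured on a different CPU), and the function names are hypothetical.

```python
# Approximate per-frame draw-call limits quoted in the post.
# Note: the Windows 10 figure comes from a different (3.5 GHz) machine.
DRIVER_LIMITS = {
    "NVIDIA Titan X, Windows 7": 11_000,
    "AMD 290X, Windows 7": 7_000,
    "AMD 290X, Windows 10": 8_500,
}

def estimated_gpu_feed(draw_calls: int, driver_limit: int) -> float:
    """Assumed model: fraction of the frame's work the driver can feed in time."""
    if draw_calls <= 0:
        raise ValueError("draw_calls must be positive")
    return min(1.0, driver_limit / draw_calls)

def report(draw_calls: int) -> dict[str, float]:
    """Estimated feed fraction per configuration, rounded to two decimals."""
    return {
        name: round(estimated_gpu_feed(draw_calls, limit), 2)
        for name, limit in DRIVER_LIMITS.items()
    }

for scene, calls in (("Low, ~6,000 calls", 6_000), ("Ultra, ~12,500 calls", 12_500)):
    print(scene, report(calls))
```

    Under this assumed model every configuration keeps up at Low, while at Ultra all three exceed their limit and the AMD/Windows 7 case falls furthest behind.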

    In Patch 2.5 (next week) I did a pass to reduce small draw-calls when using the Ultra settings, as a concession to help driver thread limitations. It gains around 8% for NVIDIA and about 15% (minimum) for AMD.

    Yes the driver thread will have high CPU usage - in practice it never quite reaches 100% but if you use a simple profiling tool like Very Sleepy CS it's quite easy to see. I can probably write a guide to using this profiler to analyse driver bottlenecks, if that's useful.

    For Project Cars the 1040 driver is easily the fastest under Windows 10 right now, but my focus at the moment is on the fairly large engineering task of implementing DX12 support...
     
  2. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    @SMS PC Lead:
    Ehm, since draw-call throughput scales nearly linearly with CPU frequency, I'd say that
    i7-5960X @ 3.0 GHz = 7,000 and 3.5 GHz = 8,500 is just down to the frequency increase...
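
    The linear-scaling argument can be checked with a few lines of arithmetic. This is a sketch under the stated assumption that draw-call throughput scales exactly with clock speed; it ignores any IPC or memory differences between the two CPUs, and the function name is hypothetical.

```python
def scale_with_frequency(draw_calls: float, base_ghz: float, target_ghz: float) -> float:
    """Assumed model: draw-call throughput is proportional to CPU clock."""
    return draw_calls * target_ghz / base_ghz

# Figures quoted in the thread: ~7,000 calls at 3.0 GHz (Windows 7),
# ~8,500 calls at 3.5 GHz (Windows 10).
expected_from_clock = scale_with_frequency(7_000, 3.0, 3.5)
residual = 8_500 / expected_from_clock - 1.0

print(f"expected from clock alone: {expected_from_clock:.0f} draw calls")
print(f"left over after clock scaling: {residual * 100:.1f}%")
```

    Under that assumption, clock speed alone accounts for most of the gap between the two figures, and only a few percent is left to attribute to anything else.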
     
  3. SMS PC Lead

    SMS PC Lead Guest

    Messages:
    22
    Likes Received:
    0
    GPU:
    Titan X SLI
    I was also responsible for the Xbox One DX11 implementation, so I can say definitively that, given a lightweight/low-overhead "driver", GCN is very (very) fast. The key thing on the CPU side is multi-threaded scaling: on Project Cars, 4 of the 6 cores are used for rendering command-buffer construction/submission (aka deferred contexts), and the speed-up over using a single core is around 3.5x. I'd predict that with DX12's much improved MT scaling, AMD's cheap 8-core CPUs with the right engine are going to fare pretty damn well, bang for buck, in this new era.
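
    One way to read the 3.5x figure is through Amdahl's law. The sketch below is not from the post: it assumes the command-buffer work splits into a perfectly parallel part and a serial part, and the 8-core extrapolation is hypothetical, since real scaling also depends on the API and the driver.

```python
def parallel_fraction(speedup: float, cores: int) -> float:
    """Invert Amdahl's law: speedup = 1 / ((1 - p) + p / cores)."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

def amdahl_speedup(p: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / cores)

# Quoted in the post: about 3.5x on 4 cores.
p = parallel_fraction(speedup=3.5, cores=4)
efficiency = 3.5 / 4

print(f"parallel efficiency on 4 cores: {efficiency:.1%}")
print(f"implied parallel fraction: {p:.3f}")
# Hypothetical extrapolation, assuming the same parallel fraction on 8 cores.
print(f"Amdahl estimate for 8 cores: {amdahl_speedup(p, 8):.1f}x")
```

    Read this way, roughly 95% of the work would be parallel, and the serial remainder is what caps further scaling in this simple model.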
     
  4. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    In the DX12 feature test or under Mantle, the FX-8xxx is comparable with the i7-4xxx.
    The problem is that these features are like 1-2 years away, and we need AMD to look into DX11 before that (or their market share will be 0% when DX12 arrives).
     

  5. Yecnot

    Yecnot Guest

    Messages:
    857
    Likes Received:
    0
    GPU:
    RTX 3080Ti
    :infinity:
     
    Last edited: Jul 7, 2015
  6. oGow89

    oGow89 Guest

    Messages:
    1,213
    Likes Received:
    0
    GPU:
    Gigabyte RX 5700xt
    I don't know for sure if you are really from the Czech Republic, but either way, I am truly jealous. You guys have hands down the most beautiful women in the EU, and they look so innocent. You look at them, and you just want to put them in a glass box like a limited edition comic book or figurine. If the theory that this world is a hologram like the Matrix is true, then the Czech Republic has the best graphics and character details. :D

    I know this has nothing to do with the topic, but there are already 36 pages, and on each one you will read that AMD needs to improve CPU overhead.
     
  7. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    OK, so there is a 4% increase from Windows 10.
    @oGow89: CZ girls are TOP
     
  8. SMS PC Lead

    SMS PC Lead Guest

    Messages:
    22
    Likes Received:
    0
    GPU:
    Titan X SLI
    The i7-3930K (Sandy Bridge) has lower IPC and memory bandwidth than the i7-5960X (Haswell)... so as I said, it's not exact, but the Windows 10 API consumption rate is higher.
     
  9. sammarbella

    sammarbella Guest

    Messages:
    3,929
    Likes Received:
    178
    GPU:
    290X Lightning CFX (H2O)
    You are right games are...DX11 games.

    DX12 AMD driver performance is on par or even better than Nvidia but almost all games are and will be DX11 for the next 1-2 years.

    Who is buying a GPU to play games in 2 years?

    If AMD has been unable to bring its DX11 performance up to the Nvidia driver level until now, while fighting only in the DX11 "field", I can't see how they will be able to do it from now on while fighting in both the DX11 AND DX12 "fields".
     
  10. SMS PC Lead

    SMS PC Lead Guest

    Messages:
    22
    Likes Received:
    0
    GPU:
    Titan X SLI
    You forgot the 10-15% from IPC etc!

    @oGow89 : I always found CZ girls high maintenance in my Guildford clubbing days...
     

  11. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    It's not 15%, more like 5% max :D
    The memory bandwidth difference in this case is close to 0.
     
  12. SMS PC Lead

    SMS PC Lead Guest

    Messages:
    22
    Likes Received:
    0
    GPU:
    Titan X SLI
    OK, I can't be arsed to argue, but Haswell vs Sandy Bridge is more than 5%. (Haswell vs Ivy Bridge averaged 8% in most reviews.)

    Anyway, I'll try to do some profiling grabs of the driver thread utilisation on both operating systems this week to refine my "just got home from work" numbers. I'm finishing off a large Oculus Rift update at the moment, but once that's done I have some downtime for fun with drivers.
     
  13. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    Yep, I'm interested in these results too. Please also share how to set up the profiler to see which part of the driver is the bottleneck :)
     
  14. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    I have a stable 120 fps in HoTS (max details), but GPU utilization is quite high considering that it is a simple top-down view game.
    I can try SC2 too (I haven't played it for quite some time).

    But as the guys above mentioned, AMD needs to feed this thing, because people are not happy if a 980 Ti does 190 fps and a Fury X only 160 fps at 1080p.
     
  15. theoneofgod

    theoneofgod Ancient Guru

    Messages:
    4,677
    Likes Received:
    287
    GPU:
    RX 580 8GB
    X1 will speed up DX12 game development. If it was just PC I'd agree.
     

  16. sammarbella

    sammarbella Guest

    Messages:
    3,929
    Likes Received:
    178
    GPU:
    290X Lightning CFX (H2O)
    Yep, X1 is one of MS's main objectives for its new unified OS ecosystem (phones, tablets, laptops, desktops) around W10 and DX12.

    A little twist to the Mantle mythology:

    Maybe DX12 was not developed thanks to AMD's Mantle but in response to bad AMD performance in DX11 games; the Xbox One has an AMD GPU, like the PS4.

    :D
     
  17. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    I have no doubt that Mantle was developed because AMD sucks at DX11 performance :)
    But they keep telling us that DX11 is so bad it can't be improved, and bam, there is a fixed driver from NVIDIA...
     
  18. sammarbella

    sammarbella Guest

    Messages:
    3,929
    Likes Received:
    178
    GPU:
    290X Lightning CFX (H2O)
    What do you expect from them?

    AMD is not going to hold a press conference announcing that they are unable to fix their drivers' DX11 performance!

    :D
     
  19. OneB1t

    OneB1t Guest

    Messages:
    263
    Likes Received:
    0
    GPU:
    R9 290X@R9 390X 1050/1325
    Dunno, maybe just decompile and steal it from NVIDIA? :D Please don't tell me there is no industrial espionage between these big graphics players.

    Or hire a competent driver team (and fire that outsourced India team which they are probably using now :D),
    as this is the main reason why they are losing such a big piece of market share.

    Or just make the driver open source even for Windows and let the community see if there is something that can be done to make it better.

    They also managed to write a new API with totally new features from scratch, so they don't lack clever guys...
     
    Last edited: Jul 8, 2015
  20. sammarbella

    sammarbella Guest

    Messages:
    3,929
    Likes Received:
    178
    GPU:
    290X Lightning CFX (H2O)
    Remember, AMD is the one and only "ethical" GPU manufacturer who doesn't fool its customers with techniques like artificial restrictions in its drivers (VSR, frame rate limiter, SMAA, DX11 performance) for its own "older" GPUs.

    No, AMD is not unethical like "evil" Nvidia (Gameworks, 3.5Gate...)

    Maybe they should hire some HIGH PERFORMANCE CODERS like asder00 or VBS, who are able to make these "hardware" limits disappear in a few minutes... by modding the AMD software (drivers)!

    :D

    About open source:

    Isn't AMD the "force" behind the new OpenGL?
     
