Asynchronous Compute

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by Carfax, Feb 25, 2016.

  1. Carfax

    Carfax Ancient Guru

    Messages:
    3,948
    Likes Received:
    1,442
    GPU:
    Zotac 4090 Extreme
    When will NVidia implement this in their drivers? I expected them to have it for today's AotS Beta 2 release, but they still don't...

    This is kind of making me wonder whether NVidia is ever going to release it at all. They have made similar promises in the past, such as MFAA for SLI, which turned out to be totally bogus...

    Although to be fair, asynchronous compute doesn't seem to be playing a major role in AMD's dominance in AotS. It seems something else is responsible, because the most asynchronous compute accounts for is about 20% extra performance on AMD's GPUs, and that's at the most demanding settings...
     
    Last edited: Feb 25, 2016
  2. khanmein

    khanmein Guest

    Messages:
    1,646
    Likes Received:
    72
    GPU:
    EVGA GTX 1070 SC
    If you want to play this game, go grab an AMD GPU.
     
  3. Carfax

    Carfax Ancient Guru

    Messages:
    3,948
    Likes Received:
    1,442
    GPU:
    Zotac 4090 Extreme
    This has nothing to do with playing AotS. Asynchronous compute will be used in other DX12 titles coming down the pipeline, so I want to know when, or if, NVidia will support it.
     
  4. fantaskarsef

    fantaskarsef Ancient Guru

    Messages:
    15,636
    Likes Received:
    9,512
    GPU:
    4090@H2O
    What do you expect from us users here? If you follow the forums, you know just as much as we do :)

    And of course Nvidia says it will be implemented, though with no ETA so far. What else should they say? It's not ready or usable currently, so I guess they can't do anything but say it will be there, whether they end up fixing performance with a driver patch or only with upcoming architectures (Volta).
     

  5. OrdinaryOregano

    OrdinaryOregano Guest

    Messages:
    433
    Likes Received:
    6
    GPU:
    MSI 1080 Gaming X
    You're expecting some forum users to know when that will happen? How many of us even know what the DX12 spec entails, let alone can tell you whether or not it's a DX12-specific feature?

    If "AOTS" is supposed to be this game then all I read is:
    http://www.extremetech.com/gaming/2...ashes-of-the-singularity-directx-12-benchmark

    Waiting is all you can do.

    While you're at it, consider that what you're saying is very speculative; nobody knows how much any one thing will matter in the grand scheme. For example, DirectX 11's highlight feature, tessellation: it was paraded around as the biggest thing ever, and AMD cards were, and still are, rubbish when it comes to tessellation performance. Tessellation is used in games a lot, but it's hardly something anyone cares much about.
     
  6. Carfax

    Carfax Ancient Guru

    Messages:
    3,948
    Likes Received:
    1,442
    GPU:
    Zotac 4090 Extreme
    Well, you never know who might swing by. Some NVidia employees, like Manuel, are known to post on these forums... :)
     
  7. Darren Hodgson

    Darren Hodgson Ancient Guru

    Messages:
    17,179
    Likes Received:
    1,500
    GPU:
    NVIDIA RTX 4080 FE
    As a GTX 980 Ti owner who is (was?) very much looking forward to playing DX12 games at higher framerates than DX11, I have to say that I am concerned by the benchmark results in both Fable Legends and Ashes of the Singularity, which currently show either AMD cards with a huge lead or performance regressions versus DX11. It seems NVIDIA card owners are not going to see big benefits from using DX12, on the basis of what I've seen so far.

    And surprisingly, NVIDIA themselves do not seem to be in much of a hurry to support or showcase DX12, which I believe is one of the reasons it has been delayed in ARK: Survival Evolved, from what I have read. Makes me wonder if their hardware design is flawed, particularly as asynchronous compute is supported only through software, not hardware, even on their newer Maxwell cards. Hmmmmmm...

    I suspect DX12 may just turn out to be another overhyped API, joining the underwhelming DX10 that launched with Windows Vista, but I hope I am wrong. Hitman is supposed to be getting DX12 support, as is Rise of the Tomb Raider, so we will see from those exactly what benefits we get... if any.
     
  8. Carfax

    Carfax Ancient Guru

    Messages:
    3,948
    Likes Received:
    1,442
    GPU:
    Zotac 4090 Extreme
    Thing is, asynchronous compute isn't even strictly a DX12 feature. There is no specific hardware implementation from Microsoft that IHVs have to follow to be DX12 certified. Also, GPUs have been capable of asynchronous compute for years; it's only now that a Microsoft API is exposing that capability.

    NVidia's CUDA has allowed simultaneous execution of graphics and compute workloads for a long time, for instance.
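
    For illustration, here's a minimal sketch of how D3D12 exposes that capability as a separate compute engine (standard D3D12 API, error handling elided; the helper name is my own):

    Code:
    // The compute engine is requested via the command queue type.
    // Whether its work actually overlaps graphics is up to the driver/h/w.
    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    ComPtr<ID3D12CommandQueue> CreateComputeQueue(ID3D12Device* device)
    {
        D3D12_COMMAND_QUEUE_DESC desc = {};
        desc.Type  = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute engine class
        desc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;

        ComPtr<ID3D12CommandQueue> queue;
        device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
        return queue;
    }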

    Makes me really wonder as well. From what I can tell, the majority of the DX12 performance gain for the Radeons isn't coming from asynchronous compute, but from DX12's command buffers, which allow greater CPU parallelism and thus higher GPU utilization (rough sketch below).

    But NVidia should have no problems with that, as seen in the Star Swarm DX12 benchmark that used the same engine, although a much earlier version.
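
    To be clearer about the CPU parallelism part, here's a rough sketch of the usual DX12 pattern (public D3D12 API; the function name and the division of work are made up for illustration):

    Code:
    // Each thread records into its own allocator/list; D3D12 then allows
    // the recorded lists to be submitted together in a single call.
    #include <d3d12.h>
    #include <thread>
    #include <vector>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    void RecordInParallel(ID3D12Device* device, ID3D12CommandQueue* queue,
                          unsigned threadCount)
    {
        std::vector<ComPtr<ID3D12CommandAllocator>>    allocators(threadCount);
        std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(threadCount);
        std::vector<std::thread> workers;

        for (unsigned i = 0; i < threadCount; ++i) {
            device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                           IID_PPV_ARGS(&allocators[i]));
            device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                      allocators[i].Get(), nullptr,
                                      IID_PPV_ARGS(&lists[i]));
            workers.emplace_back([&, i] {
                // ... record this thread's share of the frame here ...
                lists[i]->Close();
            });
        }
        for (auto& w : workers) w.join();

        // One submission; the GPU consumes the lists in the order given.
        std::vector<ID3D12CommandList*> raw;
        for (auto& l : lists) raw.push_back(l.Get());
        queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
    }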

    The only way these benchmarks make sense for NVidia is if you believe that Maxwell v2 is already at maximum utilization in DX11, so DX12 has no impact.

    For the Radeons, it's the opposite. They were vastly underutilized in DX11, and now with DX12, they have a new lease on life.
     
    Last edited: Feb 25, 2016
  9. fantaskarsef

    fantaskarsef Ancient Guru

    Messages:
    15,636
    Likes Received:
    9,512
    GPU:
    4090@H2O
    Well, I personally haven't seen any Nvidia reps in here in quite a long time; I'd be very interested in their statement too ;)


    Since I personally don't think DX12 will be a failed API, I'm still surprised Nvidia is keeping this quiet about the whole thing. My personal guess is that while Maxwell (and probably Pascal) are excellent cards for DX11, they are so thoroughly optimized for DX11 that asynchronous functions have basically been ignored, since they saw little use in DX11 so far. Only with architectures designed after the DX12 specs were finalized will we see Nvidia catch up. That's my personal guess.
     
  10. dr_rus

    dr_rus Ancient Guru

    Messages:
    3,876
    Likes Received:
    1,014
    GPU:
    RTX 4090
    There is no such thing as "not supporting asynchronous compute", as it is a core part of the D3D12 specs. There are different implementations of it, however, all of which are correct as long as they produce the proper final result.

    Aside from that... What makes you think that other DX12 titles will even benefit from it? What makes you think that NV's GPUs will even benefit from it in these other titles?

    Could you point me to the Fable Legends benchmark comparing DX12 to DX11 on NV GPUs?

    As for the rest of what you're saying - the only thing which _may_ make a GPU-limited scene render faster in DX12 compared to DX11(.3) is multi-engine, aka async compute. There are no other technical reasons why a DX11.3 renderer should be inherently slower than a D3D12 renderer when the GPU is the limiting factor.
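
    To be concrete about what the spec actually mandates, here's a minimal sketch (pass names are made up): the app only pins down cross-queue ordering with fences, and a driver that overlaps the two queues is exactly as conformant as one that serializes them.

    Code:
    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    void SubmitFrame(ID3D12Device* device,
                     ID3D12CommandQueue* gfxQueue,
                     ID3D12CommandQueue* computeQueue,
                     ID3D12CommandList* gBufferPass,   // hypothetical passes
                     ID3D12CommandList* asyncCompute,
                     ID3D12CommandList* lightingPass)
    {
        ComPtr<ID3D12Fence> fence;
        device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

        // Independent work goes to separate engines; it MAY overlap.
        gfxQueue->ExecuteCommandLists(1, &gBufferPass);
        computeQueue->ExecuteCommandLists(1, &asyncCompute);

        // The lighting pass consumes the compute results, so the graphics
        // queue must wait on the compute queue's fence. A driver that ran
        // the compute list serially before this point is equally in spec.
        computeQueue->Signal(fence.Get(), 1);
        gfxQueue->Wait(fence.Get(), 1);
        gfxQueue->ExecuteCommandLists(1, &lightingPass);
    }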

    So what NV cards are showing is that their DX11 driver implementation is so good that even a stress test like AotS can't push higher in DX12 than what NV's programmers have achieved in their DX11 driver. This is a benefit every NV card user has been enjoying for the last four years, while AMD card users have been waiting for the new APIs which _could_ fix AMD's inability to create a decent DX11 GPU+driver combination.

    There's nothing strange in AMD cards getting performance boosts in DX12 while NV's cards basically remain at the same level as in DX11. AMD's DX11 driver is rubbish, and their GPUs actually require a hack, namely concurrent compute execution, to reach their peak performance potential. NV's GPUs hit the same peak left and right in a straight graphics workload, so running some compute concurrently with graphics may lead to performance decreases or a lack of gains. I fully expect NV to keep the same approach in future architectures, as being able to reach the peak without needing to run something in parallel is an advantage of their architectures, not a problem. They do need to make sure that, in the future, running compute concurrently with graphics won't lead to a performance loss, though.

    In any case - yes, AMD's current h/w will benefit from DX12 more than NV's on average, especially in GPU-limited scenarios. Mostly because their DX11 driver was and is, well, crap, and partly because their GPUs _need_ to run something asynchronously to reach their peak performance. This is just how it is; if that's something you can't handle, then you can go and buy yourself a Fury or something.

    As for the benefits of DX12 in CPU-limited scenarios - which is the main reason for DX12 to exist, really - NV's h/w doesn't have any issues there:

    [image: benchmark chart]
     

  11. OrdinaryOregano

    OrdinaryOregano Guest

    Messages:
    433
    Likes Received:
    6
    GPU:
    MSI 1080 Gaming X
    Yeah, I am aware of what async compute means, but my point was that it's not really useful to ask this here on Guru3D; most people here are hardware enthusiasts, not necessarily rendering/graphics API enthusiasts. Beyond3D might be the better place to talk about these things.
     
  12. Carfax

    Carfax Ancient Guru

    Messages:
    3,948
    Likes Received:
    1,442
    GPU:
    Zotac 4090 Extreme
    Nothing I suppose. I guess I've succumbed to the hype! :giggle2:

    *Edit* After thinking about it some more: since the consoles have GCN hardware, asynchronous compute is all but guaranteed to play a prominent role in upcoming DX12 titles and engines.

    Devs need to use AC to squeeze as much juice as they can from the consoles, and so some of that effort will naturally flow over into the PC platform as well.


    Other upcoming DX12 titles have already announced that they will use AC, such as Gears of War Ultimate. Rise of the Tomb Raider's DX12 patch will likely incorporate it as well, since the Xbox One version uses it heavily.

    But as I mentioned before, AC alone cannot account for the gap between the Fury X and the 980 Ti in AotS. Either NVidia's DX12 driver still needs further refinement when it comes to exploiting DX12's CPU-side enhancements, or the 980 Ti is already tapped out under DX11 and thus has no further headroom.

    The latter would surprise me greatly to be honest, if it were true.

    LOL I'm already subscribed to the DX12 performance thread on Beyond3d forum :nerd:
     
    Last edited: Feb 25, 2016
  13. dr_rus

    dr_rus Ancient Guru

    Messages:
    3,876
    Likes Received:
    1,014
    GPU:
    RTX 4090
    "Heavily" is a rather strong word for something which provides up to 20% of additional performance at best on a GPU which simply can't reach the same performance within single graphics workload. I don't expect for async compute to have a heavy impact on performance landscape especially as I think that NV will be able to somewhat "hack it" with drivers and will extract some performance from it as well eventually. In any case we're probably looking at ~5% performance increase on average which is hardly something to go bragging about.

    Also, it's worth remembering that, because of how difficult this feature is to implement properly across lots of different hardware configurations, there are cases even on AMD's own hardware where running compute concurrently leads to a performance loss:

    [image: benchmark chart]
    (Note the FuryX results in 1080p.)

    AotS performance is obviously skewed in favor of GCN h/w, as the game performs rather well on it even in DX11. This is a clear case of an engine being optimized for AMD first and foremost - which isn't really surprising considering where it started and how, and by whom, it was promoted at first. I'm waiting for less biased titles to arrive with DX12 support before drawing any conclusions.

    Why? Maxwell is a nice update of Kepler in h/w, but from a logical point of view they are rather similar, so Maxwell can easily be maxed out in DX11 after three years of Kepler driver improvements. It is also very possible that NV's DX12 driver still needs lots of work.
     
  14. khanmein

    khanmein Guest

    Messages:
    1,646
    Likes Received:
    72
    GPU:
    EVGA GTX 1070 SC
    There are still a lot of DX11 games out now. Maxwell is the early-stage beta for DX12, while Pascal is the entry level. Vulkan will mature alongside DX12.
     
  15. Barry J

    Barry J Ancient Guru

    Messages:
    2,803
    Likes Received:
    152
    GPU:
    RTX2080 TRIO Super
    I wonder if NVidia will offload async work to a 2nd GPU, just like PhysX, or to the CPU if you don't have a 2nd GPU (as a software implementation).
     

  19. dr_rus

    dr_rus Ancient Guru

    Messages:
    3,876
    Likes Received:
    1,014
    GPU:
    RTX 4090
    This would make zero sense whatsoever, as even running the async compute queues serially on the same GPU should be a lot faster than dedicating a whole GM200 to the tiny compute jobs that are submitted asynchronously.
     
  20. dr_rus

    dr_rus Ancient Guru

    Messages:
    3,876
    Likes Received:
    1,014
    GPU:
    RTX 4090
    Asynchronous shading (aka D3D12 multi-engine) is always enabled on all DX12 h/w. What may be disabled globally right now is the concurrent execution of asynchronous compute queues. Both concurrent and serial execution are perfectly within spec.

    I fully expect NV to enable support for concurrent compute on a per-title basis, as running unoptimized compute queues concurrently will lead to worse performance than running them serially. Which means they'll enable the feature only for titles where it benefits them.
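
    The driver-side switch isn't something apps ever see, but an analogous serial-vs-concurrent choice exists on the application side; a sketch of what that could look like (the flag and helper are hypothetical; command lists are typed, so each path records into a matching list):

    Code:
    #include <d3d12.h>

    // Hypothetical stand-in: record whatever Dispatch() calls the pass needs.
    static void RecordComputeWork(ID3D12GraphicsCommandList* list)
    {
        // e.g. list->SetPipelineState(...); list->Dispatch(x, y, z);
    }

    void SubmitCompute(ID3D12CommandQueue* gfxQueue,
                       ID3D12CommandQueue* computeQueue,
                       ID3D12GraphicsCommandList* gfxList,      // TYPE_DIRECT
                       ID3D12GraphicsCommandList* computeList,  // TYPE_COMPUTE
                       bool allowConcurrent)                    // app-side flag
    {
        if (allowConcurrent) {
            // Separate engine: the driver MAY overlap this with graphics.
            RecordComputeWork(computeList);
            computeList->Close();
            ID3D12CommandList* lists[] = { computeList };
            computeQueue->ExecuteCommandLists(1, lists);
        } else {
            // Same queue: guaranteed serial with surrounding graphics work.
            RecordComputeWork(gfxList);
            // gfxList is closed and submitted with the rest of the frame.
        }
        // Results must be identical either way; only timing can differ.
    }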

    The only Kepler GPU which may theoretically support concurrent compute is GK110. I think they'll just support Maxwell for this, as supporting one old GPU from an old lineup to get ~5% of additional performance doesn't sound like something worth pursuing.
     
