AMD: “There’s no such thing as ‘full support’ for DX12 today”

Discussion in 'Frontpage news' started by (.)(.), Sep 1, 2015.

  1. Noisiv

    Noisiv Ancient Guru

    Messages:
    8,230
    Likes Received:
    1,494
    GPU:
    2070 Super
    Is it fair to say that, according to Oxide, Maxwell does not do well with Oxide's async shader implementation?

    And isn't it a leap to conclude from that alone, without knowing the internals, that Maxwell does not support async shaders in general?
     
  2. DmitryKo

    DmitryKo Master Guru

    Messages:
    428
    Likes Received:
    152
    GPU:
    ASRock RX 7800 XT
    Again, this "async compute" is not an API feature - it's not an optional capability that can be exposed to the API programmer. This is a WDDM driver/DXGK feature which can improve performance in GPU-bound scenarios. Developers would just use compute shaders for lighting and global illumination, and in AMD implementation there are 2 to 8 ACE (asynchronous compute engine) blocks which are dedicated command processors that completely bypass the rasterization/setup engine for compute-only tasks. In theory this means additional compute performance without stalling the main graphics pipeline.

    Parallel execution is actually a built-in feature of Direct3D 12 - it's called "synchronization and multi-engine". There are three sets of functions for copy, compute and rendering, and these tasks can be parallelized by the runtime and driver when you have the right hardware. You just submit your compute shaders to the Direct3D runtime using the usual API calls, and on high-end AMD hardware with additional ACE blocks you may use larger and more complex shaders and/or create additional command queues from multiple CPU threads. This will saturate the compute pipeline, and you should still get fair performance gains compared to the traditional rendering path.
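    A minimal sketch of what that looks like on the API side (my own illustration, assuming an already-created ID3D12Device; not code from Oxide or any particular engine): you create a separate command queue per engine type, and the runtime/driver may overlap them on capable hardware.

    ```cpp
    // Sketch: one command queue per D3D12 engine type. How much of this actually
    // runs in parallel is decided by the driver and hardware, not the API.
    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    HRESULT CreateEngineQueues(ID3D12Device* device,
                               ComPtr<ID3D12CommandQueue>& graphicsQueue,
                               ComPtr<ID3D12CommandQueue>& computeQueue,
                               ComPtr<ID3D12CommandQueue>& copyQueue)
    {
        D3D12_COMMAND_QUEUE_DESC desc = {};

        // "3D" engine: accepts graphics, compute and copy work.
        desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        HRESULT hr = device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphicsQueue));
        if (FAILED(hr)) return hr;

        // Compute engine: compute and copy work, can run alongside the 3D queue.
        desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
        hr = device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
        if (FAILED(hr)) return hr;

        // Copy engine: copy (DMA) work only.
        desc.Type = D3D12_COMMAND_LIST_TYPE_COPY;
        return device->CreateCommandQueue(&desc, IID_PPV_ARGS(&copyQueue));
    }
    ```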


    So when Oxide said they had to query hardware IDs for Nvidia cards and then disable some features in the rendering path, it makes sense. When they talk about console developers getting 30% gains by using "async compute" - i.e. using compute shaders to accelerate lighting calculations in parallel with the main rendering work - that makes sense as well.
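    For what it's worth, that kind of hardware-ID check is trivial on the application side - a rough sketch via DXGI (the helper name is mine, not Oxide's):

    ```cpp
    // Sketch: read the PCI vendor ID of the default adapter through DXGI 1.4.
    #include <dxgi1_4.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    bool DefaultAdapterIsNvidia()
    {
        ComPtr<IDXGIFactory4> factory;
        if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
            return false;

        ComPtr<IDXGIAdapter1> adapter;
        if (FAILED(factory->EnumAdapters1(0, &adapter)))  // first enumerated adapter
            return false;

        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        return desc.VendorId == 0x10DE;  // PCI vendor IDs: 0x10DE = NVIDIA, 0x1002 = AMD
    }
    ```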

    But when Oxide says that the 900-series (Maxwell 2) don't have the required hardware while the Nvidia driver still exposes an "async compute" capability, I don't think they can really tell that for sure. This feature would be exposed through DXGK (DirectX Graphics Kernel) driver capability bits, and those are driver-level interfaces visible only to DXGI and the Direct3D runtime, not to the API programmer (and the MSDN hardware developer documentation for WDDM 2.0 and DXGI 1.4 does not exist yet).

    They are probably wrong on hardware support too, since Nvidia asserted to AnandTech that the 900-series have 32 scheduling blocks, of which 31 can be used for compute tasks.

    So if Nvidia really asked Oxide to disable the parallel rendering path in their in-game benchmark, that has to be some driver problem rather than missing hardware support. The Nvidia driver probably doesn't expose the "async" capabilities yet, so the Direct3D runtime cannot parallelize the compute tasks, or the driver is simply not fully optimized yet... I'm not really sure, but it would take an enormous effort to investigate even with full access to the source code.
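    To picture where the driver sits in all this, here is roughly how an application structures that parallel path (again just an illustrative sketch with my own names); even when the work is split across queues like this, whether the two engines actually overlap is entirely up to the driver and hardware.

    ```cpp
    // Sketch: per-frame submission to the compute and graphics queues, with a
    // fence so only the dependent pass waits on the async compute results.
    #include <d3d12.h>

    void SubmitFrame(ID3D12CommandQueue* graphicsQueue,
                     ID3D12CommandQueue* computeQueue,
                     ID3D12GraphicsCommandList* asyncComputeList,  // e.g. lighting/GI
                     ID3D12GraphicsCommandList* gbufferList,       // independent graphics work
                     ID3D12GraphicsCommandList* shadingList,       // consumes compute results
                     ID3D12Fence* fence,
                     UINT64& fenceValue)
    {
        // Kick the compute work off on its own engine and signal when it finishes.
        ID3D12CommandList* compute[] = { asyncComputeList };
        computeQueue->ExecuteCommandLists(1, compute);
        computeQueue->Signal(fence, ++fenceValue);

        // Graphics work that does not need the compute results can run in parallel.
        ID3D12CommandList* gbuffer[] = { gbufferList };
        graphicsQueue->ExecuteCommandLists(1, gbuffer);

        // Only the pass that consumes the compute output waits on the fence.
        graphicsQueue->Wait(fence, fenceValue);
        ID3D12CommandList* shading[] = { shadingList };
        graphicsQueue->ExecuteCommandLists(1, shading);
    }
    ```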
     
    Last edited: Sep 2, 2015
  3. sykozis

    sykozis Ancient Guru

    Messages:
    22,492
    Likes Received:
    1,537
    GPU:
    Asus RX6700XT
    Thank you for that explanation, Dmitry. I was hoping to find something useful in this thread. Most of it appears to be the typical AMD vs Nvidia garbage...
     
  4. rl66

    rl66 Ancient Guru

    Messages:
    3,924
    Likes Received:
    839
    GPU:
    Sapphire RX 6700 XT
    If you pay for the port and another 50 EUR or so, OK.
     

  5. Mostafa Hijazi

    Mostafa Hijazi Guest

    Messages:
    705
    Likes Received:
    0
    GPU:
    Evga GTX 980 Ti SC+ SLI
    +1.

    Thanks a lot, Dmitry. The best comment I have read since this crap storm started.
     
  6. Turanis

    Turanis Guest

    Messages:
    1,779
    Likes Received:
    489
    GPU:
    Gigabyte RX500
    Thanks, DmitryKo, for sharing that info.

    Some useful Charts about GCN:
    http://news.softpedia.com/news/AMD-Hawaii-GPU-Diagram-Leaked-Shows-Four-Shader-Engines-390754.shtml

    Maxwell 2 can handle compute within its 31+1 task limit; beyond that it becomes much harder (practically impossible) to get through the whole compute queue in time, so it bottlenecks. All of its compute is executed serially.
    GCN can handle compute with a 64-task limit, which is much better than Maxwell and more consistent. If there are more than 64 tasks, the ACEs (Async Compute Engines), which take on tasks as independent schedulers, can handle the extra work without a time penalty.
    All of its compute is executed in parallel.

    Some short explanations:
    http://www.overclock.net/t/1569897/various-ashes-of-the-singularity-dx12-benchmarks/1710#post_24368195

    https://www.reddit.com/r/pcgaming/comments/3j87qg/nvidias_maxwell_gpus_can_do_dx12_async_shading/
    Great discussions in those threads.

    And I really hope for a real answer from Nvidia about this issue, not an answer like the "misunderstanding between our teams" one we got for the GTX 970 memory issue.
     
    Last edited: Sep 2, 2015
  7. alanm

    alanm Ancient Guru

    Messages:
    12,236
    Likes Received:
    4,437
    GPU:
    RTX 4080
    Currently, as it stands, it does appear AMD's GCN parts are better positioned to make the most of DX12 capabilities/features, especially async compute/shaders. I don't care much at this point in time, but if Pascal doesn't address the issue, I could be switching sides.

    Of course, it's too early to tell from one alpha-stage game whether other DX12 titles will rely as heavily on async shaders as Oxide did. I just hope the best DX12 features/capabilities are put to their best use by GPU makers and that one side does NOT pressure devs to exclude things that legitimately improve a game's looks and performance.
     
  8. Turanis

    Turanis Guest

    Messages:
    1,779
    Likes Received:
    489
    GPU:
    Gigabyte RX500
    And I really hope that new DX12 games don't disable async compute and don't get "optimized" the way some GameWorks titles were, which don't run very well even on some Nvidia cards.

    Pascal may well be great, but we know nothing about its internal architecture yet.
     
  9. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    We know that 4 x 6 = 10

    j/k
     
  10. Dazz

    Dazz Maha Guru

    Messages:
    1,010
    Likes Received:
    131
    GPU:
    ASUS STRIX RTX 2080
    A few problems with that, though.

    1# DX12 is meant to streamline coding and make it simpler, to help games come to market quicker and to tap the power of the GPU without always having to go through the driver, avoiding the DX11 mess of pure driver optimisations. If their async implementation doesn't meet Microsoft's guidelines, then it's hardly DX12 compliant!

    2# Lower overhead: games get direct access to GPU resources without the CPU overhead of talking to the driver; otherwise it's just multithreaded DX11 rather than a low-level, to-the-metal API.

    3# The developers say they have had a lot more optimisation help from Nvidia than from AMD on this game, although it's marketed as an AMD game because it originally started out as a DX11 and Mantle product, with DX12 added shortly after. Then again, you could say every Unreal Engine game out there is an Nvidia-marketed product, with GameWorks incorporated directly into the newer engine, which is also biased, so I fail to see your argument there. Biased or not, developers need to streamline their games so they run across lots of different configurations; ignore those and your game really isn't going to sell very well - a great way to destroy your company, really.

    At least with GameWorks, AMD can get around the performance issues by tweaking the tessellation level in the drivers. Sadly, if you have an Nvidia product that isn't Maxwell you're well and truly screwed - it's just a paperweight at that point unless you disable the feature.

    Looking at this more, I think the problem for Nvidia is that there are too many instructions for it to cope with. I think RTS games will need a lot more compute power, since each unit has multiple sources, and as you add more and more units (remember there will be hundreds or even thousands of units at once) it will become very intensive - far more so than any FPS.
     

  11. Denial

    Denial Ancient Guru

    Messages:
    14,206
    Likes Received:
    4,118
    GPU:
    EVGA RTX 3080
    I don't really understand this, or why it matters.

    We have an RTS game with tons of compute: Ashes of the Singularity. It has a benchmark that's literally designed to push massive compute utilizing async shaders. The 980 Ti and the Fury X tie in performance. So how does Nvidia have a problem with too many instructions?

    Like, OK, maybe if we pushed even more instructions - say, double what AotS does - maybe then AMD would show significant gains, but what game is doing that, and why? RTS is about the most demanding genre for this type of workload, AotS is literally the pinnacle of it, and it's a non-issue for Nvidia. So why do people keep pushing this like it's going to have a major impact?

    On the flip side, let's just say Nvidia's tessellation performance sucked. Then all of a sudden a driver came out that made it 1203910293120x faster than AMD's. Then a game came out that just had massive geometry smothered in tessellation - literally HairWorks for days, hair everywhere, x64 tessellation. And at the end of the benchmark for this game: 980 Ti - 41.7 FPS, Fury X - 39.2. Would you care? Would any AMD owner care? Would the fate of their card suddenly be in jeopardy? Would their $650 purchase have gone to waste?
     
    Last edited: Sep 2, 2015
  12. Fender178

    Fender178 Ancient Guru

    Messages:
    4,194
    Likes Received:
    213
    GPU:
    GTX 1070 | GTX 1060
    To me this shouldn't be a problem, because DX12 is still in its early stages; it will take a while for any card to fully support all of its features.
     
  13. Redemption80

    Redemption80 Guest

    Messages:
    18,491
    Likes Received:
    267
    GPU:
    GALAX 970/ASUS 970
    Exactly, I don't think there are any games coming out anytime soon that use all the DX12 features.

    Denial, I didn't know the 980 Ti and Fury X were pretty much tied when it came to AotS; I never saw the two compared in any of the reviews. I just looked them up, and there really is very little difference.
    If that is with async compute enabled on the Fury X and disabled on the 980 Ti, it makes me curious how much of a difference it is actually making.
    I would like to see a comparison on AMD hardware with it on and off, to see how much real-world performance is gained.
     
