AMD: Asynchronous shaders in GCN handy with DirectX 12

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, Mar 31, 2015.

  1. Hilbert Hagedoorn

    Hilbert Hagedoorn Don Vito Corleone Staff Member

    Messages:
    48,317
    Likes Received:
    18,405
    GPU:
    AMD | NVIDIA
  2. Turanis

    Turanis Guest

    Messages:
    1,779
    Likes Received:
    489
    GPU:
    Gigabyte RX500
    "All graphics cards based on GCN architecture can now handle multiple command instructions and data flows simultaneously which is managed by compute-engines called ACEs. Each queue can pass instructions without the need to wait on other tasks. That will keep your GPU 100% active as the work-flow is prioritized and thus always available.

    Eight ACE (Asynchronous Compute Engine) units are available per GPU, which can manage eight waiting queues with direct access to the L2 cache and to what is called 'global share data'. There really are multiple advantages to be found here: an overall more efficient rendering experience, meaning a higher FPS, and, thanks to the decreased latency, a better fit for Virtual Reality gaming."

    Yeah GCN all the way, compute all the way, double precision all the way. :)
    We'll see in the future if the devs can handle ACE (Asynchronous Compute Engine). Can't wait.
     
  3. OnnA

    OnnA Ancient Guru

    Messages:
    17,787
    Likes Received:
    6,687
    GPU:
    TiTan RTX Ampere UV
    Yep! That's GCN :banana:
    And of course devs can (must) handle it because of PS4 and Xbox One.
    So porting will be much easier, plus we PC gamers get a better gaming experience overall.
    So DX12 (with GCN 12_3) will benefit all MS platforms.
    Fingers crossed :nerd:
     
  4. Noisiv

    Noisiv Ancient Guru

    Messages:
    8,230
    Likes Received:
    1,494
    GPU:
    2070 Super
    http://www.hardware.fr/news/14133/gdc-d3d12-amd-parle-gains-gpu.html
     

  5. fantaskarsef

    fantaskarsef Ancient Guru

    Messages:
    15,636
    Likes Received:
    9,512
    GPU:
    4090@H2O
  6. Denial

    Denial Ancient Guru

    Messages:
    14,201
    Likes Received:
    4,105
    GPU:
    EVGA RTX 3080
    Maxwell can do 32 Queues, 290 can do 64.
     
  7. Enticles

    Enticles Guest

    Messages:
    242
    Likes Received:
    10
    GPU:
    Asus RTX 3070ti
    this sounds like a GPU version of hyperthreading to me. Happy to see the efficiency improving - and not just for team red or green. so everyone who has a compatible card regardless of vendor will be able to enjoy these improvements.

    can't wait to see the real life results :)
     
  8. Hootmon

    Hootmon Guest

    Messages:
    1,231
    Likes Received:
    6
    GPU:
    XFX THICC III Ultra
    Spiffy!
     
  9. sykozis

    sykozis Ancient Guru

    Messages:
    22,492
    Likes Received:
    1,537
    GPU:
    Asus RX6700XT
    No. Each shader processor will still only be able to execute 1 thread at a time, whereas hyperthreading allows a single processor core to execute 2 threads. They're just finally implementing true simultaneous multi-threading for GPUs... and doing so through software. There are enough shader processors within a GPU that hyperthreading really isn't needed. My GTX 970, for example, has 1664 shader processors (or CUDA cores, as Nvidia calls them).

    GPUs, under DX11 and OpenGL, are essentially "In-Order" processors where data is processed in the exact order it's received. With DX12 and "Vulkan", the GPU will function more like an "Out-of-Order" processor where instructions are prioritized and executed in order of importance.
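
As a rough illustration of the "in-order" vs. "prioritized" analogy in this post, here is a toy scheduler in Python. It is only a sketch of the idea, not how any driver or GPU actually dispatches work; the task names and priority values are made up for the example.

```python
import heapq
from collections import deque

# Hypothetical work items: (name, priority); a lower number means more urgent.
TASKS = [
    ("shadow_pass", 2),
    ("physics_compute", 3),
    ("vr_timewarp", 0),
    ("post_fx", 1),
]

def in_order(tasks):
    """Process tasks strictly in arrival order (FIFO)."""
    queue = deque(tasks)
    order = []
    while queue:
        name, _priority = queue.popleft()
        order.append(name)
    return order

def by_priority(tasks):
    """Process the most urgent task first; arrival index breaks ties."""
    heap = [(priority, index, name) for index, (name, priority) in enumerate(tasks)]
    heapq.heapify(heap)
    order = []
    while heap:
        _priority, _index, name = heapq.heappop(heap)
        order.append(name)
    return order

print(in_order(TASKS))
print(by_priority(TASKS))
```

In the FIFO version the latency-sensitive `vr_timewarp` item waits behind two other tasks; in the priority version it runs first.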
     
  10. FerCam™

    FerCam™ Guest

    Messages:
    241
    Likes Received:
    4
    GPU:
    MSI Gaming GTX980
    humm nice to know, just got a gtx980 from a 290x RMA, due to the store not selling 290x anymore...
     

  11. Lane

    Lane Guest

    Messages:
    6,361
    Likes Received:
    3
    GPU:
    2x HD7970 - EK Waterblock
    it seems a bit different than that if i understand it well...

    Ok confirmed by Anandtech:

     
    Last edited: Mar 31, 2015
  12. anxious_f0x

    anxious_f0x Ancient Guru

    Messages:
    1,900
    Likes Received:
    610
    GPU:
    ASUS TUF RTX 4090
    It's an interesting way of doing things. Let's hope it's actually utilised by developers on both PC and console; it certainly puts the PS4 in a good position with its 8 ACEs.
     
  13. fantaskarsef

    fantaskarsef Ancient Guru

    Messages:
    15,636
    Likes Received:
    9,512
    GPU:
    4090@H2O
    I'm still not entirely sure I get it... doesn't that mean 64 for AMD vs. a single one with Maxwell 2 cards? That would indeed look like an advantage for AMD...
     
  14. Dazz

    Dazz Maha Guru

    Messages:
    1,010
    Likes Received:
    131
    GPU:
    ASUS STRIX RTX 2080
    Doesn't Maxwell do this anyway, but in hardware? It tries to prioritise traffic; this can clearly be seen in the Maxwell version of the 970, since it puts frequently used information on the fast memory partition and stores less-used cache data on the reserved part. In essence, nVidia should get a nice increase: if it's done in software first, the hardware can either change it based on its requirements or ignore it as being already efficient enough. AMD's solution doesn't do this, so it may benefit immensely. Time will tell tho.
     
  15. xg-ei8ht

    xg-ei8ht Ancient Guru

    Messages:
    1,820
    Likes Received:
    32
    GPU:
    1gb 6870 Sapphire
    PS4 has 8 ACEs and 64 queues.
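
For what it's worth, the 64 figure is just the number of ACEs multiplied by the queues per ACE. A minimal sketch of that arithmetic, assuming 8 queues per ACE as claimed in this thread (the function name is made up for illustration):

```python
def total_queues(num_aces, queues_per_ace):
    """Total compute queues, assuming every ACE manages the same number of queues."""
    return num_aces * queues_per_ace

# 8 ACEs with an assumed 8 queues each
print(total_queues(8, 8))
```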
     

  16. Lane

    Lane Guest

    Messages:
    6,361
    Likes Received:
    3
    GPU:
    2x HD7970 - EK Waterblock
    The two are not related. Here we are talking about asynchronous computing at the shader level (not about patching up a bad design at the memory-access level as best they can).

    I encourage you to read the article from Anandtech (just don't rely on the queue counts in the table, they seem to be wrong: 8 queues per ACE gives a total of 64 queues, not 8): http://www.anandtech.com/show/9124/amd-dives-deep-on-asynchronous-shading

    Of course, it's an architectural advantage for AMD, as GCN has basically been designed for and around it since the 1.0 iteration (HD 7970).

    As for Nvidia, they now have Maxwell, which supports it, but it will indeed not be as good at it as GCN.

    But I think that by the time we see developers taking advantage of it, the 2016-2017 GPUs will certainly be out (so Pascal).

    I'd bet that on many fronts, Pascal will look really similar to GCN.
     
    Last edited: Apr 2, 2015
  17. Spets

    Spets Guest

    Messages:
    3,500
    Likes Received:
    670
    GPU:
    RTX 4090
    Going off the chart from the article you linked, it looks like Maxwell 2 has better support than GCN. Everything up to it though does lack in comparison.
    Would be nice to see developers taking advantage of this.
     
  18. Lane

    Lane Guest

    Messages:
    6,361
    Likes Received:
    3
    GPU:
    2x HD7970 - EK Waterblock
    The table numbers are under discussion right now. It seems it is 8 queues per ACE, bringing a total of 64 queues, not 8 (following the OpenCL GCN table), but it is a bit of a play on words. Don't forget that AMD then has a second level (the ACEs are not in the SM, contrary to Nvidia).

    Again, for Maxwell this is the "compute queue". Asynchronous shading needs 3 different things to work simultaneously: graphics, compute and DMA (copy). This is where the problem lies today with DX11: you can't do them at once. AMD GCN can do this because it has always supported the 3 types simultaneously (with some limitations in the first iteration).
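
To make the graphics/compute/copy point concrete, here is a back-of-the-envelope sketch in Python. The per-frame timings are invented for illustration, and the "overlapped" figure is an idealized best case: on real hardware the three kinds of work share execution units and bandwidth, so the actual gain would land somewhere between the two numbers.

```python
# Hypothetical per-frame workloads in milliseconds (made-up numbers).
FRAME_WORK_MS = {"graphics": 10.0, "compute": 4.0, "copy": 2.0}

def serial_frame_time(work):
    """Single queue: each kind of work waits for the previous one to finish."""
    return sum(work.values())

def overlapped_frame_time(work):
    """Idealized case: three independent queues run fully in parallel."""
    return max(work.values())

print(serial_frame_time(FRAME_WORK_MS))
print(overlapped_frame_time(FRAME_WORK_MS))
```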
     
    Last edited: Apr 2, 2015
