Discussion on Async Compute from an AMD Perspective

Discussion in 'Videocards - AMD Radeon Drivers Section' started by Eastcoasthandle, Sep 20, 2018.

  1. JonasBeckman

    JonasBeckman Ancient Guru

    Messages:
    17,564
    Likes Received:
    2,962
    GPU:
    XFX 7900XTX M'310
    Hmm, that's interesting; I have even more reading to do now. It's clear there are differences between AMD's and NVIDIA's approaches, possibly affected by support (or lack of support) for certain features, and also by how the hardware and drivers shape the overall workload, such as a higher burden on the CPU, with the benefit of improved performance if the CPU can keep up.

    I haven't delved into the CPU side of things much at all either. There are utilities to really break down what's going on, though, so it might be fun to try that with some current games and see what's going where across the numerous threads that modern game engines utilize.

    EDIT: Hmm, a bit of a short reply, but I'm liking what I'm reading and watching so far; quite fascinating, really. It will be fun to study this further. :)
     
    Eastcoasthandle likes this.
  2. WareTernal

    WareTernal Master Guru

    Messages:
    269
    Likes Received:
    53
    GPU:
    XFX RX 7800 XT
    Where did you get "Fiji have 8 CUs"? Either I missed something, or you've used the wrong term. Did you mean 8 ACEs?

    Fiji (and Tonga) = GCN 1.2 (3rd gen)
    Polaris = GCN 1.3 (4th gen)
    Vega = GCN 1.4 (5th gen, aka 'Next-Generation Compute Unit')

    Fiji has 64 CUs. Vega 64 has 64 CUs, Vega 56 has 56 CUs. All GCN chips have 64 SPs per CU.
    From AMD: "Compute Units: Discrete AMD Radeon™ and FirePro™ GPUs based on the Graphics Core Next architecture consist of multiple discrete execution engines known as a Compute Unit (“CU”). Each CU contains 64 shaders (“Stream Processors”) working in unison."

    Fiji has 8 ACEs (Asynchronous Compute Engines) and 4 SEs (Shader Engines): 16 CUs per SE, and 64 SPs per CU.
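    The totals follow directly from that hierarchy; as a quick sanity check (plain arithmetic, figures taken from the post above):

    ```python
    # Fiji GCN shader hierarchy, per the figures above:
    # 4 Shader Engines, 16 CUs per SE, 64 Stream Processors per CU.
    shader_engines = 4
    cus_per_se = 16
    sps_per_cu = 64

    total_cus = shader_engines * cus_per_se   # 4 * 16 = 64 CUs
    total_sps = total_cus * sps_per_cu        # 64 * 64 = 4096 Stream Processors

    print(total_cus, total_sps)  # 64 4096
    ```

    which matches the Fury X's advertised 4096 stream processors.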

    This video may be a better illustration than the red lines.
     
    Jackalito and OnnA like this.
  3. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT
    Indeed it is. One thing that is still not 100% clear to me, and that I would like help with, is the following:

    Q: Do we see the frostbite engine as a "console port" since it works well on consoles?

    DICE's BF1 DX12 support isn't very good. I hesitate to put it that way, though, because it's not clear what is causing the problem. But no one I know who plays with me uses DX12. We all say the same thing: it causes low frame rates, hitching, stuttering, pausing, and other game-breaking slowdowns, to the point where it's simply not a usable API, and the same is reported for BFV. It's neither an AMD nor an nV issue, as users of both describe exactly the same problems with it.

    Other game engines (besides Frostbite) have worked pretty well with DX12 and Vulkan on AMD's side. BF1 is the only game I know of with such an issue, and it's been reported to DICE multiple times with no fix.

    One point of interest, though, is that the Frostbite engine "works" in FIFA 19, albeit with a decent frame-rate drop. But it's still smooth.

    FIFA 19 - DX11 vs DX12 & 1080p vs 4K - FPS Test [GTX 1060 6GB | i5 7500]




    DX11 vs. DX12 - MADDEN NFL 19 - Benchmark PC 1080p Ultra

    I believe Madden is on Frostbite as well, but I'm not 100% sure yet.



    FIFA 18. DirectX 11 vs. DirectX 12. Frostbite DX12 still sucks.


    Keep in mind this is FIFA 18, but the same performance is still observed in FIFA 19. The author of this video noticed exactly the behavior mentioned before: yes, DX12 lowers CPU utilization on AMD video cards, but it isn't gaining AMD better performance, and the same thing is observed with GeForce users (lower FPS in DX12). IMHO, it's not clear whether the async compute in the Frostbite engine is being used in parallel or concurrently. Perhaps a combination of both? I would still like insight into this.
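    On the parallel-vs-concurrent question: a toy CPU-side timing model (threads and sleeps standing in for GPU passes, not real GPU dispatch) can at least illustrate why overlapping a graphics pass and a compute pass matters compared to submitting them back-to-back:

    ```python
    import threading
    import time

    def simulate(overlap):
        """Toy model: a 'graphics' pass and a 'compute' pass, each ~50 ms.
        Back-to-back submission takes ~100 ms; overlapped submission ~50 ms."""
        def gpu_pass(name):
            time.sleep(0.05)  # stand-in for GPU work

        start = time.perf_counter()
        if overlap:
            # async-compute style: both passes in flight at once
            threads = [threading.Thread(target=gpu_pass, args=(n,))
                       for n in ("graphics", "compute")]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
        else:
            # serial style: compute waits for graphics to finish
            gpu_pass("graphics")
            gpu_pass("compute")
        return time.perf_counter() - start

    serial = simulate(overlap=False)      # roughly 0.10 s
    overlapped = simulate(overlap=True)   # roughly 0.05 s
    print(f"serial={serial:.3f}s overlapped={overlapped:.3f}s")
    ```

    Whether the overlap happens by time-slicing one set of units (concurrent) or on separate idle units (parallel) is exactly the distinction the question raises; this sketch only shows the scheduling benefit, not which mechanism Frostbite uses.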

    I didn't bother including BF1, as its DX12 issue is already known. Any insight on this is appreciated.

    I do hope the single-thread performance era is drawing to a close in favor of a more-than-2-core approach. DX12/Vulkan are
     
    Last edited: Sep 23, 2018
  4. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT

    When DX12 works, it works for AMD GPUs, unlike with the Frostbite engine.
    Although the gains are only a few FPS, AMD GPUs do benefit from DX12 making use of more CPU cores.





    PASCAL

    As you can see, once again, Pascal GPU utilization decreases in DX12 (it should be 100%).



    Shadow of the Tomb Raider GTX 970 - GTX 1060 - GTX 1070 DX11 vs DX12 1080p TAA Maxed FPS COMPARISON



    I included this for transparency.
     
    Last edited: Sep 23, 2018

  5. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
    WareTernal, yup, misspelled.
    Already corrected :eek:

    Here:
    We have Compute Units, aka CUs,
    and additional PowaH in the form of Asynchronous Compute Engines, aka ACEs.
    For Vega you add an "n" in front of them (meaning NextGen), e.g. nCU or nACE.

    Fiji uArch:

    [​IMG]

    Vega uArch:

    [​IMG]

    PS.
    Raja K. said in an AMA on Reddit that Vega is a completely new uArch (the peak of ATI tech).
     
    Last edited: Sep 23, 2018
  6. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
    Frostbite and the magic of its DX11.1 (yes, 11.0 is nV-only; they don't have a HW scheduler, as stated in other posts) and DX12.0.
    RenderDevice.Dx11Dot1Enable 1 -> this tweak is only useful on Win7/Win8.

    As such, Frostbite doesn't need DX12 right away for good Radeon utilisation.
    DX11.1 (developed by MS in collaboration with ATI) fully utilises the HW scheduler found in the GCN/nGCN uArch and can scale well, up to 8 threads (more threads needs additional in-house work, or DX12.0/12.1).
    Every DX11.0 game you point at runs under DX12 feature level 11_1 on Windows 10 (thanks go to MS for the helping hand).
    That's why pCars performs so well under Win10 compared to Se7eN :D
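    The up-to-8-threads scaling mentioned above is the deferred-context idea: worker threads each record their own command list, and only submission stays serialized. A toy Python sketch of that pattern (illustrative only; the real DX11.1 API uses deferred contexts and `FinishCommandList`, not these hypothetical names):

    ```python
    # Toy sketch of multithreaded command-list recording: 8 workers each
    # record their own list of commands in parallel, then the main thread
    # submits them in order on the "immediate context".
    from concurrent.futures import ThreadPoolExecutor

    def record_commands(thread_id):
        # stand-in for recording draw calls on a per-thread deferred context
        return [f"draw_{thread_id}_{i}" for i in range(3)]

    with ThreadPoolExecutor(max_workers=8) as pool:
        command_lists = list(pool.map(record_commands, range(8)))

    # submission stays serialized, preserving thread order
    submitted = [cmd for cl in command_lists for cmd in cl]
    print(len(submitted))  # 24
    ```

    The recording work parallelizes across cores; only the cheap final submit is single-threaded, which is where the CPU-side scaling comes from.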

    My bro has a GCN 1.0 280X with a Phenom II X4, and pCars runs OK for him on High/Medium.

    Note no. 1:
    Even pesky UE3/UE4 can benefit from that.
    Newer UE4 builds are also ready in-house for the HW scheduler as well as the SW one, so everybody's happy (porting can take less time now thanks to this, and games run well on consoles & PC).

    A good example is Vampyr, a new UE4 game, running like a charm on my Vega (64-70 with Chill, no drops),
    and the list can go on and on.

    Note no. 2:
    I can't tell you more, but wait for the December upgrade :D 18.12.1 WHQL.
    The GCN uplift continues; many of us remember -A.I. Catalyst-.
    Best for Vega, but not only ;)

    [​IMG]
     
    Last edited: Sep 23, 2018
    Eastcoasthandle likes this.
  7. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT
    Oooohhh, that explains why DX11 in BF1/V is so good: it's 11.1. Was it kept under the radar this time? I still recall DX10.1 and Assassin's Creed...
    But I still don't get what the deal is with DX12. From what you've explained, though, DX12 isn't needed, since DX11.1 addresses the most important thing AMD GPU owners want: performance.


    That's good to know. Really good to know, because they are talking about improvements to Vulkan support.
     
  8. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
    It's more like DX12_11.1 internally.
    A DX9c game nowadays runs as 12_9.3 on Win10, etc.
    The API in Windows works for us like this -> the base is 12.1, and it then scales down based on the feature level used by the game or software.
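    The "start at the top, scale down" idea sketched above can be modeled like this (hypothetical names; the real D3D12 runtime negotiates feature levels through `D3D12CreateDevice`, not this function):

    ```python
    # Illustrative sketch of feature-level fallback: the runtime's base is
    # the highest level, and a title is mapped onto it at the (possibly
    # much lower) feature level it actually requires.
    FEATURE_LEVELS = ["12_1", "12_0", "11_1", "11_0", "10_1", "9_3"]  # high -> low

    def negotiate(game_requires, runtime_base="12_1"):
        """Return the level the game runs at, or None if it needs more
        than the runtime offers."""
        if FEATURE_LEVELS.index(game_requires) >= FEATURE_LEVELS.index(runtime_base):
            return game_requires
        return None

    print(negotiate("9_3"))   # a DX9c-era title on the modern runtime
    print(negotiate("11_1"))  # Frostbite's DX11.1 path
    ```

    So an old DX9c game and a DX11.1 Frostbite title both run on the same modern runtime, each at its own feature level.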

    Since BF4, BTW, but the real gains back then were in Mantle:
    My very old config: X6 CPU + ATI 7970 GHz Edition on a 22" Compaq CRT.
    I was always gaming BF4/BFH in Mantle (Win7 times).
    Screenshot made with RadeonPRO -> it reads API info.

    [​IMG]
     
    Last edited: Sep 24, 2018
  9. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT
    Good to know. I do recall Mantle working a whole lot better for me in BF4 on Win7.
    It's rumored that Polaris 30 is coming out soon. If true, I have to wonder what efficiencies async compute brings it over the rest of AMD's GPU stack.
    I also have to wonder whether Polaris 30 is going to beat out the Fury X this time around, if true.
     
  10. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
    IMO it will trade blows at 1080p and have ~3-5 FPS lower perf at 1440p.
    If done right, it can land right beneath Vega 56
    -> cheaper & not too slow (the 149-249€ range this time around? DX12.0 GCN or 12.1 nGCN support would be nice).
     

  11. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT
    Here are the takeaways from the video found at the bottom of this post:


    [​IMG]

    The above causes what you see below with GCN (reduced utilization of the CUs and thus lower performance).

    [​IMG]



    However, if the game uses DX12/Vulkan:





    [​IMG]

    The above causes what you see with GCN below (full utilization and thus better performance).


    [​IMG]


    In other words:

    [​IMG]




    Here is an illustrative example of how Nvidia and AMD handle graphics/compute:
    [​IMG]




    Below is an important note on why AMD went with GCN:

    [​IMG]



    I found another video, albeit dated.
     
    Last edited: Sep 27, 2018
  12. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
  13. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT
    I know, right? :p

    This explains a whole lot and puts the whole GCN thing into perspective. AMD was banking that, at some future point, through their console dominance, async compute in games would become the norm.

    However, as shown with Polaris, they needed to improve on GCN. If there is a next-gen Navi gaming GPU and it is still GCN, I think we will see vast improvements to GCN, probably with fewer CUs for better OC, perhaps? It could be more dynamic, or AI-like, in how it handles CUs; well, if Sony has anything to do with it, I think that's what we might see.

    But in the meantime, the move for game engines to use async compute is where AMD would like to see the PC gaming market go. The console market has made that move, but the PC market has not, which IMO is why we get such crappy ports.
     
  14. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    That CPU thing on the nV side... that's good old NVIDIA SW magic: preparing frames before they are sent to the GPU. AMD should have done the same, at least in any game that has more than one frame in the rendering queue.
    Because when the game pushes data for a frame that is not being rendered yet, there is time for the CPU to do at least some geometry discard. And today, with cheap 6C/12T chips on AMD's side... it is kind of stupid not to use the ecosystem AMD itself is building.

    (Oh, wait... the Next Big Driver Thing list may include it too.)
     
  15. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
    All you need to do (for such a game) is turn on Enhanced Sync, and you need to have a FreeSync monitor.
    All the problems with old games simply don't exist anymore :D
     
    Last edited: Oct 19, 2018

  16. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT


    @4:04 the Vega 64 competes very well with the 2080 in Forza Horizon 4, one of the games that plays to the GCN uArch, since the engine uses async compute.
     
  17. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,839
    GPU:
    TiTan RTX Ampere UV
    and smaller chip ;)
     
  18. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT


    Are we shocked yet? This really shows what a GCN card can do when it isn't hindered.
     
  19. Eastcoasthandle

    Eastcoasthandle Guest

    Messages:
    3,365
    Likes Received:
    727
    GPU:
    Nitro 5700 XT
  20. Seren

    Seren Guest

    Messages:
    297
    Likes Received:
    16
    GPU:
    Asus Strix RX570
    I wouldn't take these results too seriously; it doesn't look like DX12 is better on either AMD or Nvidia still... It might change when they patch in RTX, though.
     

Share This Page