GCN Architecture More Friendly To Parallelism Than Maxwell

Discussion in 'Videocards - AMD Radeon' started by OnnA, Aug 23, 2015.

  1. OnnA

    OnnA Ancient Guru

    Messages:
    17,849
    Likes Received:
    6,739
    GPU:
    TiTan RTX Ampere UV
    Here is an interesting article about the future of the GCN line and the DX12/Vulkan/Mantle APIs:

    Since the release of Ashes of the Singularity, a lot of controversy has surrounded AMD’s spectacular results versus NVIDIA’s underwhelming ones. Was this DX12 benchmark gimped in order to run faster on AMD’s hardware? Apparently not, as Overclock.net member ‘Mahigan’ shed some light on why there are such dramatic differences between AMD’s and NVIDIA’s results.

    What’s also interesting here is that Mahigan has provided a number of slides to back up his claims (which is precisely why we believe this explanation is legit).

    As Mahigan pointed out, Maxwell’s Asynchronous Thread Warp can queue up 31 Compute tasks and 1 Graphics task, whereas AMD’s GCN 1.1/1.2 features 8 Asynchronous Compute Engines (each able to queue 8 Compute tasks, for a total of 64), coupled with 1 Graphics task handled by the Graphics Command Processor.
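    To make that queue model concrete, here is a minimal D3D12 sketch (my illustration, not Mahigan's; it assumes a valid ID3D12Device and omits error handling) of an application creating the one direct graphics queue plus several compute queues, the API-side counterparts of the Graphics Command Processor and the ACEs described above:

    ```cpp
    // Minimal sketch, not from the article: one direct (graphics) queue plus
    // several compute queues on a D3D12 device. 'device' is assumed valid;
    // HRESULT checks are omitted for brevity.
    #include <d3d12.h>
    #include <wrl/client.h>
    #include <vector>

    using Microsoft::WRL::ComPtr;

    std::vector<ComPtr<ID3D12CommandQueue>> CreateQueues(ID3D12Device* device)
    {
        std::vector<ComPtr<ID3D12CommandQueue>> queues;

        // One graphics ("direct") queue: the API-side counterpart of the
        // single Graphics Command Processor.
        D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
        gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        ComPtr<ID3D12CommandQueue> gfxQueue;
        device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));
        queues.push_back(gfxQueue);

        // Eight compute queues; how these map onto hardware engines
        // (e.g. GCN's 8 ACEs) is entirely the driver's decision.
        for (int i = 0; i < 8; ++i)
        {
            D3D12_COMMAND_QUEUE_DESC computeDesc = {};
            computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
            ComPtr<ID3D12CommandQueue> computeQueue;
            device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
            queues.push_back(computeQueue);
        }
        return queues;
    }
    ```

    The point is simply that DX12 exposes multiple independent queues at the API level; how well the hardware drains them in parallel is the architectural difference being discussed.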


    This basically means that in terms of parallelism, GCN GPUs should be able to surpass their direct Maxwell rivals, something we’ve been witnessing in the Ashes of the Singularity benchmark.

    It’s been known that under DX11, NVIDIA has provided better results than its rival. According to Mahigan, this is mainly because NVIDIA’s graphics cards handle Serial Scheduling better than Parallel Scheduling.

    “nVIDIA, on the other hand, does much better at Serial scheduling of workloads (when you consider that anything prior to Maxwell 2 is limited to Serial Scheduling rather than Parallel Scheduling). DirectX 11 is suited for Serial Scheduling, therefore naturally nVIDIA has an advantage under DirectX 11.”
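    For contrast, a minimal DX11-style sketch (my illustration; 'context' is assumed to be the device's single immediate context): every draw call funnels through that one ID3D11DeviceContext, which is why submission stays serial no matter how many CPU cores are free:

    ```cpp
    // Minimal contrast sketch, not from the article: in DX11 all draws go
    // through the single immediate context, so command submission is
    // inherently serial. 'context' is assumed valid.
    #include <d3d11.h>

    void DrawSceneSerially(ID3D11DeviceContext* context,
                           UINT indexCountPerObject, UINT objectCount)
    {
        for (UINT i = 0; i < objectCount; ++i)
        {
            // ... bind this object's shaders/constant buffers here ...
            context->DrawIndexed(indexCountPerObject, 0, 0);
        }
    }
    ```

    DX11 does offer deferred contexts, but their recorded command lists are still played back serially through the immediate context, which is the limitation the quote is pointing at.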

    Regarding the really curious results of DX11 and DX12 on NVIDIA’s graphics cards, Mahigan had this to say:

    “People wondering why Nvidia is doing a bit better in DX11 than DX12: that’s because Nvidia optimized their DX11 path in their drivers for Ashes of the Singularity. With DX12 there are no tangible driver optimizations because the game engine speaks almost directly to the graphics hardware, so none were made. Nvidia is at the mercy of the programmers’ talents as well as their own Maxwell architecture’s thread-parallelism performance under DX12. The developers programmed for thread parallelism in Ashes of the Singularity in order to better draw all those objects on the screen. Therefore what we’re seeing with the Nvidia numbers is the Nvidia draw-call bottleneck showing up under DX12. Nvidia works around this with its own optimizations in DX11 by prioritizing workloads and replacing shaders. Yes, the nVIDIA driver contains a compiler which re-compiles and replaces shaders which are not fine-tuned to their architecture, on a per-game basis. nVIDIA’s driver is also multi-threaded, making use of idling CPU cores in order to recompile/replace shaders. The work nVIDIA does in software under DX11 is the work AMD does in hardware under DX12, with their Asynchronous Compute Engines.”
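    That recompile-on-idle-cores mechanism can be sketched generically; everything below is hypothetical and is in no way NVIDIA's actual driver code, it just illustrates the idea of farming shader recompilation out to spare CPU threads and swapping the result in when ready:

    ```cpp
    // Conceptual sketch only; all names here are hypothetical, NOT a real
    // driver API. Shows background recompilation on spare CPU cores.
    #include <chrono>
    #include <future>
    #include <string>
    #include <unordered_map>

    struct CompiledShader { std::string microcode; };  // stand-in result type

    // Stand-in for the expensive, architecture-specific recompile step.
    CompiledShader RecompileForArchitecture(std::string source)
    {
        return CompiledShader{ std::move(source) };
    }

    class ShaderReplacer {
    public:
        // Kick off a recompile on another thread (an idle core, ideally).
        void QueueRecompile(const std::string& name, std::string source)
        {
            inflight_[name] = std::async(std::launch::async,
                                         RecompileForArchitecture,
                                         std::move(source));
        }

        // Swap in the tuned shader once the background compile is done;
        // until then the caller keeps using the original shader.
        bool TryReplace(const std::string& name, CompiledShader& out)
        {
            auto it = inflight_.find(name);
            if (it == inflight_.end() ||
                it->second.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
                return false;
            out = it->second.get();
            inflight_.erase(it);
            return true;
        }

    private:
        std::unordered_map<std::string, std::future<CompiledShader>> inflight_;
    };
    ```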

    And as for AMD’s underwhelming DX11 results, Mahigan claimed that these are mainly down to GCN’s architecture, as the cards are limited by DX11’s use of only 1-2 CPU cores for the graphics pipeline.

    “But what about poor AMD DX11 performance? Simple. AMD’s GCN 1.1/1.2 architecture is suited towards parallelism. It requires the CPU to feed the graphics card work. This creates a CPU bottleneck on AMD hardware under DX11 at low resolutions (say 1080p, and even 1600p for the Fury X), as DX11 is limited to 1-2 cores for the graphics pipeline (which also needs to take care of AI, physics etc). Replacing or re-compiling shaders is not a solution for GCN 1.1/1.2, because AMD’s Asynchronous Compute Engines are built to break down complex workloads into smaller, easier-to-process workloads. The only way around this issue, if you want to maximize the use of all available compute resources under GCN 1.1/1.2, is to feed the GPU in parallel… in come Mantle, Vulkan and DirectX 12.”
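    Feeding the GPU in parallel is precisely what the explicit APIs enable. A minimal D3D12 sketch (my illustration; it assumes a valid device, queue, and one reset command allocator per worker thread): each thread records its own command list, and submission becomes a single cheap call at the end:

    ```cpp
    // Minimal sketch, not from the article: parallel command-list recording
    // in D3D12. 'device', 'queue' and the per-thread allocators are assumed
    // valid; error handling is omitted for brevity.
    #include <d3d12.h>
    #include <wrl/client.h>
    #include <thread>
    #include <vector>

    using Microsoft::WRL::ComPtr;

    void RecordInParallel(ID3D12Device* device,
                          ID3D12CommandQueue* queue,
                          std::vector<ID3D12CommandAllocator*>& allocators)
    {
        const size_t n = allocators.size();
        std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(n);
        std::vector<std::thread> workers;

        for (size_t i = 0; i < n; ++i)
        {
            workers.emplace_back([&, i] {
                // Each thread owns its allocator and command list: there is
                // no shared immediate context, hence no single-core
                // submission bottleneck.
                device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                          allocators[i], nullptr,
                                          IID_PPV_ARGS(&lists[i]));
                // ... record draw calls for this thread's slice of the scene ...
                lists[i]->Close();
            });
        }
        for (auto& w : workers) w.join();

        // Submission itself is one cheap call once recording is done.
        std::vector<ID3D12CommandList*> raw(n);
        for (size_t i = 0; i < n; ++i) raw[i] = lists[i].Get();
        queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
    }
    ```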

    This is definitely interesting and helps explain why Ashes of the Singularity performs so well on AMD’s GPUs.

    Do note that draw calls are not a game’s only possible bottleneck under DX12. Both 3DMark’s DX12 benchmark and Ashes of the Singularity issue a lot of draw calls; a game may instead hit a Geometry or Raster Operations (ROP) bottleneck, in which case an NVIDIA GPU will outperform an AMD GPU.



    Ultimately this means that NVIDIA will have to redesign its graphics cards in order to handle more draw calls in parallel. A software solution sounds almost impossible at this stage, though NVIDIA’s engineers may come up with some interesting techniques to overcome this limitation. That, or some DX12 games may hit another bottleneck that favours NVIDIA’s GPUs over AMD’s.

    "People wondering why Nvidia is doing a bit better in DX11 than DX12. That's because Nvidia optimized their DX11 path in their drivers for Ashes of the Singularity. With DX12 there are no tangible driver optimizations because the Game Engine speaks almost directly to the Graphics Hardware. So none were made. Nvidia is at the mercy of the programmers talents as well as their own Maxwell architectures thread parallelism performance under DX12. The Devellopers programmed for thread parallelism in Ashes of the Singularity in order to be able to better draw all those objects on the screen. Therefore what were seeing with the Nvidia numbers is the Nvidia draw call bottleneck showing up under DX12. Nvidia works around this with its own optimizations in DX11 by prioritizing workloads and replacing shaders. Yes, the nVIDIA driver contains a compiler which re-compiles and replaces shaders which are not fine tuned to their architecture on a per game basis. NVidia's driver is also Multi-Threaded, making use of the idling CPU cores in order to recompile/replace shaders. The work nVIDIA does in software, under DX11, is the work AMD do in Hardware, under DX12, with their Asynchronous Compute Engines.

    But what about poor AMD DX11 performance? Simple. AMDs GCN 1.1/1.2 architecture is suited towards Parallelism. It requires the CPU to feed the graphics card work. This creates a CPU bottleneck, on AMD hardware, under DX11 and low resolutions (say 1080p and even 1600p for Fury-X), as DX11 is limited to 1-2 cores for the Graphics pipeline (which also needs to take care of AI, Physics etc). Replacing shaders or re-compiling shaders is not a solution for GCN 1.1/1.2 because AMDs Asynchronous Compute Engines are built to break down complex workloads into smaller, easier to work, workloads. The only way around this issue, if you want to maximize the use of all available compute resources under GCN 1.1/1.2, is to feed the GPU in Parallel... in comes in Mantle, Vulcan and Direct X 12.

    "People wondering why the Fury X did so poorly at 1080p in DirectX 11 titles? That's your answer.


    A video which talks about Ashes of the Singularity in depth: https://www.youtube.com/watch?v=t9UACXikdR0


    PS. Don't count on better DirectX 12 drivers from nVIDIA. DirectX 12 is closer to the metal and it's all on the developer to make efficient use of both nVIDIA's and AMD's architectures."


    -> Here is the Overclock.net post -> http://www.overclock.net/t/1569897/...singularity-dx12-benchmarks/400#post_24321843

    So will there be gains for nV Maxwell from new drivers in DX12...
    Is it all true?

    Feel free to join the discussion, Radeon/nV users are welcome :)
    Keep to the topic and have a great conversation.


     
    Last edited: Aug 23, 2015
  2. DiceAir

    DiceAir Maha Guru

    Messages:
    1,369
    Likes Received:
    15
    GPU:
    Galax 980 ti HOF
    I was expecting AMD to have a much better boost than Nvidia. I can just see Nvidia users crying and saying they will fix it, but like you said, drivers won't do much as DX12 is closer to the metal than DX11 was. I've been telling people that AMD will benefit greatly from this as their DX11 driver is way below par, and now we get to see the true power of AMD cards. They then argued that Nvidia will benefit just as much as AMD, but seeing as the drivers are not making full use of the hardware, we should see a bigger boost and might even do better than Nvidia. Well, just as I thought, AMD does have amazing cards; they're just not being fully used.

    This test just makes me a little bit excited. I can't wait until they do multi-GPU tests. I really hope they utilize the integrated GPU + dedicated GPU(s) and don't limit this to just single-GPU configs. Imagine CrossFire with the integrated GPU doing less intensive stuff.

    I'm sure Pascal will be more DX12-focused, as there's no point focusing on DX11 anymore. Only time will tell; until we get a proper DX12 game we can only speculate.
     
  3. isidore

    isidore Guest

    Messages:
    6,276
    Likes Received:
    58
    GPU:
    RTX 2080TI GamingOC
    If AMD doesn't invest in Gaming Evolved titles, we won't see that performance boost. Think about it: it's all in the hands of the devs now. An NVIDIA-ready game means the engine will be more optimized for NVIDIA's architecture; it doesn't matter that the GCN architecture is friendlier to DX12. Ashes of the Singularity is an AMD game, and we can see how well AMD hardware performs in DX12 or Mantle.
     
  4. k3vst3r

    k3vst3r Ancient Guru

    Messages:
    3,702
    Likes Received:
    177
    GPU:
    KP3090
    Unless it's a PC-only game, devs are going to be programming for GCN, since all three consoles use AMD hardware (PS4, XBONE and Wii U). In cross-platform ports, Nvidia optimizations will be an afterthought unless NV starts throwing cash incentives their way.
     

  5. Noisiv

    Noisiv Ancient Guru

    Messages:
    8,230
    Likes Received:
    1,494
    GPU:
    2070 Super
    The Ashes of the Singularity DX12 test is a benchmark of an early beta.
    I would not read too much into the results of a single game.

    And it's not like there is anything to write home about, except the god-awful AMD DX11 performance:

    [pcper DX11/DX12 comparison chart]

    An early beta made by an AMD-sponsored company, Oxide.
    A company which lately does nothing but build benchmarks in which AMD does well, and then moves on to the next benchmark once Nvidia reaches parity.
    Remember Star Swarm/Mantle? And NV doing in DX11 what was supposedly impossible.

    TL;DR
    I would rather have great DX11 performance today than promises about DX12 performance on ancient hardware one to two years from now.

    haha LOL
     
  6. OnnA

    OnnA Ancient Guru

    Messages:
    17,849
    Likes Received:
    6,739
    GPU:
    TiTan RTX Ampere UV
    Thanks to my friend I have access to AoS -> so I have done 2 tests.
    All maxed in game / noAA -> 1920x1440 @ 85Hz (it's almost 2K ;-)
    Done only the DX12 part (for DX11 that game is too much lol).
    The bench runs so smooth, I just can't believe my eyes :banana:

    CPU 3.97GHz FSB 248 (~4GHz) NB 2.75GHz HT 2.55GHz RAM ~2000MHz CL9 1T
    XFX R280X Black -> 1050/1600 1.181v +8% POW

    So YES, I agree that DX12 IS THE FUTURE OF GAMING on PC.

    Normal Test:

    [screenshot]

    CPU Only Test:

    [screenshot]


    Also, if somebody wants a comparison in a large DX11 RTS game -> Total War: Rome II is a good example of how well AMD does in that game, e.g. R280X/290 = nV 970 :bang:

    https://steamcommunity.com/app/214950/discussions/0/624076851292790527/

    http://forums.totalwar.com/showthread.php/149191-poor-performance

    In TWR II I have a solid 30+ FPS, up to 62 FPS ;-) All Ultra/High, noAA.

    As always it depends on the devs how well the optimisation is done; if it's not gimped, it's OK :)

    But when a DX12 game is developed, it is what it is -> a low-level API close to the metal, so devs can't gimp the game :banana:
    Look at Batman, pCars etc. (IMO only).


    Now back to the topic :infinity:
     
    Last edited: Aug 23, 2015
  7. yasamoka

    yasamoka Ancient Guru

    Messages:
    4,875
    Likes Received:
    259
    GPU:
    Zotac RTX 3090
    You're definitely not very observant if that's all you see in these benchmarks.

    Ancient hardware? As if the 980 Ti and Fury X are ancient hardware. Both cards support DX12 and are going to remain high-end well into DX12's debut at the end of this year.

    So you have absolutely no counter-argument or reply to the OP, which explains the differences we see? Yeah, next.

    Repeating the same garbage over and over again.
     
    Last edited: Aug 23, 2015
  8. OnnA

    OnnA Ancient Guru

    Messages:
    17,849
    Likes Received:
    6,739
    GPU:
    TiTan RTX Ampere UV
    Hi, you know I remember the times before DX10/11, when DX9c was the one king -> people like you said the same thing.
    "We don't need DX10" - then after a couple of years DX11 came -> we had to buy new H/W etc.
    "Games will never use tessellation like in the Heaven benchmark" etc.
    And look now ;-) AC: Unity for example: everything is tessellated, every game has tessellation, and any AAA game looks way better than the early DX11 benchmarks :bang:

    So now we have DX12 in our homes -> YES? YES !!
    And we have games in production for DX12: YES? YES !!

    Here's the list, and it will grow longer... over time:

    DeusEx,
    NFS,
    SW BF3,
    Mirrors Edge2,
    Tomb Raider,
    HitMan,
    AoS,
    GRIP,
    Fable Legends
    Mass Effect: Andromeda

    and BF5, for 100% ;-)

    [ Witcher 3, ARK, and many more will have update to DX12 in the future]

    So I don't agree with your opinion; you have your point of view, but that point will change over time :)
     
    Last edited: Aug 23, 2015
  9. pharma

    pharma Ancient Guru

    Messages:
    2,485
    Likes Received:
    1,180
    GPU:
    Asus Strix GTX 1080
    Basically a DX12 game engine benchmark, not a DX12 benchmark. The poor DX11 vs DX12 performance points to an engine very poorly optimized for Nvidia hardware. We have already seen 20% increases when running under DX12 in the Fable Legends demo. If we do not see a similar DX11 vs DX12 delta in Fable Legends when it's released in Oct. 2015, then it's bad news for the Oxide/AMD partnership.
     
  10. Undying

    Undying Ancient Guru

    Messages:
    25,333
    Likes Received:
    12,743
    GPU:
    XFX RX6800XT 16GB
    Is the demo/test free to download? I'm willing to try it; interested in the results.
     

  11. k3vst3r

    k3vst3r Ancient Guru

    Messages:
    3,702
    Likes Received:
    177
    GPU:
    KP3090
    Just had a quick look; apparently you have to become a Founder... which costs $44.99.
     
  12. pharma

    pharma Ancient Guru

    Messages:
    2,485
    Likes Received:
    1,180
    GPU:
    Asus Strix GTX 1080
    No. $45....:)
     
  13. Redemption80

    Redemption80 Guest

    Messages:
    18,491
    Likes Received:
    267
    GPU:
    GALAX 970/ASUS 970
    Personally I think AMD owners are yet again setting themselves up for future disappointment.

    When actual games appear and things don't match this test, it will be thread after thread of complaints, and likely cries of foul play.
     
  14. Noisiv

    Noisiv Ancient Guru

    Messages:
    8,230
    Likes Received:
    1,494
    GPU:
    2070 Super
    The 390X edges out the GTX 980. The Fury X edges out the 980 Ti. That's not outside the realm of possibility even in DX11(!)
    I would gladly trade 5% of future DX12 perf. on my 290 for 50% of DX11 performance today(!)
    (as shown in that pcper image)
    The only reason I am not buying a new gfx card today is that the 290 still runs great @ 1080p, and Pascal/its AMD counterpart will make it obsolete in 6 months or so. Two node jumps = totally obsolete.

    We've seen this rhetoric when AMD won both consoles, and all that supposed AMD advantage turned out to be ONE BIG NOTHING.

    But this is even more crazy:
    people equating early benchmarks of an alpha(!) (not even a beta, lol) made by AMD-sponsored Oxide with overall future DX12 performance.
    I'm sorry, but that's just :bonk:
     
    Last edited: Aug 23, 2015
  15. Noisiv

    Noisiv Ancient Guru

    Messages:
    8,230
    Likes Received:
    1,494
    GPU:
    2070 Super
    No.

     

  16. Ungeheuer97

    Ungeheuer97 Active Member

    Messages:
    60
    Likes Received:
    0
    GPU:
    GTX 1070 G1 Gaming
    A noobish question probably, but what does this all mean for AMD's GCN 1.0 architecture?
    Also, according to Game Debate, the Sapphire R9 270 Dual-X 2GB is based on the Curacao Pro GCN 1.1 architecture, which I doubt is true. Could it be just an error on their part?
     
    Last edited: Aug 23, 2015
  17. ---TK---

    ---TK--- Guest

    Messages:
    22,104
    Likes Received:
    3
    GPU:
    2x 980Ti Gaming 1430/7296
  18. theoneofgod

    theoneofgod Ancient Guru

    Messages:
    4,677
    Likes Received:
    287
    GPU:
    RX 580 8GB
    It means good things for GCN 1.0.
    The 280X is GCN 1.0 too.
     
  19. fantaskarsef

    fantaskarsef Ancient Guru

    Messages:
    15,693
    Likes Received:
    9,572
    GPU:
    4090@H2O
    ...and...

     
  20. Anarion

    Anarion Ancient Guru

    Messages:
    13,599
    Likes Received:
    386
    GPU:
    GeForce RTX 3060 Ti
