High DX11 CPU overhead, very low performance.

Discussion in 'Videocards - AMD Radeon Drivers Section' started by PrMinisterGR, May 4, 2015.

  1. theoneofgod

    theoneofgod Ancient Guru

    Messages:
    4,677
    Likes Received:
    287
    GPU:
    RX 580 8GB
    It's self-explanatory in a way :)
     
  2. sammarbella

    sammarbella Guest

    Messages:
    3,929
    Likes Received:
    178
    GPU:
    290X Lightning CFX (H2O)
Today or six months ago? :D
     
  3. DerSchniffles

    DerSchniffles Ancient Guru

    Messages:
    1,665
    Likes Received:
    148
    GPU:
    MSI 3080Ti
    Running fantastic here. Turning down texture quality from hyper to ultra or whatever fixed all stuttering. Seems to be the case for a lot of people. Give it a go.
     
  4. JonasBeckman

    JonasBeckman Ancient Guru

    Messages:
    17,564
    Likes Received:
    2,961
    GPU:
    XFX 7900XTX M'310
    Last edited: Jan 3, 2017

  5. Seren

    Seren Guest

    Messages:
    297
    Likes Received:
    16
    GPU:
    Asus Strix RX570
Other than much better shadows and more objects, I can't really notice much difference, especially with ultra->hyper texture quality.
Though, I guess it helps the lighting, which I didn't expect.
     
  6. Garwinski

    Garwinski Member Guru

    Messages:
    185
    Likes Received:
    4
    GPU:
    XFX Fury X
Question: Is CPU overhead on newer AMD cards (for example 7970/280X vs RX 480) lower than on older ones? Maybe some hardware-specific things, or better-targeted overhead optimization in the drivers for newer cards? Has anyone ever measured this? I remember something about this very vaguely, but I can't find it anymore.

I am curious because I have an FX-9590, which in some games is a bottleneck. Would that improve (even slightly) if I went from my 7990 to a 480, for example?
     
  7. OnnA

    OnnA Ancient Guru

    Messages:
    17,845
    Likes Received:
    6,739
    GPU:
    TiTan RTX Ampere UV
The best thing for you would be to build a rig based on a RyZEN 6/12 on X370 :)
I will do that myself ;)

Truth is: the Phenom II is too old, and the newer FX chips aren't great in some games either.
The best remedy is to build an AMD ZEN rig.
     
  8. xacid0

    xacid0 Guest

    Messages:
    443
    Likes Received:
    3
    GPU:
    Zotac GTX980Ti AMP! Omega
You can see this in the 480's release-day reviews: the 480 beating/matching the Fury in CPU-intensive games. Check TPU.
     
  9. Garwinski

    Garwinski Member Guru

    Messages:
    185
    Likes Received:
    4
    GPU:
    XFX Fury X
Ah, great suggestion. I just looked at some release reviews. The one from TPU indeed shows a massive difference between the Fury and 480 cards in CPU-heavy games like Fallout 4. But in other games I don't see it that much, at least not enough to indicate that overhead is really that much better. Could this not be mostly because of the tessellation improvements in the Polaris architecture?

An ultimate answer would probably be someone running the API overhead test in 3DMark with their old card and then with an RX 480, but if no one has those results handy, it would be quite a hassle.

When looking at benchmarks, I'm always reminded of the advantages of CrossFire when it works well. If I went from a 7990 to a 480, I would lose FPS in games like The Witcher 3 and GTA V, where CrossFire works (very well). In games where it doesn't work, I would gain FPS (never mind the great psychological relief of not having to wait for CrossFire support, if we get it at all, of applying DIY solutions, of graphics mods not working with CrossFire, of CrossFire breaking in some games with new drivers, etc.). The eternal struggle of weighing one's options, I guess.

Yea, this would also be a solution for getting rid of money I don't have :infinity: Really, if I had the money I would go for ZEN and get the best single GPU available. But my FX-9590 does great in the vast majority of games, especially newer games which make more use of multiple cores. I can play practically every game at ultra 60 fps (if CrossFire works, that is: I am practically always GPU-bottlenecked when CrossFire does not work), except for games that rely heavily on one core, like Fallout 4 (in which I am actually still GPU-bottlenecked because CrossFire does not work for me), Attila: Total War, and the early-access game SQUAD (so improvements could still come in the future). If I buy a new GPU because I want to get rid of the very inconsistent support for CrossFire, and thus suffer a lot from a GPU bottleneck in more recent games, I can still take that GPU with me to an eventual new build in the future. Getting rid of some overhead while upgrading my GPU would be a nice extra :)
     
    Last edited: Feb 18, 2017
  10. Oxezz

    Oxezz Member

    Messages:
    28
    Likes Received:
    0
    GPU:
    R9 290 VaporX @1150
Going from a 7990 to an RX 480? Hell no (yeah, I get that CF is a pain in the ass sometimes, but it really doesn't justify the purchase of a 480).
The only thing you can do right now is wait for Vega (if you're thinking about a new card) or for more driver improvements.
Overhead still exists and always will; at least we can hope for those minor improvements. I 'member when I bought my R9 290 I was barely hitting 900k draw calls single-threaded in 3DMark, and now I'm at about 1.3+ million. AMD has come a long way... or 3DMark has, lel.
     
    Last edited: Feb 18, 2017

  11. Spartan

    Spartan Guest

    Messages:
    676
    Likes Received:
    2
    GPU:
    R9 290 PCS+
I was in a similar boat last year with my OC'd FX-8350 and R9 290, and I chose to upgrade my CPU. It's a crap feeling when you have to spend money upgrading a "useless" part of your PC (for gaming) instead of buying a shiny new GPU, but I did not regret it. The FX was nothing but struggle for me; with my current CPU, games just run. Except when AMD is gimping their OGL drivers... Anyway, if you want to keep that FX, I strongly recommend you buy an Nvidia GPU; AMD CPUs tend to perform better in CPU-bound games with them.

It's not just about draw calls. When I changed my CPU I got 900k on the OC'd FX and 1M on the stock i5 on the same day, with the same RAM sticks at 1600 MHz, but in GTA V the difference was an average of ~45 fps vs a locked 60 fps at the same settings. It's nice that the ReLive drivers are better in terms of draw calls; unfortunately they can't improve IPC.
     
  12. Garwinski

    Garwinski Member Guru

    Messages:
    185
    Likes Received:
    4
    GPU:
    XFX Fury X
Yea, if I change my GPU at all, I will wait for Vega anyway. And indeed, drivers have improved a lot. In Total War games, or a CPU-heavy game like Dying Light, there has been a very noticeable improvement in performance since AMD started putting effort into reducing CPU overhead (I believe this started with Windows 10), so one can hope for the future. The re-enabling of asynchronous shaders for GCN 1.0 in the new drivers also bodes well for the future of my card, as do excellent PC versions of games like Rise of the Tomb Raider and Hitman, with excellent support for two GPUs on both DX11 and DX12 :)

The 3 GB of VRAM is the most limiting factor for my card at the moment, I think, but with textures set to high instead of ultra I can live with that. With wider adoption of DX12 and Vulkan lessening my CPU bottleneck in a lot of cases, and developer experience with DX12 growing over time, resulting in developers dipping their toes into 'extras' like multi-GPU support, I guess I can survive a while.
     
  13. user1

    user1 Ancient Guru

    Messages:
    2,746
    Likes Received:
    1,279
    GPU:
    Mi25/IGP
I have a question: has anyone tested the overhead difference between Windows 10 and Windows 7 on GCN cards? I found an odd difference when comparing Win 7 to Windows 10 on both the earlier 15.201 drivers and the post-Crimson drivers.

I found draw call performance, particularly with the 15.201 drivers, to be significantly worse on Windows 10: Windows 7 was clocking in at around ~566k and Windows 10 at around ~430k. The 15.301/16.101 drivers improved the situation, with Windows 7 at around ~604k and Windows 10 at around ~540k, but a gap still remains.

Would be curious to see if the trend continues on other GCN cards.
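For anyone who wants to reproduce this sort of number outside of 3DMark: the single-threaded case basically measures how many tiny draws the driver can accept per second. Below is a rough sketch of that kind of loop in plain D3D11 (illustrative only, not the actual 3DMark code; the constants and minimal setup are my own, error handling is stripped, and the printed figure is only roughly comparable to the API Overhead test):

```cpp
// draw_call_bench.cpp -- minimal single-threaded D3D11 draw-call submission loop.
// Build (MSVC): cl /EHsc draw_call_bench.cpp d3d11.lib d3dcompiler.lib
#include <d3d11.h>
#include <d3dcompiler.h>
#include <chrono>
#include <cstdio>
#include <cstring>

// One tiny triangle generated from SV_VertexID, so no vertex buffer is needed.
static const char* kShaders = R"(
float4 VSMain(uint id : SV_VertexID) : SV_Position {
    float2 p = float2((id << 1) & 2, id & 2);
    return float4(p * 2.0f - 1.0f, 0.0f, 1.0f);
}
float4 PSMain() : SV_Target { return float4(0.1f, 0.2f, 0.3f, 1.0f); }
)";

int main() {
    ID3D11Device* dev = nullptr; ID3D11DeviceContext* ctx = nullptr;
    D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION, &dev, nullptr, &ctx);

    // Offscreen render target, so no window or swap chain is required.
    D3D11_TEXTURE2D_DESC td = {};
    td.Width = 256; td.Height = 256; td.MipLevels = 1; td.ArraySize = 1;
    td.Format = DXGI_FORMAT_R8G8B8A8_UNORM; td.SampleDesc.Count = 1;
    td.Usage = D3D11_USAGE_DEFAULT; td.BindFlags = D3D11_BIND_RENDER_TARGET;
    ID3D11Texture2D* tex = nullptr; ID3D11RenderTargetView* rtv = nullptr;
    dev->CreateTexture2D(&td, nullptr, &tex);
    dev->CreateRenderTargetView(tex, nullptr, &rtv);

    // Compile the tiny shaders at runtime.
    ID3DBlob *vsb = nullptr, *psb = nullptr;
    D3DCompile(kShaders, strlen(kShaders), nullptr, nullptr, nullptr,
               "VSMain", "vs_4_0", 0, 0, &vsb, nullptr);
    D3DCompile(kShaders, strlen(kShaders), nullptr, nullptr, nullptr,
               "PSMain", "ps_4_0", 0, 0, &psb, nullptr);
    ID3D11VertexShader* vs = nullptr; ID3D11PixelShader* ps = nullptr;
    dev->CreateVertexShader(vsb->GetBufferPointer(), vsb->GetBufferSize(), nullptr, &vs);
    dev->CreatePixelShader(psb->GetBufferPointer(), psb->GetBufferSize(), nullptr, &ps);

    D3D11_VIEWPORT vp = { 0.0f, 0.0f, 256.0f, 256.0f, 0.0f, 1.0f };
    ctx->OMSetRenderTargets(1, &rtv, nullptr);
    ctx->RSSetViewports(1, &vp);
    ctx->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    ctx->VSSetShader(vs, nullptr, 0);
    ctx->PSSetShader(ps, nullptr, 0);

    // The measured quantity: how fast the CPU/driver can push trivial draws.
    const int kDrawsPerFrame = 100000, kFrames = 30;
    auto t0 = std::chrono::high_resolution_clock::now();
    for (int f = 0; f < kFrames; ++f) {
        for (int i = 0; i < kDrawsPerFrame; ++i)
            ctx->Draw(3, 0);
        ctx->Flush();   // hand the accumulated commands to the driver
    }
    auto t1 = std::chrono::high_resolution_clock::now();
    double sec = std::chrono::duration<double>(t1 - t0).count();
    std::printf("~%.0f draw calls per second (single-threaded)\n",
                kDrawsPerFrame * (double)kFrames / sec);
    return 0;
}
```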
     
  14. vase

    vase Guest

    Messages:
    1,652
    Likes Received:
    2
    GPU:
    -
I can confirm lower performance in Windows 10 with my GCN card across many titles.
And the funny thing is/was: the Win10 was basically a fresh install, just to test out the build after the Anniversary bullsh...
And the Win7 install is... lemme check the install date... 26.03.2015.
Not saying I've been bad about keeping it clean. But still: W10 fresh install -> performance is lower. Not subjectively, but when measuring.

    But I said that all along...
     
  15. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,125
    Likes Received:
    969
    GPU:
    Inno3D RTX 3090
I actually had the opposite experience with Windows 10. Vase, can you give some more specifics so I can see what I can dig up?
     

  16. vase

    vase Guest

    Messages:
    1,652
    Likes Received:
    2
    GPU:
    -
Yes. I don't have it installed at the moment because I was testing something with opnsense on my other partition, but as soon as I get that done, I can put the most recent build back on and do some comparisons.

As far as I remember, one of the things that got a better result in W10, by like 50 points (reproducible), was DX11 3DMark FS. But for games, I can't recall which ones ran worse and which ran about equal.

One main issue I had with Anniversary and many other builds was "driver says goodbye"... I had these en masse. On 7 I haven't had a driver stop/restart since some 15.xx driver.

But yeah, maybe I can refresh my memory on which games those were.


    Edit:

So I just "quickly" installed W10 and tried out some GW2 and some BF1.
BF1 ran OK as far as I can tell, but I will have to do a more thorough test in terms of performance comparison.
But with GW2 I ran into an issue that basically sums up my experience with W10 (apart from lower performance in some cases), also seen in other games before... -> constant display driver crashes



[screenshot: Windows event log showing the repeated display driver crashes]


And this time it actually seems to have gotten worse, because as you can see from the log, it doesn't even recover (TDR) anymore; the system just resets, making it prone to data/disk errors...
I don't like that one bit. And I'll stay on 7 until WDDM 2.x is ripe for all application scenarios.
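For reference, the recover-vs-reset behaviour here is governed by the documented TDR (Timeout Detection and Recovery) values under HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers. A small sketch of my own that just reads them (the values are usually not present, in which case the OS defaults of TdrLevel 3 = recover and TdrDelay 2 seconds apply):

```cpp
// tdr_check.cpp -- read the Windows TDR (Timeout Detection and Recovery) settings.
// Build (MSVC): cl /EHsc tdr_check.cpp advapi32.lib
#include <windows.h>
#include <cstdio>

static void readDword(const wchar_t* name) {
    DWORD value = 0, size = sizeof(value);
    LSTATUS rc = RegGetValueW(HKEY_LOCAL_MACHINE,
                              L"SYSTEM\\CurrentControlSet\\Control\\GraphicsDrivers",
                              name, RRF_RT_REG_DWORD, nullptr, &value, &size);
    if (rc == ERROR_SUCCESS)
        wprintf(L"%ls = %lu\n", name, value);
    else
        wprintf(L"%ls not set (OS default applies)\n", name);
}

int main() {
    readDword(L"TdrLevel");   // 0 = recovery disabled, 3 (default) = recover the driver
    readDword(L"TdrDelay");   // seconds the GPU may hang before a TDR is triggered
    return 0;
}
```

Raising TdrDelay is sometimes used as a workaround for these timeouts, but it only masks the underlying driver problem.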
     
    Last edited: Feb 22, 2017
  17. MatrixNetrunner

    MatrixNetrunner Guest

    Messages:
    125
    Likes Received:
    0
    GPU:
    Powercolor PCS+ R9 270X
There is an interesting thread about draw calls over at the AnandTech forums. It was started to compare draw call performance ahead of the Ryzen launch.

    [Part 2] Measuring CPU Draw Call Performance

The Stilt got 12.8 fps on a Ryzen R7 1800X/Fury Nano/Win10, while Monk5217 got around 18 fps on an i7 4771/R9 Nano/Win10. The Haswell i7 is only slightly faster on paper (less than 10%), so this result indicates there is a problem somewhere for Ryzen.

    Then The Stilt got around 14 fps for the same test on the same config, just running Windows 7: Ryzen: Strictly technical

It would be very interesting to compare API Overhead Test results for Ryzen CPUs vs Haswell/Broadwell CPUs. There seems to be a significant difference in draw call performance between comparable AMD and Intel processors (in this very unscientific test).
     
  18. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,125
    Likes Received:
    969
    GPU:
    Inno3D RTX 3090
It honestly sounds like OS driver and scheduler issues. Most people have no experience with a more free-flowing Linux distro like Gentoo or Arch, and they don't realize how much the CPU driver matters for performance.
     
  19. janos666

    janos666 Ancient Guru

    Messages:
    1,648
    Likes Received:
    405
    GPU:
    MSI RTX3080 10Gb
    To be even more honest, the fine details might be even more obscure than that.

I have a few years of experience with Gentoo (I use it for my headless home server to do two-way WAN traffic shaping, provide SMB storage for my LAN on top of multi-device ZFS, and more recently multi-device Btrfs, etc.).
I have a manually tailored kernel config, but I am not entirely sure what CPU scheduler I am using for the little Pentium G4400 CPU.
For one, I deliberately enabled the "Intel hardware P-states" driver which (as far as I understand it) also replaces the frequency governor (besides the generic ACPI P-states driver).
But the scheduler? No idea, really... That's a "black box" for me. I can't describe it better than "the nameless default mainline CPU scheduler which I have zero knowledge about. It just works."

    I "overheard" a quick conversation once. Some random dude (who worked on some university project) asked a talented man to explain the CPU scheduler of the modern Linux kernels.
    The answer was short and very elusive: NUMA
    The guy was just as confused as me. He asked for some clarification. I didn't want to intervene (I just happened to stand there, I wasn't supposed to participate), so didn't press the issue but this was the answer (roughly quoting from memory): cache/memory locality is so important for multi-socket systems and even some multi-core chips, that it must play an important role in the CPU scheduler's decisions.
    The conversation ended there. If it was for me, I would have pressed the issue further but the original questioner gave up, and thus did I.

See, the problem is I have always been compiling my kernel without NUMA support, because I never had multi-socket platforms at home, nor have I ever run Linux on single-socket CPUs with multiple memory controllers (like those old quad-core CPUs where Intel just shoved two separate dual-core dies behind a single lid and called it a quad-core). So how could I be using the NUMA scheduler? And regardless, there must still be something else in line after NUMA has played its part (if any).

Well, to be honest, I think these 8/16 Zen chips might not be so different from those old "dual-dual-core" quads (Q9600, was it...?). Even if the chip itself is monolithic (a single piece of die), it seems to consist of two separate 4-core "modules", although the memory controller seems to be shared (if I am not mistaken). So I guess it could use something like NUMA-based scheduling (but it might require some specific tuning).
The optimal solution might be a little more complicated than existing NUMA (I guess it should be, roughly speaking, treated like 2x4 independent cores with 2x4 extra hyper-threads on top, scheduled depending on how many cores/threads you have already saturated with how many independent processes, and it won't always be "perfect" because there won't be an ideal solution for every random scenario).

To be honest, even though AMD managed to increase instructions per clock per core tremendously, and it's nice that they managed to shove these 4-core modules together so elegantly (given the circumstances), I think it's still "yet another modular madness", almost like the Bulldozers (not really, just almost, but still...). And it's not just about the half-speed AVX but all these special scheduling problems...
I think, even after OS schedulers improve things, 8-core Zens will have somewhat "random" performance for mixed random workloads. Whenever you try to saturate 4<x<8 cores, the scheduling might end up sub-optimal (because the scheduler code has to be kept as simple as possible, whereas this design could need complex scheduling logic - but that could impose notable overhead even when it's not needed, and all you can do in real time is predict, not precisely foresee, the future...). So, all in all, it's a nice achievement, but some trade-off definitely seems to be there. Zen is an average woman dressed up as a queen (but then Bulldozer was a wh¤re dressed up as a housewife, so...). Thus, all in all, it didn't manage to live up to its hype in my eyes (I currently have to treat it as if it will always show its worst-case performance).
     
    Last edited: Mar 12, 2017
  20. trocio2

    trocio2 Guest

    Messages:
    484
    Likes Received:
    0
    GPU:
    GT 630 1GB DDR3 GK208 Kep
What's the conclusion (2017)? AMD drivers used more CPU than Nvidia's in DX11; has it been fixed?
     
