Radeon R9 Fury and lost sanity (help please)

Discussion in 'Videocards - AMD Radeon' started by Baron_of_Hell, Feb 8, 2019.

  1. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Hello everyone

    Ever since I assembled my PC 5 years ago, I have run into quite a few problems with my R9 Fury: insane texture streaming issues and z-buffer issues in almost every 3D environment, coupled with constant CTDs and random shutdowns.

    The solution that originally worked for me was to use the Catalyst 15.7.1 drivers with partially deleted drivers from Crimson 17.11.2, on top of forcing every EnableUlps registry value to 0 and simultaneously running ClockBlocker and MSI Afterburner, undervolting and underclocking, just to get it to work (what was funny, though, is that benchmarks ran perfectly fine and showed 0 errors). After roughly 3 months of fighting with my PC, I was finally able to use Adobe Premiere, D3D11 games and 3D applications without crashing to desktop, for anywhere from 15 minutes up to 8 hours or more when needed. Recently, though, I had to replace the fans on the GPU.

    Recently I got a new issue, which is basically what I had when I first assembled my PC. After a while (8 - 15 minutes), 90% of 3D applications crash, and if I check the log it just says: Display driver amdkmdap stopped responding and has successfully recovered.

    Sometimes the whole system also freezes if I'm unlucky (but the sound is fine, and I can hear that stuff is still running or even use Skype/Discord, so I thought it's a driver issue).

    MSI Afterburner shows a normal temperature, never going above 83 degrees.
    One thing I noticed is that crashes frequently occur when GPU load suddenly spikes to 100%, then back to 0%, then to 100% again, and then the display driver crashes and restarts. Voltage spikes down and then back up too (though it begins by spiking down for about 2.5 seconds).

    It's 100% not a temperature issue, because benchmarks still run perfectly fine, and besides, the temperature never goes above 85.

    Next in line was voltage: I tried to undervolt - no result; tried to crank the voltage up - no result.

    Reduced core clock from 1000 to 800 MHz - no result.

    It looked to me like a ULPS issue or one of those AMD power-saving tricks, but I thought I had disabled it in the registry, and ClockBlocker is running too, which is supposed to prevent the downclocking that AMD loves so much (for reasons that elude me, even though every friend I know has issues with it). After tinkering with settings for something like 100 hours I get stuff to work, and I have no crashes in 3D apps until I shut down my PC; then it's crashtown all over again.
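
    One way to double-check the "EnableUlps forced to 0" step is to export the relevant registry branch to a .reg file and scan it for any EnableUlps value that is still non-zero. The following is only a sketch under assumptions: it expects the exported file's text to be already decoded into a string (regedit exports are normally UTF-16), and the key paths in SAMPLE are illustrative examples, not taken from this machine.

```python
import re
from typing import List, Tuple

# A key header in a .reg export looks like: [HKEY_LOCAL_MACHINE\...]
KEY_RE = re.compile(r"^\[(.+)\]\s*$")
# A DWORD value line looks like: "EnableUlps"=dword:00000001
ULPS_RE = re.compile(r'^"EnableUlps"=dword:([0-9a-fA-F]{8})\s*$', re.IGNORECASE)


def find_enabled_ulps(reg_text: str) -> List[Tuple[str, int]]:
    """Return (key path, value) for every EnableUlps DWORD that is not 0."""
    hits = []
    current_key = ""
    for raw_line in reg_text.splitlines():
        line = raw_line.strip()
        key_match = KEY_RE.match(line)
        if key_match:
            current_key = key_match.group(1)
            continue
        value_match = ULPS_RE.match(line)
        if value_match:
            value = int(value_match.group(1), 16)
            if value != 0:
                hits.append((current_key, value))
    return hits


# Illustrative export text only; real key paths depend on the system.
SAMPLE = r'''Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4d36e968-e325-11ce-bfc1-08002be10318}\0000]
"EnableUlps"=dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Class\{4d36e968-e325-11ce-bfc1-08002be10318}\0000]
"EnableUlps"=dword:00000001
'''
```

    Calling find_enabled_ulps(SAMPLE) reports only the ControlSet001 entry, which is the kind of leftover copy that is easy to miss when editing values by hand.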

    I thought it might be my PSU, but then why on earth does it work after I dance around it for 20 hours?

    Thought it might be faulty RAM - still nothing.

    Asus Z170 Pro Gaming
    R9 Fury 4 GB HBM with Strix cooling
    Core i7 6700 OEM 3.4 GHz
    DIMM DDR4 16 GB (4 x 4 GB) Kingston HyperX Fury Black 2133 MHz
    Windows 7 Ultimate
    PSU: Thermaltake TP-850ah5ceg-a 850 W
    WD2003FZEX Western Digital Black 2 TB 3.5" 7200 RPM

    Hopefully someone could help me solve this mystery, or at least tell me where to look for a solution.
     
    Last edited: Feb 8, 2019
  2. MerolaC

    MerolaC Ancient Guru

    Messages:
    4,370
    Likes Received:
    1,082
    GPU:
    AsRock RX 6700XT
    I don't have anything to help you solve the issue.
    My question is: did you never try to RMA the card? That's the first thing I would have done.
     
    BlackZero likes this.
  3. BlackZero

    BlackZero Guest

    To eliminate software, the first thing I would do is change to Windows 10 x64.
     
    MerolaC likes this.
  4. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Well, it has been over 5 years since purchase, and proving that it was faulty from the get-go is a legal hell that would cost me 5 nuclear submarines.

    Is there any kind of diagnostics tool that would carry its stats over through the crash? So that I could at least spot whether it's the PSU or GPU or drivers or whatever the hell this thing is.
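
    A minimal sketch of the idea behind "stats that survive the crash": append every sample to a CSV file and force it to disk before taking the next one, so the last rows before a freeze are still there after a reboot. This is an illustration under assumptions, not a real monitoring tool: read_sensors is a hypothetical callback, and fake_sensors returns placeholder numbers where a real reader would query actual hardware sensors.

```python
import csv
import os
import time
from typing import Callable, Dict, Iterable


def log_samples(path: str,
                read_sensors: Callable[[], Dict[str, float]],
                fields: Iterable[str],
                samples: int,
                interval_s: float = 1.0) -> None:
    """Append sensor readings to a CSV file, forcing each row to disk."""
    fields = list(fields)
    # Write the header only when the file is new or empty.
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as fh:
        writer = csv.writer(fh)
        if new_file:
            writer.writerow(["unix_time"] + fields)
        for _ in range(samples):
            reading = read_sensors()
            row = [f"{time.time():.3f}"] + [reading.get(name, "") for name in fields]
            writer.writerow(row)
            fh.flush()             # push Python's buffer to the OS
            os.fsync(fh.fileno())  # ask the OS to write the data to the disk
            time.sleep(interval_s)


def fake_sensors() -> Dict[str, float]:
    """Hypothetical stand-in returning placeholder values, not real readings."""
    return {"gpu_load_pct": 99.0, "gpu_temp_c": 82.0, "gpu_voltage_v": 1.2}
```

    For example, log_samples("gpu_log.csv", fake_sensors, ["gpu_load_pct", "gpu_temp_c"], samples=600) would log for about ten minutes. The flush plus fsync per row costs some disk activity, but that is the trade-off that makes the final samples before a hard freeze likely to be on disk.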
     

  5. G13Homi

    G13Homi Member Guru

    Messages:
    112
    Likes Received:
    37
    GPU:
    RTX 4090
    @OnnA @Fox2232 I remember you guys getting my Fury funk cured. Any thoughts here?
     
  6. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Well, thing is, back when I was first assembling my PC, I tried both Windows 10 and Windows 7. Windows 10 just had awful performance overall, and the more I hear from my friends, no one is happy about choosing Windows 10.

    And back then Windows 10 had even more problems with my rig: more lag spikes and even more graphical issues, along with random weird software issues like just deleting parts of drivers (and I'm talking about the official licensed copy; surprisingly, a pirated copy had fewer issues, but it was still a far cry from Windows 7, which just crashed without deleting stuff on my PC).

    That was actually the reason I looked at Windows 7 in the first place. I might not have if 10 had worked fine.
     
  7. user1

    user1 Ancient Guru

    Messages:
    2,746
    Likes Received:
    1,279
    GPU:
    Mi25/IGP
    You need to do a fresh install of Windows to make sure it isn't a software problem, as mentioned. It's a pain but necessary; otherwise you will only continue to fight an unstable system, wasting your time, without having a good idea as to why.
     
    Undying likes this.
  8. Agonist

    Agonist Ancient Guru

    Messages:
    4,284
    Likes Received:
    1,312
    GPU:
    XFX 7900xtx Black
    How have you had the Fury for 5 years? They did not launch until June 2015. Not even 4 years old yet....

    Win 10 runs fantastic for me on all of my computers, even my trusty old Q6600 rig that is 11 years old.

    I've had a Fury, and even in a case with crappy airflow it never went over 63°C with a custom fan curve, without undervolting. If it's not PSU related, Windows related or driver related, you have a dud card.
    Simple as that. If the GPU does it in another rig, it's the card, sadly.
     
  9. OnnA

    OnnA Ancient Guru

    Messages:
    17,846
    Likes Received:
    6,739
    GPU:
    TiTan RTX Ampere UV
    IMO you should try Windows 10 x64 Pro.
    I didn't have any problems with my Fiji back in the day ;)
     
  10. Lavcat

    Lavcat Master Guru

    Messages:
    552
    Likes Received:
    44
    GPU:
    Radeon 7900 XTX
    Not a Fury, I'm running a Nano. Never had any problems with it.
     

  11. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    I would say it is in power delivery. Few points:
    - 83/85°C is not fine for a Fury (X), because you have HBM right next to the GPU using the same heatsink. (In the worst-case scenario the HBM does not even touch the heatsink.)
    - A crash on a "load spike"/"clock spike" hints that power delivery does not react fast enough and is not able to provide sufficient voltage during that spike. (Hence the ClockBlocker/static clock "fix".)
    - Been OKish, stopped being OKish (Physical manipulation of some component. Maybe heatsink moved?)
    - From tests PSU looks solid

    1st, I would check how well the cables between the PSU and the graphics card are plugged in. If OK, then I would inspect the actual pins on those cables. If OK, then I would try different sockets on the PSU, as it has 4. (Btw, are you using 2 separate cables or one split at the end?)

    2nd, did you ever take the heatsink off or wiggle it a lot? (A fan replacement was mentioned, and thermals are way too high.)
    - Even experienced people managed to do some (irreparable) damage to HBM traces on interposer.
    - Did you apply TIM on GPU and HBMs?

    3rd, are you using original vBIOS or some modded one for any reason?
    - not saying it is bad, but we should be aware since unlocking bad shaders may or may not cause this.

    4th, do you get errors when you run memtestCL on the GPU? (There is a broken version floating around which shows millions of errors. So if a way too crazy count is found immediately, look for the fixed version.)

    Here is a nightmarish image of what may be in between the heatsink and the GPU/HBM on your card.
     
    Last edited: Feb 9, 2019
    G13Homi likes this.
  12. z8373767

    z8373767 Master Guru

    Messages:
    465
    Likes Received:
    220
    GPU:
    6900XT/8650G+7970M
    For me, since some 18.x.x drivers, no monitoring program can show GPU usage properly, not even Radeon Overlay. Yeah, it goes from 0 to 100% and back to 0%, but performance is fine.
    You need to lower the temperature. It's really high. My card starts throttling when it reaches 80°C, which happens very easily at default settings, because Sapphire Intelligent Fan Control is stupid as... and runs only the central fan.
     
    G13Homi likes this.
  13. The_Amazing_X

    The_Amazing_X Master Guru

    Messages:
    708
    Likes Received:
    233
    GPU:
    Red Devil V64
    Use HWiNFO to get proper GPU stats.
     
  14. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Funny thing is I actually did change the TIM on the GPU, and I did remove the cooler to do that. The special GPU paste was kinda solidish and whacky-looking, but not as bad as shown in the picture. But why wouldn't benchmarks show the issue then? I might have wiggled it a bit too. Trying other sockets is a sound idea, didn't think of that one. I'm using 2 separate cables that each split into 6 and 2 at the end.

    I'm not using a modded vBIOS, not that I know of at least, though I did think about starting to.

    The wires seem to be solid enough (not that I have a voltmeter lying around to check); they hold pretty firmly.

    Did memtestCL and Video Memory Stress Test - 0 errors on both. Ran the FurMark GPU stress test for over an hour: top temperature 82, average 79, core clock around 500 - 650, memory clock 500, stable 82 fps the whole time with no issues whatsoever.
     
    Last edited: Feb 9, 2019
  15. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM

  16. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Yeah, well, thing is it's sudden. Usually the load just goes 50-70-90, then it stays at around 99 for about 5 seconds, drops to 98 for 1 second and then goes back to 99. It begins to spike only when it's about to crash, and that happens at random intervals: for a split second things look normal, then out of the blue it spikes to 0, then back to 100, then back to 0 and bam - crash. I can play or do my job for 20 minutes and it's fine, then it crashes, or it crashes at the start, or after 5-6 minutes.
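
    To find these moments in a recorded log instead of watching the graph, the pattern described above (load near 100, then suddenly 0) can be searched for mechanically. A small sketch, assuming the load readings have already been extracted into a plain list of percentages; the thresholds and the sample series are made-up illustration values, not readings from this card.

```python
from typing import List, Sequence


def find_load_collapses(load_pct: Sequence[float],
                        high: float = 90.0,
                        low: float = 5.0) -> List[int]:
    """Return indices where load falls from >= high to <= low in one step."""
    return [i for i in range(1, len(load_pct))
            if load_pct[i - 1] >= high and load_pct[i] <= low]


# Invented series mimicking the description: ramp up, steady ~99, then 0/100/0.
EXAMPLE_LOAD = [50, 70, 90, 99, 99, 98, 99, 0, 100, 0]
COLLAPSES = find_load_collapses(EXAMPLE_LOAD)  # [7, 9]
```

    The indices returned can then be used to look at what temperature, clocks and voltage were doing in the few samples just before each collapse.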
     
  17. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Actually it's even less, it's been out for about 3.5 years. D'oh, well, seems like time flows really slowly for me, oops. I wanted to use Win 10 from the beginning, and like I said I had a lot of performance issues with it. Maybe things have changed since then, but back when I tried it I couldn't get it to run at all, neither the licensed copy nor the pirated one. About the card, well... most of my friends have either notebooks or iMacs, sooooo... it's kinda hard to just bust into their place and ask them for a spare PSU or ask them to shuffle GPUs to check them.

    Besides, right now it's going to be pretty difficult to just reinstall the system.
     
  18. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Tried to record what's happening during a crash with MSI Afterburner, and was successful to an extent:

    https://imgur.com/lYYaTSP

    Everything looks kinda fine: voltage seems fine (it's the graph labeled "CPU power consumption", Энергопотребление ЦП in the screenshot); when it crashes it goes down to just 32. But as you can see, GPU load stays at a steady 100%, then for some weird reason it just drops down to 0 and crashes the entire thing. There was a small temperature spike, but why on earth does it drop its GPU load to 0? Especially when the voltage during the event was at 39??? Maybe MSI Afterburner is feeding me wrong info, I'm not sure myself, but it doesn't look like a voltage issue.

    Maybe it's the drivers' fault? I have no idea at this point.
     
  19. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    unstable GPU => driver reset => 0% utilization => lower temperature

    That "Voltage" of 32 is a magical number. What is its unit? 1/30th of a Volt?

    Even if I ignore the whole bad-units problem with the Voltage... it is clear that as temperature goes up, Voltage goes down. The graphs are crystal clear about it.
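
    The "temperature up, voltage down" reading of the graphs can be put into a number with a plain Pearson correlation over the logged samples: a value close to -1 means the two move in opposite directions. A sketch only, assuming both series were logged at the same moments; the example numbers are invented for illustration and are not taken from the screenshot.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Invented sample values: temperature in °C rising while voltage in V falls.
EXAMPLE_TEMP = [70, 74, 78, 80, 82]
EXAMPLE_VOLT = [1.20, 1.19, 1.17, 1.16, 1.15]
EXAMPLE_R = pearson(EXAMPLE_TEMP, EXAMPLE_VOLT)  # strongly negative
```

    A strongly negative result would support the power delivery or thermal explanation, though correlation alone does not say which of the two is the cause.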
     
  20. Baron_of_Hell

    Baron_of_Hell Guest

    Messages:
    9
    Likes Received:
    0
    GPU:
    R9 Fury 4 Gb HBM
    Power consumption is shown in Watts, if I'm reading this right. Is there any way to check what is going on with power? Or is it an overheating issue? I found a way to record GPU voltage, maybe that'll do the trick.

    But then again, OK, if it's an overheating issue, why didn't it crash in a benchmark?
     
