Help with Rx Vega 64 LC crash/problem.

Discussion in 'Videocards - AMD Radeon' started by MaestroBayten, Oct 18, 2018.

  1. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    Ah... I'm sorry, I misunderstood ^^

    Yeah, it's kinda insane how much FPS I gain now. :D
     
    OnnA and Embra like this.
  2. Embra

    Embra Ancient Guru

    Messages:
    1,601
    Likes Received:
    956
    GPU:
    Red Devil 6950 XT
    Awesome.
    I run my 1700x @4.0.
    Had the Vega for a month now on a 32 inch 1440p monitor.
    It's been a perfect fit for me. :)
     
    MaestroBayten likes this.
  3. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    For now my 1600X is stable at 3.9 GHz; I will try to go further :p. Maybe I'll try 4.0-4.2 GHz.
    For me it's a 27" 144 Hz monitor, since 60 Hz is too low when I play shooters.
     
  4. seansplayin

    seansplayin Guest

    Messages:
    2
    Likes Received:
    0
    GPU:
    Sapphire Vega 64 N+
    Sweet, I found some Vega experts. I recently picked up a Sapphire Vega 64 Nitro+ 100410NT+SR, which uses a different PCB than most Vega cards. To water cool it I ordered a Bykski water block from Alibaba that was made specifically for this card. While I think I have pretty good performance, I always want more.

    My question: is there anything to gain by going with the liquid BIOS, and would it even be safe to try, considering this card uses a different PCB with different chokes compared to what I've seen other Vegas use?
    I have loaded a PowerPlay table in Windows (still haven't figured out Linux yet) that raises the power limit to 142% and the GPU core current to 400 A. After a few days of overclocking, my top performance profile pulls 390 W with a 32 °C core temp during stress tests and benchmarks. https://www.3dmark.com/spy/4858202
    https://beta-static.photobucket.com/images/ad280/seansplayin/0/d795e6de-280d-4602-b2d6-a7a8f137c0d8-original.jpg?width=1920&height=1080&fit=bounds
     

  5. JonasBeckman

    JonasBeckman Ancient Guru

    Messages:
    17,564
    Likes Received:
    2,962
    GPU:
    XFX 7900XTX M'310
    The liquid BIOS has a higher boost clock by default, I believe, and should use a bit more voltage with a reduced thermal threshold. For the most part, though, that's something you can set up in Wattman or other overclocking utilities, without the risk of using a BIOS that might not be optimal for the GPU, particularly on the non-standard PCB Vega models.

    Sounds like you're getting a lot out of the GPU already by raising the power limit and keeping the core cool. That mostly removes the thermal and power draw throttle conditions, allowing a higher sustained boost clock with only the total workload being a factor in how it scales. :)
    (Anything below 100% load and it starts dropping from P7 towards P6, though it usually stays closer to P7, or above it if it can scale higher, even when load isn't at 100% all the time.)

    Unless I'm overlooking something, I don't think you'd get that much more from the liquid BIOS. The profile might be a bit more aggressive, and the water cooling will help keep the core and memory temperatures below the threshold where the GPU shuts down, but with the limits mostly dealt with by the PowerPlay table and the core kept nice and cool, it should already be scaling up nicely without it.

    Wattman or any other overclocking tool compatible with AMD's newer drivers and Vega GPUs should work well for fine-tuning most of the settings too. Having it in the BIOS is definitely convenient, but I don't think the stock Vega 64 BIOS is entirely compatible with the custom models, though from reading other users' experiences this seems to vary a bit.


    It's unfortunate that editing the BIOS isn't possible, as that would be one way to carry over the changes properly, but the check on the GPU prevents anything other than a stock BIOS from working. The tools for viewing the BIOS values might still be useful for comparing your GPU's BIOS against that of the liquid cooled stock Vega 64. :)

    There should be a few threads here about the Vega BIOS and utilities such as the BIOS viewer/editor, even if the GPUs will not load modified BIOSes due to this check.
    https://www.overclock.net/forum/67-amd/


    EDIT: It's probably nothing new, but the Vega GPU looks to be operating on a number of different parameters defining how much it can scale, with Vega 64 having higher settings than Vega 56 and Vega 64 Liquid a bit above that.
    (So P6 for 3D clock speeds and P7 for something like the boost range, but that can also be exceeded if nothing is limiting the GPU.)

    Keeping it on automatic settings and only increasing the power limit, while keeping the GPU nice and cool and having enough power to supply the card, should work just as well. Undervolting and managing clock speeds manually is one method, but allowing the GPU to draw as much power as it can and scale from that is another, and if temps are under control and the drivers aren't hitting any spikes or other oddities that push values into something unstable, that gives nice results too. (It should also regulate voltage as needed up to 1250 mV, which I think is where it caps.)
    (And with the core and memory well below 65 degrees Celsius, where the HBM2 modules first start loosening their timings, performance should be just fine in that regard too, with no sudden drops.)
     
    Last edited: Nov 2, 2018
    Embra likes this.
  6. seansplayin

    seansplayin Guest

    Messages:
    2
    Likes Received:
    0
    GPU:
    Sapphire Vega 64 N+
    Thanks for the reply. I had no idea Vega could/would loosen the HBM timings on the fly, that's crazy. I was pretty upset to learn the BIOS was locked and I've thrown some hate AMD's way for doing it; usually I dial in the OC in Windows, flash those settings to the BIOS, and then I get my performance in Linux. Removing the power limits basically helps stabilize the clocks and keeps them in the 1700-1725 MHz range with occasional boosts up to 1755 MHz. I was a little disappointed that putting it on water only yielded +25 MHz on the HBM clock: on air with an HBM temp of 55 °C I could run at 1090 MHz with no artifacts, and now I can run at 1115 MHz with the HBM temperature staying below 36 °C. Also, it only pulls 390 W when running FurMark; in typical gaming it uses around 350 W, and the core is at a -12 mV UV. I wish I knew whether it was safe to flash the liquid BIOS; gaming on Linux is kind of my thing now thanks to Valve. Thank you for the reply, I'll give the forum you linked a good read.
     
  7. JonasBeckman

    JonasBeckman Ancient Guru

    Messages:
    17,564
    Likes Received:
    2,962
    GPU:
    XFX 7900XTX M'310
    The open source drivers and AMD's many commits to the core of the GPU driver support (the "DRM", I think) have also really improved things on Linux. OpenGL is way ahead of Windows and competes with NVIDIA in terms of performance, and Vulkan should be ahead too now, even supporting 1.1.85 and newer, whereas the current Windows driver is still catching up a bit, and it's not exactly easy to implement or merge the code because it's significantly different. :)

    Adjusting GPU parameters should also be possible now, last I heard, and thanks to Proton from Valve, or just DXVK via Wine, there are 1500+ titles now that are completely playable, with more added frequently.
    Newer improvements in Vulkan have also improved things for more recent titles using features from D3D11.1 and on, which previously had graphical glitches or were unstable, and the open source community ensures a steady stream of improvements.


    Vega as a GPU architecture is also quite different in how it manages clock speeds, and more fluid, with parameters fluctuating up and down under a number of different conditions. I was curious as to why AMD made a big deal of 18.10.1 and the new driver branch, but the recent posts confirming far better scaling and balance for voltage and max clock speeds show there are deeper changes than the release notes would indicate. I'm guessing this is also making its way into the newest Linux 18.40 and later core drivers, then the open source code, Mesa and the distros, so it should improve things beyond just Windows. :)
    (Though several of the code commits that have already landed include things like Vega20 workstation GPU code and even parts for Navi, so it might actually be ahead in some areas.)

    For HBM2 the target was 1 GHz at 1.2 V, I believe, but it ended up with Vega 56 hitting 800 MHz at 1.25 V and Vega 64 hitting 945 MHz at 1.35 V. Samsung supplies then came up short, so Hynix memory is often found on Vega 56 instead, which has a lower max clock before it artifacts or crashes.

    Besides voltage, the sensors also loosen timings once the modules reach 65 degrees Celsius, and there might or might not be a degradation issue with the extra voltage and higher clock speeds. Still, 1050 to 1100 MHz is where the memory really makes a good difference in performance, provided it stays within temperature margins before it starts throttling.

    An early issue with Vega 64 was the idle memory clock speed, 165 MHz I believe it was, causing some stability concerns, but I think that's sorted now. Vega 64 has 165 MHz, 500 MHz and the full speed 945 MHz by default, whereas Vega 56 uses 700 MHz and full speed 800 MHz. I'm on Hynix myself, though, and it's the "Nano" PCB with a full-length cooler, so there might be differences from that too.
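
To put the memory clocks above into bandwidth terms, here is a rough Python sketch. It assumes the Vega 10 memory layout of a 2048-bit HBM2 bus with two transfers per clock, which is not stated in the post, and the function name is made up for illustration.

```python
def hbm2_bandwidth_gbs(mem_clock_mhz, bus_width_bits=2048):
    """Theoretical peak bandwidth in GB/s for a double data rate memory bus.

    bus_width_bits=2048 is an assumption (two 1024-bit HBM2 stacks).
    """
    transfers_per_second = mem_clock_mhz * 1e6 * 2  # DDR: two transfers per clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Clocks mentioned in the thread: Vega 56 stock, Vega 64 stock, a typical overclock
for clock in (800, 945, 1100):
    print(f"{clock} MHz -> {hbm2_bandwidth_gbs(clock):.1f} GB/s")
```

Under those assumptions, going from 945 MHz to 1100 MHz is roughly a 16% increase in theoretical bandwidth, which fits the point that memory clocks matter a lot on Vega.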


    For the core clock, voltage and power draw were the main drawback. AMD uses a very high stock voltage to reach target speeds on both Vega 56 and Vega 64, with the 64 and the water cooled edition pushing a bit higher, and as a result a lot of GPUs can scale down 100-150 or even 200 mV without dropping clock speeds much, if at all. If there's plenty of power available, the other way to do it is to simply ensure the GPU has enough power to feed it while maintaining temperature, which might see higher clock speeds still.

    From the discussion on the improved scaling in 18.10.1 and newer, it looks like AMD has significantly improved the automatic behavior. It works best at or near 100% load, where it can now scale up to 1730 MHz if no limits are holding it back, and it regulates voltage as needed up to the maximum cap, which I believe is 1250 mV, though it's rarely that high.
    Makes me wonder if a higher power limit, or a modded higher-than-default max power limit, would see even higher gains with these improvements. :)

    The card does eventually tap out at around 1700 MHz core and around 1100 MHz memory clocks, though benchmarks might still see some gains above that. From the results I've compared, it looks like you can get up to an 8-10% performance increase before it drops off, with memory being responsible for almost half of that. That's not the highest gain from overclocking, but it does show that memory speed and bandwidth are an important factor for Vega.


    Overall, or TL;DR: the card is an interesting piece of tech even if some features didn't make it into the release. The numerous parameters, thresholds and limits can make it a bit troublesome to tweak, but it looks like AMD is now making some nice improvements in this regard. The main thing is that the card needs a bit of extra power and you have to keep it within the temperature margins, which shouldn't be a problem with a water cooling kit.

    I don't believe a water cooling BIOS is going to yield any major improvements, but I can't say for sure, and I'm not 100% certain how reliable flashing it would be either. You do have a backup BIOS switch on the GPU, but with the different PCB and components it might not be completely stable, and other reports are pretty inconclusive, with some saying it works and others saying the GPU just crashes until they switch BIOS position and flash back the original.
    (Not too sure what else is in the BIOS either; the editor or viewer might only cover some of the total data in there, though comparing might show some of the main differences between the stock, water cooled and Sapphire Nitro BIOS settings.)

    There's the Nitro+ LE as well. The main difference is the cooler, a triple fan vapor chamber design with a backplate, but it also has an extra 8-pin connector. Even the regular Nitro+ has PCB differences, so you wouldn't want to risk messing with the power delivery and whatever else Sapphire has changed that might differ in the BIOS file.
    (Better to be sure before flashing, just to avoid a worst case scenario, even if there is a backup BIOS via the flip switch that lets you boot from the secondary BIOS and reflash the primary back to default.)


    EDIT: And if AMD is improving the driver's automatic management of settings to the point where it can exceed manually tuned voltage and clocks, that's quite a change too and might be the ideal for these cards now, although a higher power limit is still important to remove that restraint. :)
    (And then temps, but that's the usual I guess. Instead of just running into throttling, there are a few different steps here that, together with total workload, determine the final clock speeds as it boosts up to 1650 or 1700 MHz or higher.)


    Bit of a longer reply than I was thinking it would be. I'm also still learning how these cards operate and what their limits are, so I'm by no means an expert, but Overclockers, this forum and even AMD's Reddit hangout (even if it can be a bit biased at times) have given some really good info and insights into how this GPU works and what its limits are.
     
    Last edited: Nov 2, 2018
    MaestroBayten likes this.
  8. JonasBeckman

    JonasBeckman Ancient Guru

    Messages:
    17,564
    Likes Received:
    2,962
    GPU:
    XFX 7900XTX M'310
    Also, water cooling should ensure you don't have a problem with this, but if I'm not remembering this entirely wrong, Vega on air hits its temperature limit at 95 degrees Celsius, which is where the card will shut down, while the Vega 64 on water has its limit set to 75 degrees in the BIOS, as I recall, so a good bit lower. That matters less on water, but on air, and depending on ambient temp, the GPU core can already hover around 60-70 degrees in demanding workloads. Just something to be aware of if you flash; with a water block in place and temps around 30-35 degrees it should be just fine, though it's one area where these BIOSes differ. :)
    (Even the stock model Vega 64 has a pretty impressive PCB, well above the specs of what is needed, and the water cooler for that edition is also quite effective and well built, so it's not a problem there, though blindly flashing the BIOS on other models with inadequate cooling could be.)

    As for memory, I'm not entirely sure what the temperature threshold for the HBM2 modules is, but the timings loosen at 65 degrees Celsius as the first step where performance drops. Voltage should only have a single value, 1.25 V for Vega 56 and 1.35 V for Vega 64, plus the varying idle speeds, though those aren't a factor when the card is under load. :)
    (The voltage setting for it in Wattman mostly acts as a floor value, i.e. the minimum the GPU core will drop to.)


    EDIT: Ah, there's the info I was looking for regarding Linux and adjusting the GPU parameters a bit.
    https://www.reddit.com/r/Amd/comments/8weeln/you_can_undervolt_vegas_in_linux_now/

    Although from the looks of it, the newer kernels already have support for it, and that works better than using a mask like this for the PowerPlay states.
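
For reference, a minimal Python sketch of what the kernel-side route can look like. It assumes the amdgpu sysfs files `pp_od_clk_voltage` and `power_dpm_force_performance_level` under `/sys/class/drm/card0/device` (the card index varies, and overdrive may first need enabling via the driver's feature mask), and the clock/voltage pairs are purely illustrative, not values from this thread's cards. By default it only prints what it would write.

```python
from pathlib import Path

# Assumed sysfs location; the card index differs between systems.
DEVICE = Path("/sys/class/drm/card0/device")


def build_od_commands(sclk_states):
    """Turn {state: (clock_mhz, millivolts)} into pp_od_clk_voltage command strings."""
    commands = [f"s {state} {mhz} {mv}" for state, (mhz, mv) in sorted(sclk_states.items())]
    commands.append("c")  # "c" commits the edited table
    return commands


def apply_commands(commands, device=DEVICE, dry_run=True):
    """Print the intended writes, or perform them when dry_run is False (needs root)."""
    if not dry_run:
        # Manual mode is required before the table can be edited.
        (device / "power_dpm_force_performance_level").write_text("manual\n")
    for cmd in commands:
        if dry_run:
            print(f"echo '{cmd}' > {device / 'pp_od_clk_voltage'}")
        else:
            (device / "pp_od_clk_voltage").write_text(cmd + "\n")


# Illustrative values only, not a recommendation for any particular card.
apply_commands(build_od_commands({6: (1536, 1050), 7: (1630, 1150)}))
```

Keeping the command building separate from the writing makes it easy to check the strings before touching the hardware.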

    It would certainly be convenient to just set the values for voltage, fan parameters, clock speeds and such directly in the BIOS, test for stability and then be done with it. Unfortunately, yeah, even the non-Frontier Vega GPUs check for a valid BIOS in hardware, so I don't believe there's currently any way past that.

    Viewing is possible, at least for comparison's sake, but beyond that there's not much you can do with the BIOS other than flashing valid BIOS files, and how well that works on customized models seems pretty varied from the results I've been reading about.
     
    Last edited: Nov 2, 2018
  9. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    And the problems are back, FPS drops again... Is it maybe the new driver changing the voltage, like I read in some posts on the Vega thread?
     
    Last edited: Nov 2, 2018
  10. JonasBeckman

    JonasBeckman Ancient Guru

    Messages:
    17,564
    Likes Received:
    2,962
    GPU:
    XFX 7900XTX M'310
    Hmm, from what I've read it works best when GPU load is at or near 100%, so if load is fluctuating, maybe the automatic adjustments aren't doing a good job and it drops clock speeds by a bigger margin, hence a larger performance drop?

    EDIT: Well, if that is the case, anything that monitors GPU load and clock speeds would show how it's changing without too much difficulty. Radeon Settings' own overlay might suffice.
    If that looks fine, then the framerate drop comes from elsewhere, which could be anything from game or game engine specifics to driver quirks or other reasons entirely.
    (It's quite a complex thing; even the type of workload the GPU is doing can give varied results when overall load is still near 100%.)
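
As a rough illustration of that triage, here is a small Python sketch that sorts one monitoring sample taken during an FPS drop into three buckets. The function name, the 90% load threshold and the 5% clock tolerance are all assumptions chosen for the example, not values from any tool or driver.

```python
def classify_drop(gpu_load_pct, core_clock_mhz, normal_clock_mhz,
                  load_threshold=90, clock_tolerance=0.05):
    """Rough triage of one sample (GPU load and core clock) taken during an FPS drop."""
    clock_dropped = core_clock_mhz < normal_clock_mhz * (1 - clock_tolerance)
    if gpu_load_pct >= load_threshold and clock_dropped:
        # Fully loaded but clocking down: power, thermal or voltage limits are suspects.
        return "throttling"
    if gpu_load_pct >= load_threshold:
        # Fully loaded at the usual clock: the scene is simply GPU-heavy.
        return "gpu-bound"
    # GPU is waiting on something else: CPU, game engine or driver.
    return "not-gpu-bound"


# Hypothetical samples: (load %, clock during drop, usual clock)
for sample in [(99, 1450, 1650), (99, 1640, 1650), (25, 1200, 1650)]:
    print(sample, "->", classify_drop(*sample))
```

A low-load sample with a low clock lands in the last bucket on purpose, since the card clocks down by itself when it isn't being fed enough work.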
     

  11. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    Most of the drops are in MOBAs, but the GPU usage there is around 10-30%. I'm doing some OCs on my CPU to see how that works out.

    Maybe some games are just poorly optimized for AMD.
    I've read some posts about the better auto voltage. Do you guys think I should keep my UV settings or go for the auto voltage?
     
    Last edited: Nov 5, 2018
  12. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,838
    GPU:
    TiTan RTX Ampere UV
    Test one then the other then pick better ;)
     
  13. evolucion888

    evolucion888 Guest

    Messages:
    370
    Likes Received:
    5
    GPU:
    MSI Radeon RX 6900
    I have an MSI RX Vega 64 Liquid Cooling edition, and with Far Cry 5 maxed at 2K, using the Turbo profile or auto voltage (not Balanced), the core clock averaged 1,660 MHz with power consumption around 355 W. I noticed that Vcore was 1150 mV at P6 and 1250 mV at P7. So for testing, I dialed it down to 1050 mV for P6 and 1150 mV for P7, and now the core clock is stable around 1,685 MHz with power consumption around 280 W. I also tried 1025 and 1125 mV, but that changed nothing further. So AMD definitely loves to overvolt their GPUs for the sake of yields.
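
The before/after figures in this post can be put into percentages with a few lines of Python. This is only arithmetic on the numbers quoted above; the function name is made up, and "MHz per watt" is a crude efficiency proxy rather than a real performance metric.

```python
def undervolt_summary(before, after):
    """Compare two (core_clock_mhz, power_w) measurements, returning percentages."""
    clock_before, power_before = before
    clock_after, power_after = after
    power_saving_pct = (power_before - power_after) / power_before * 100
    clock_gain_pct = (clock_after - clock_before) / clock_before * 100
    # MHz per watt, used here as a crude efficiency proxy
    efficiency_gain_pct = ((clock_after / power_after) / (clock_before / power_before) - 1) * 100
    return power_saving_pct, clock_gain_pct, efficiency_gain_pct


# Stock voltages vs. the 1050/1150 mV undervolt reported above
saving, clock_gain, efficiency = undervolt_summary((1660, 355), (1685, 280))
print(f"power saved: {saving:.1f}%  clock gained: {clock_gain:.1f}%  MHz/W gained: {efficiency:.1f}%")
```

That works out to roughly 21% less power for about 1.5% more clock speed.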
     
    Maddness likes this.
  14. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    Not working for me.
    If I keep everything on stock and UV P6-P7, it will crash.
    I need to reduce the P7 frequency to 1647 MHz max so it won't crash (while undervolted). Anything above that causes the GPU to crash: black screen, some white and purple stripes, and audio keeps playing until it reboots after a minute or two.
     
  15. OnnA

    OnnA Ancient Guru

    Messages:
    17,981
    Likes Received:
    6,838
    GPU:
    TiTan RTX Ampere UV

  16. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    On that forum, there is a guy named 'miklkit'.
    He has the same problem: the card seems stable, but at some points it just throttles down.
    I noticed it yesterday while playing Destiny 2 at max settings: an average of 70-110 FPS, and then it goes down to 59-70 FPS. At some points on the map it also throttles to around that "low".
    Temps look good on CPU and GPU.
    CPU ~50-55 °C, usage 40-50% max.
    GPU ~53-56 °C, usage 98-100%.
     
  17. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    I was wondering, since I bought the GPU used:
    If another BIOS is installed, could I check it somehow?
    Could it cause the card to be unstable?
    Does the BIOS get deleted when you put the card in a different PC or reinstall drivers?
     
  18. -Tj-

    -Tj- Ancient Guru

    Messages:
    18,107
    Likes Received:
    2,611
    GPU:
    3080TI iChill Black
    No, the BIOS is permanent :)

    You can check this page for stock bios
    https://www.techpowerup.com/vgabios/?manufacturer=AMD&model=RX+Vega+64


    That FPS drop is still a bit strange. What happens in the VRM area, temperature-wise?


    I don't know, there seems to be some throttling going on. Maybe check the PCIe settings in the BIOS and disable any power saving features. On my ASUS mobo there is an ASPM option for PCIe; you might want to look into that if you have something similar.


    One is under PCH:
    PCIe DMI Link ASPM Control

    The other is under NB PCIe:
    DMI Link ASPM Control
    PEG-ASPM

    ASPM is Active State Power Management.



    I disabled them as a precaution. Not 100% sure it fixes everything, but I don't need such a feature on a desktop... on a laptop maybe, to get that extra battery life.
     
    MaestroBayten likes this.
  19. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    Alright, GPU BIOS seems to be the right one.

    I could not find any of the settings you mentioned on my MB.

    Temps are fine. I'll check the voltage and attach some pics tomorrow.

    EDIT1 :
    Idle[​IMG]
    Normal FPS[​IMG]
    FPS DROP
    [​IMG]

    EDIT 2:
    I saw the high frequency there and lowered it a bit; it seems to be more stable after all.
    [​IMG]
    Seems the GPU tries to troll me over and over again.
    For now I'm keeping the UV on P6 and P7. OverdriveNTool did not work for me and mostly caused instability; with Wattman it seems to work fine after all.
    I'll try some more things and see if I can lower the wattage of the GPU.
     
    Last edited: Nov 8, 2018
    -Tj- likes this.
  20. MaestroBayten

    MaestroBayten Member Guru

    Messages:
    124
    Likes Received:
    29
    GPU:
    Nitro+ 6950xt
    Do any of you use two different monitors?

    I'm on a 144 Hz and a 60 Hz one; could someone check whether it makes a difference using two different refresh rates?
    Also whether it makes a difference using 2x DP or 1x DP/1x HDMI.
     
    Last edited: Nov 9, 2018
