Generating higher Radeon memory straps: a method of inquiry [WIP]

Discussion in 'Videocards - AMD Radeon' started by user1, Nov 27, 2018.

  1. user1

    user1 Ancient Guru

    Messages:
    2,777
    Likes Received:
    1,299
    GPU:
    Mi25/IGP
    In this post I will detail a crude method of converting the Radeon memory timings back to raw values (nanoseconds)
    in order to create new straps for higher frequencies.

    This is not perfect and should be considered incomplete. Many timings depend on others, and those details will not be discussed here, as I do not have the appropriate documentation.

    To begin with, the straps must be extracted from a vBIOS. This can be done manually, using the method described here (https://www.overclock.net/forum/67-amd/1561372-hawaii-bios-editing-290-290x-295x2-390-390x.html under "memory timings modding"), or via tools like the Polaris BIOS Editor.

    Then they must be decoded via this tool (https://www.overclock.net/forum/67-amd/1629357-r_timings-encode-decode-rx-r9-memory-straps.html).

    The R9 timing strap decode works back to the HD 6000 series; older cards like the HD 5000 and 4000 series use a different encoding that is not well documented.

    We will look at a timing that is important for stability at higher memory frequencies:

    tRFC/RAS2RAS (these should be the same value)

    The first thing that must be done is determining the default settings for the memory.

    In my case we take the 2000mhz strap and the 1875mhz strap.
    We must find the base clock cycle length for both (a 1mhz clock has a 1000ns cycle, so the cycle length in ns is 1000 divided by the clock in mhz):

    1000ns/2000mhz
    0.5ns per clock
    1000ns/1875mhz
    0.533333ns per clock

    Then we take the timing values:
    206 tRFC for 1875mhz
    219 tRFC for 2000mhz

    and we multiply them by the clock cycle length:
    1875mhz
    0.533333 x 206 = 109.87ns
    2000mhz
    0.5 x 219 = 109.5ns

    As you can see, the true tRFC time in nanoseconds is basically the same across the straps (in this case about 110ns).
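
For anyone who would rather script this than use a calculator, the conversion from clock cycles to nanoseconds can be sketched in Python roughly like this. The function names are made up for this example, and the numbers are only the tRFC values from my two straps, so treat it as an illustration and not a tool.

```python
def clock_period_ns(mclk_mhz):
    """Length of one memory clock cycle in nanoseconds."""
    return 1000.0 / mclk_mhz


def timing_to_ns(cycles, mclk_mhz):
    """Convert a timing given in clock cycles to nanoseconds."""
    return cycles * clock_period_ns(mclk_mhz)


# tRFC values decoded from the 1875mhz and 2000mhz straps in this post
straps = {1875: 206, 2000: 219}

for mclk_mhz, trfc_cycles in straps.items():
    trfc_ns = timing_to_ns(trfc_cycles, mclk_mhz)
    print(f"{mclk_mhz}mhz: tRFC = {trfc_ns:.2f}ns")
```

If both straps come out at nearly the same nanosecond value, that value is a reasonable estimate of what the memory actually needs.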

    From here we can generate a timing for a given frequency.

    Say we want a 2250mhz value for tRFC:
    1000ns/2250mhz = 0.444444ns per clock

    Then we take 110ns / 0.444444ns = 247.5, which would be the new value for tRFC.

    This process can be repeated for most of the timings to generate "safe" values that are lower risk.
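
The reverse step, going from a nanosecond value to a cycle count at a new clock, can be sketched the same way. The helper names are hypothetical, and rounding up to a whole number of clocks is my own assumption (the decoded values are whole numbers, and rounding up keeps the real time from dropping below the target); it is not something the decode tool prescribes.

```python
import math


def cycles_for_frequency(timing_ns, mclk_mhz):
    """Clock cycles needed to cover timing_ns at mclk_mhz (unrounded)."""
    return timing_ns * mclk_mhz / 1000.0


def safe_cycles(timing_ns, mclk_mhz):
    """Assumption: round up so the real time never falls below timing_ns."""
    return math.ceil(cycles_for_frequency(timing_ns, mclk_mhz))


trfc_ns = 110.0  # estimated from the stock 1875mhz and 2000mhz straps

print(cycles_for_frequency(trfc_ns, 2250))  # 247.5
print(safe_cycles(trfc_ns, 2250))           # 248
```

The same two functions can be pointed at any other timing once its nanosecond value has been worked out from the stock straps.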

    This should work with almost any type of memory as well, so long as the true memory frequency is used.


    Using this for tRC and tRFC/RAS2RAS seems to have the largest benefit for stabilizing higher frequencies.

    The memory controller straps, which unfortunately cannot be decoded at the moment, are also important. In my case, using a higher MC strap allowed me to get past a "white screen wall" that appeared when exceeding a certain frequency, with no errors showing up until that point.
    You may have some success grabbing the MC_SEQ_MISC1, MC_SEQ_MISC3 and MC_SEQ_MISC8 values from a 2250mhz strap that exists on some RX 580s, though those values are partially memory IC vendor specific and may cause problems if used with another vendor's memory (using values from Hynix on Samsung memory would be bad). They are also tied to the tCL value.

    I think that exceeding 2250mhz should be possible on some Polaris cards, provided higher VDDCI is used along with looser memory controller straps and higher tRC and tRFC/RAS2RAS.



    Using this method I was able to get another 100mhz stable on Micron memory whilst reducing the tCL value.

    Performance at higher resolutions / higher AA levels benefits the most.
     
    Embra and OnnA like this.
  2. SpajdrEX

    SpajdrEX Ancient Guru

    Messages:
    3,417
    Likes Received:
    1,672
    GPU:
    Gainward RTX 4070
    Any pics of performance difference at higher resolution/AA?
     
  3. user1

    user1 Ancient Guru

    Messages:
    2,777
    Likes Received:
    1,299
    GPU:
    Mi25/IGP
    It's like +2% for me. It lets me drop the core clock by 30mhz or so for the same performance, which saves me some core voltage.
    This is more for those with already high core clocks, where Polaris is memory bottlenecked and the memory won't go any higher, and/or for tuning the timings for a specific frequency.

    In my case the card would "pass" benchmarks at 2130mhz, but with errors, so tuning the tRC and tRFC eliminated the errors at that frequency whilst retaining most of the performance gain.

    This is just a way to estimate what the timing values "should" be, in order to eliminate some guesswork when lacking proper datasheets.

    Very niche stuff.
     
    hemla and SpajdrEX like this.
  4. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    This Strap on RX-580: "777000000000000022AA1C0031627C46A05510153C8F050B006AE60004061420EA8940AA030000001914292EB0313C18"
    RX-580 vs Fury X.
    It is still not final, but it is OK.
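
For anyone wanting to poke at a strap like this outside the decode tool, here is a rough Python sketch that splits the hex string into 32-bit words. Reading the words as little-endian is an assumption, the function name is made up, and the sketch does not try to name or decode any timing fields; that part is what the R_Timings tool is for.

```python
# Strap string quoted in this post, split in two only to keep the line short
STRAP = (
    "777000000000000022AA1C0031627C46A05510153C8F050B"
    "006AE60004061420EA8940AA030000001914292EB0313C18"
)


def strap_words(hex_strap):
    """Split a hex strap into 32-bit words (assumed little-endian)."""
    raw = bytes.fromhex(hex_strap)
    if len(raw) % 4 != 0:
        raise ValueError("strap length is not a multiple of 4 bytes")
    return [
        int.from_bytes(raw[i:i + 4], "little")
        for i in range(0, len(raw), 4)
    ]


words = strap_words(STRAP)
print(f"{len(words)} words")
for index, word in enumerate(words):
    print(f"word {index:2d}: 0x{word:08X}")
```

Comparing the word lists of two straps side by side is a quick way to see which parts changed between them.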
     
    user1 likes this.

  5. user1

    user1 Ancient Guru

    Messages:
    2,777
    Likes Received:
    1,299
    GPU:
    Mi25/IGP
    Minor update: I was able to get 2200mhz mclk "stable" with garbage-tier Micron ICs. It seems that at 1400mhz core I get about 16% more bandwidth over (stock) 1750mhz mclk. The most interesting thing I noticed, however, was that going to 1450mhz core gave an extra 3GB/s. This did not happen at stock mclk or 2000mhz mclk, which suggests that the memory controller is in part choked by the core clock at higher speeds.

    I would guess that, in order to actually get a meaningful benefit from >2250mhz mclk, you probably need the core clock to be 1500mhz+ to see improvements from the increased bandwidth.

    It seems that for lower core speed Polaris cards (and lower SP counts), tightening the timings for <2100mhz mclk would probably be the better route.
     
  6. robolee

    robolee Member

    Messages:
    35
    Likes Received:
    1
    GPU:
    NITRO+ RX 570 4GB
    Is 1000ns the baseline for the formula?
    You also mention tRC, so what is the formula for it? :D
     
    Last edited: Dec 17, 2018
  7. user1

    user1 Ancient Guru

    Messages:
    2,777
    Likes Received:
    1,299
    GPU:
    Mi25/IGP
    This method is kind of pointless now. AMD's new drivers seem to set timing values automatically past the straps in the vBIOS of the card. For instance, prior to 18.12.2, my card at stock would not run anything past 2135mhz memclk and was highly unstable at that, and it was not completely stable past 2040mhz. Now I've set it as high as 2175mhz and it's perfectly stable as far as I can tell. It is not a placebo, as the bandwidth is higher than was ever possible before.
     
  8. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    Nothing prevents you from re-strapping the vBIOS. I removed the 300MHz strap and added a 2250MHz one.
     
  9. user1

    user1 Ancient Guru

    Messages:
    2,777
    Likes Received:
    1,299
    GPU:
    Mi25/IGP
    Never said you couldn't. I just find that it may not be worth the effort, since the auto tuning the driver does is pretty decent on its own.

    Update:
    I was able to set 2275mhz memclock and run a couple of tests; AMD has definitely implemented straps via the driver. I tried 2300mhz but hit the gray screen problem, indicating IMC failure (kind of a lot to ask). I think this approach to memory clocking will greatly benefit pretty much any AMD card that supports the new actimingtuning mechanism AMD has implemented.

    A very interesting development. I'm really curious how Vega or newer cards react to these changes.
     
    Last edited: Dec 19, 2018
