Why is RTSS FPS Limiting more accurate than other framerate limiters?

Discussion in 'Rivatuner Statistics Server (RTSS) Forum' started by BlindBison, Apr 7, 2023.

  1. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    I have seen people suggest that in-game limiters are not as accurate because they are geared towards getting the lowest input delay rather than having the most perfect frametimes. Fair enough, though it surprises me that at least some devs would not opt for the more accurate frametime approach over raw latency. To date, using RTSS to monitor performance, I have not found an in-game limiter that adheres as strictly to the set frametime limit as RTSS seems to.

    This even appears to extend to other external limiters. For example, when I limit the framerate with Nvidia's driver-level limiter or their V2 limiter from Inspector, although both are highly accurate in frametime, the result seems to veer just slightly to one side or the other of the set framerate when reaching the cap, whereas the readings indicate RTSS does not veer at all.

    I'm forgetting the name, but I also tested another application with framerate-limiting capability a while back, and the frametimes, when measured with RTSS, were much less stable than with RTSS's own limit. Is the answer that it's complex, on the development end of things, to make a framerate limiter as accurate as RTSS, and thus it simply is not done very often? Or perhaps Nvidia and RTSS are very similar and it comes down to how RTSS measures the frametimes?

    I'm curious to know what differs in the implementations of these limiters that gives them such different levels of accuracy. From what I read, RTSS and Nvidia's new limiter are CPU-based limiters so as to more closely control frametimes, but I do not have any technical knowledge of or insight into what that really means. Anyway, if anyone happens to know, I'd love to learn more about how framerate limiting actually works and why these differences exist.
     
    Last edited: Apr 7, 2023
  2. RealNC

    RealNC Ancient Guru

    Messages:
    5,105
    Likes Received:
    3,380
    GPU:
    4070 Ti Super
    There are several factors, but the most important one is how a limiter waits for the required amount of time to elapse. There are two choices:

    The first method is to tell the OS to get back to you (where "you" is the thread that is currently executing) after a certain amount of time and put you to sleep. This means an OS timer will be used. When the timer expires, the OS will wake you up and execution is resumed. This is the less accurate method, since OS timers have a rather coarse accuracy (usually 1ms) and also there's some scheduling delay, especially if the CPU has entered a low power state (which is the main benefit of putting processes or threads to sleep.)

    The second method is to not sleep at all, but instead keep checking the current time over and over again, until the target time is reached. This is very accurate (microseconds instead of milliseconds) since you're not using an OS timer. Also, thread/process scheduling delays are lower because the code that checks for the current time is constantly running and prevents the CPU from entering lower power states. This method is usually called "busy waiting".
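    The contrast between the two methods can be sketched in a few lines (Python purely for illustration; `time.sleep` stands in for an OS timer wait and `time.perf_counter` for a high-resolution clock):

```python
import time

def sleep_wait(duration):
    """Method 1: hand the wait to the OS scheduler.
    Wake-up accuracy depends on the OS timer resolution (often ~1 ms)."""
    time.sleep(duration)

def busy_wait(duration):
    """Method 2: spin on a high-resolution clock until the deadline.
    Microsecond-level accuracy, but burns a CPU core while waiting."""
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        pass

def overshoot(wait_fn, duration=0.005):
    """How far past the requested duration the wait actually ran."""
    start = time.perf_counter()
    wait_fn(duration)
    return time.perf_counter() - start - duration
```

    On a typical desktop OS, `overshoot(busy_wait)` comes back in the microsecond range, while `overshoot(sleep_wait)` can be a millisecond or more depending on the timer resolution and CPU power state.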

    There's also a hybrid method that uses both (I suspect this is what RTSS does now by default, unless you explicitly configure it to only use busy waiting.) You use an OS timer and sleep for the majority of the time, but for the last few milliseconds you busy-wait. So if you need to wait, say, 12ms, you sleep and tell the OS to wake you in 9ms. The OS will then wake you maybe 8 or 10ms later, but now you busy-wait for the remaining time and thus remain accurate for the 12ms target. This still allows the CPU to sleep for the majority of the wait period. So it's the best of both worlds.
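    A minimal sketch of that hybrid wait (Python for illustration; the 2 ms spin margin here is an arbitrary value for the example, not what RTSS actually uses):

```python
import time

def hybrid_wait(target, spin_margin=0.002):
    """Sleep for most of the interval, then busy-wait the last bit.

    `target` is an absolute time from time.perf_counter();
    `spin_margin` is how long before the deadline we switch to spinning.
    """
    remaining = target - time.perf_counter()
    if remaining > spin_margin:
        # Coarse phase: let the OS put us to sleep, keeping a margin
        # to absorb scheduler wake-up jitter.
        time.sleep(remaining - spin_margin)
    # Precise phase: spin out the last couple of milliseconds.
    while time.perf_counter() < target:
        pass

def run_limited(fps, frames):
    """Pace `frames` iterations at `fps`; returns the timestamps."""
    frame_time = 1.0 / fps
    deadline = time.perf_counter() + frame_time
    stamps = []
    for _ in range(frames):
        # ... the game would render a frame here ...
        hybrid_wait(deadline)
        stamps.append(time.perf_counter())
        deadline += frame_time
    return stamps
```

    Because the deadline is advanced by a fixed `frame_time` each iteration rather than re-measured, small wake-up errors don't accumulate over time.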

    There are other factors as well, like where exactly in the frame rendering and presentation chain you actually do the waiting. RTSS intercepts the frame presentation functions of DirectX/OpenGL/Vulkan and does the waiting there. This is accurate, since frame presentation accuracy is what you actually care about. In-game limiters can do it at any point in the chain, which can mean target frame presentation times are going to be roughly correct, but not exact.
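    The placement difference can be sketched like this (Python, purely illustrative; `present` is a stand-in for the real Present/SwapBuffers call that a hooking limiter would intercept, and the class name is made up for the example):

```python
import time

class PresentPacer:
    """Illustrative: do the limiting wait immediately before the
    presentation call, the way a limiter that hooks Present can."""

    def __init__(self, fps, present):
        self.frame_time = 1.0 / fps
        self.present = present  # the hooked presentation function
        self.deadline = time.perf_counter() + self.frame_time

    def hooked_present(self, frame):
        # Waiting *here* evens out presentation intervals directly,
        # because nothing variable runs between the wait and present.
        while time.perf_counter() < self.deadline:
            pass
        self.deadline += self.frame_time
        return self.present(frame)

# An in-game limiter, by contrast, might wait at the top of its game
# loop, before simulation and rendering -- so variable work between
# that wait and the present call still leaks into presentation timing.
```
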

    Keep in mind that even small inaccuracies result in bigger FPS fluctuations the higher the FPS is. If you limit to 60FPS and the limiter is accurate to within 0.5ms overall (±0.25ms, the highest accuracy of Windows OS timers), it's gonna stick close to 60FPS (59 to 61FPS.) But at 300FPS, that same 0.5ms spread translates to anything between 279FPS and 324FPS.
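    That arithmetic can be checked directly (assuming the 0.5 ms figure means a total spread, i.e. ±0.25 ms around the target frametime):

```python
def fps_bounds(target_fps, spread_s=0.0005):
    """FPS range produced by a +/-(spread/2) error on the frametime."""
    ft = 1.0 / target_fps          # target frametime in seconds
    half = spread_s / 2.0          # worst-case error in each direction
    return 1.0 / (ft + half), 1.0 / (ft - half)

# The same 0.5 ms spread that is barely visible at 60 FPS...
lo60, hi60 = fps_bounds(60)       # roughly 59.1 to 60.9 FPS
# ...is enormous at 300 FPS, because the frametime is only ~3.33 ms.
lo300, hi300 = fps_bounds(300)    # roughly 279 to 324 FPS
```
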
     
    Last edited: Apr 7, 2023
    midgar, BlindBison and Andy_K like this.
  3. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Thank you very much, that's helpful to know! Yeah, I wondered how frametime variance would play out at higher framerates. That's probably why the "cap X frames beneath refresh" advice for G-Sync/FreeSync has X grow as you move to higher refresh monitors, since the same timing error that costs a fraction of a frame at 60 Hz is worth several frames at 240 Hz. That hybrid sleeping approach seems very clever. I'd be curious to know what Nvidia and RTSS do differently (if anything) in their approach to framerate limiting, but since the latency and frametimes seem very close to each other (very slightly favoring RTSS in frametime accuracy, going off the RTSS overlay readings), perhaps there are only slight differences within broadly the same CPU-based hybrid approach at this point.
     
  4. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    @RealNC Sorry to trouble you, but I have a follow up question regarding your comment here: https://forums.guru3d.com/threads/u...h-v-sync-unless-frame-limiter-is-used.434580/
    upload_2023-4-12_17-55-45.png
    and regarding Unwinder's comment here: https://forums.guru3d.com/threads/d...ate-any-pros-and-cons-to-either.444458/page-3
    upload_2023-4-12_17-56-17.png
    I have been testing traditional half refresh rate V-Sync on a laptop of mine and comparing it against a 30 FPS RTSS and/or driver and/or in-game cap on my G-Sync panel (I tried all of the above in several games like BFV/Overwatch/Sekiro, since you can set the cap internally with a mod for that last one).

    What I noticed was that 30 FPS with half refresh rate traditional V-Sync on my laptop appeared much smoother than capping to 30 on my G-Sync/FreeSync panel (motion blur disabled on both for testing this). iirc this is something @Smough noticed too, but his comparison was between console 30 (which usually uses half refresh rate traditional V-Sync) and PC 30, rather than PC half refresh V-Sync vs PC G-Sync 30. At first I chalked this up to differences in monitor technology, since different panels can have different amounts of persistence blur from what I read.

    But now, after reading these comments on how framerate limiting works, I am thinking it might come down to this: despite the CPU releasing the frame to the GPU at perfectly even intervals (which is apparently what the RTSS graph measures by default, if I've understood correctly), the GPU itself has variability in its render times. From there, since the GPU just outputs the frame as fast as it can once it has rendered it (whereas with V-Sync it would actually wait for the "perfect" interval to release the frame to the monitor), perhaps this is what causes the visible difference in "smoothness".

    If that's the case, maybe enabling "Prefer Max Performance" could improve consistency in frame presentation on the GPU side of things? Not sure of that, really. I would think that to achieve perfectly smooth motion you would need all 3 things to be true: an accurate CPU-side FPS cap + the GPU finishing the frame quickly then waiting to release it until just the right time + game engine logic that reads input properly without skipping frames or having weird issues. From there you'd want to measure both the accuracy of the CPU releasing the frame buffer and also track the timing/variance of when the GPU is releasing the frame to the monitor. Or alternatively perhaps I've just misunderstood something yet again ;) Thanks,
     
    Last edited: Apr 14, 2023

  5. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Did some testing in Sekiro with RTSS, the driver limiter, and the in-game limiter (as set with a mod), reading out the frametime in both "async" and "front edge" limiting modes, and with the frametime reading set to "frame start" and to "frame presentation".

    1) With RTSS and Nvidia's limiter, the "frame start" reading with async is basically flawless as shown by the RTSS overlay. Nvidia's seems to be just a tiny hair more wobbly, but maybe that's related to something intentional they did with their implementation that I don't understand. In any case, both are highly accurate in terms of the "frame start" measurement when using the async limiting mode.

    2) When looking at frame presentation instead of frame start with the RTSS overlay, interestingly the in-game cap, Nvidia's, and RTSS all produced roughly similar values as far as I could tell. The in-game cap was clearly less stable in terms of "frame start" readings when those were measured, though (even though I read that latency is supposedly better with good in-engine limiters).

    3) When using "front edge" sync, of course, the tables are flipped: the RTSS limiter now has perfectly stable frame presentation frametimes (as the note states in the settings), and if you monitor from frame start, that is now what is jittered/wobbly. Basically exactly what the settings note says. This didn't really look any smoother to my eye, though, with a 30 FPS cap + no motion blur (off for testing, since a cap makes gaps between frames a bit more obvious).

    4) I also tested this game with the power management mode for the Sekiro exe set to Normal and to Prefer Maximum Performance, and the frame start and frame presentation values in the overlay did not change much, if at all, that I could see.

    5) I also tested this game with ULLM ON vs OFF and again in the overlay readouts there were no changes I could see between the frame start and frame presentation overlay values shown with any of the limiters.

    6) All of this was with Passive Waiting ON (default) so maybe I'll retest with that OFF at some point and see how it goes.

    I think I might switch the RTSS overlay to frame presentation rather than frame start timings, since if I'm reaching my cap with Nvidia or RTSS, it's pretty much assured that the frame start times will be flawless/flat, so it's a bit more interesting to see the variation in frame present timings. Then again, perhaps I'm missing something, as frame start is presumably the default readout for a reason (I figure it's the default because the async option is the default, and that aims for flat frame start submits).
     
    Last edited: Apr 13, 2023
    midgar likes this.
  6. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,045
    Likes Received:
    7,381
    GPU:
    GTX 1080ti
    There will only be a difference between passive and active limiting when the host process is already under highly variable load, which you're more likely to find with an MMO than a single player rail shooter.
     
    BlindBison likes this.
  7. oneoulker

    oneoulker Member Guru

    Messages:
    193
    Likes Received:
    225
    GPU:
    NVIDIA RTX 3070
    To be fair, BlindBison, almost all external limiters intercept game rendering and add 1 frame of input delay. This can be huge, and it puts 30/40 FPS capping at a big disadvantage. I've mostly given up on external framecaps for 30/40 FPS targets unless the in-game cap is super busted.

    If you claimed Borderlands 3 through Epic when it was given free, go give it a try. It has a super high quality 30 FPS lock. Legit. Smooth, minimal input lag, superb aiming. I can literally pinpoint shoot enemies as if I'm at high framerate. No problemo.

    Any external frame cap, be it RTSS, NVCP, or Special K? Nope. I cannot anymore. I keep overshooting, I keep having problems aiming properly and putting the reticle on enemies even in basic situations.

    A built-in frame cap hits different. But sadly some games have busted frame caps. External frame caps work okay above 60 FPS, where the 1 frame of input delay does not affect the aiming mechanics too much. But at 30/40 FPS, it can literally make a game unplayable.

    TLOU became the second example to this. RTSS, NVCP 30 cap. I cannot hit enemies without aim assist. In game 30 FPS cap, aiming precisely moves as I want it to. I land shots without even using aim assist. Weird stuff to be honest. Problem is, frame pacing is busted with their in game cap.

    But you wouldn't find much of a crowd to discuss this, as 30/40 FPS is shunned by PCMR mentality.

    As some have mentioned, if there were an easy way to get the most optimal 30/40 FPS like on consoles, most people would be addicted to it and hard-pressed to upgrade. I am. I manage to find ways to make 30/40 FPS the most optimal way to play certain games, and I happily accept that as a solution rather than upgrading my hardware. Now imagine if it were readily available as an option.
     
    midgar and BlindBison like this.
  8. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Thanks that's helpful to know -- maybe I should test Guild Wars 2 or some such then.
     
  9. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    I hear you, thanks for explaining that. I have noticed something interesting with external vs internal caps in a few games like Arkham Knight and Sekiro (using a mod to manually set the in-engine cap). In those two, even when capping to an FPS value, I would weirdly still get what seemed to be streaming stutter or something like that from time to time. It was like whenever something specific happened in-game, the in-engine FPS limiter stalled. Capping to the same exact value with RTSS or the driver, relaunching the game, and retesting the same areas, I didn't see this happen: the external limiters were more stable in these specific games. I have not noticed anything like this in games like Hunt: Showdown, though, which also has an internal limiter.

    I used to think in-game limiters were just less accurate as a rule in exchange for lower latency, and in fairness that does seem to be generally true. But it's interesting how the frame presentation values are basically identical to what RTSS async produces in my tests, and it's the "frame start" values where the variance comes into play. Some games also seem to not really like being limited externally, and can have issues like camera animation stutter when limited externally in my tests (the MCC when I played it around launch, for example).

    I have a desktop with G-Sync, but my laptop is pretty old, so I find the Blur Busters low-lag V-Sync guide approach (Nvidia Inspector half refresh rate V-Sync plus a precise RTSS FPS cap, or an in-engine cap set to 30, though that can have bad frame pacing) generally produces rather smooth (for 30 FPS) camera motion at an "ok" input lag when I use a controller. 40 FPS looks so much smoother than 30 on a G-Sync panel to my eye, though. Of course you're right that in-game caps can often feel noticeably more "direct" in terms of input.
     
  10. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    For this, do you know whether the default "Passive" option uses a hybrid approach, where it hard-sleeps and then, for the last few moments, wakes up and waits actively?

    iirc RealNC had mentioned that was their theory for how the RTSS limiter works nowadays. I could not discern a difference in accuracy between the two modes by eye or by frametime readout in the couple of games I tested, but I have not tested an MMO. Thanks
     

  11. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,045
    Likes Received:
    7,381
    GPU:
    GTX 1080ti
    iirc, Unwinder said the number you set for the passive wait period is the % of the wait period, so yeah, that sounds right.
     
    BlindBison likes this.
  12. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Thanks good to know. I'm not really seeing a value that can be manually plugged in for that under the settings though:
    upload_2023-4-13_23-16-9.png
    There is of course "refresh period" / "injection delay" / "framerate averaging interval", but just a checkbox for the passive waiting. Given it's the default, my "bet" is that it's that hybrid solution like RealNC described, just without direct user control over passive vs active time. Nvidia's limiter is probably using a similar solution?
     
  13. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,045
    Likes Received:
    7,381
    GPU:
    GTX 1080ti
    direct profile edit required.

    PassiveWaitThreshold = 90
     
    BlindBison likes this.
  14. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Ah, gotcha thanks man
     
  15. Smough

    Smough Master Guru

    Messages:
    984
    Likes Received:
    303
    GPU:
    GTX 1660
    What I would like is for most game devs (or all of them, for that matter) to implement two different in-game FPS caps: one that is frametime-perfect, like RTSS, and the classic one with messed-up frame pacing for those who use it for some reason. No game that I know of so far has had an internal FPS limiter that matches RTSS or Nvidia's relatively recently added one. Some games have great-to-perfect frame pacing on their own without any sort of limiter; Control at 60 FPS is pinned at 16.6 ms with V-Sync on no matter what, insanely optimized in this regard. Devil May Cry 5 has this too in DX11 mode, so hats off to Capcom for actually caring and trying; in these games I don't need an RTSS FPS lock at all. But these cases are not the norm in PC gaming, sadly.
     
    Last edited: May 9, 2023
    BlindBison likes this.

  16. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    I wonder if it comes down to a latency trade-off. Basically all in-engine limiters seem to fluctuate in their frame start and frame present timings going off the RTSS overlay, but whatever their approach is, it does seem to reduce input lag somewhat, as per BattleNonSense's OW and BFV tests from a ways back. On the other hand, in some games I basically need to use an external limiter or the game doesn't look very smooth with the in-engine one (Arkham Knight iirc would just have weird hitches with the in-engine limiter that didn't happen as long as I was hitting my RTSS target).

    At the same time, though, I've been trying to use my eyes more than the frametime graph to determine "smoothness", since some games simply don't look smooth with an external limiter despite a perfectly flat frametime graph, so now I just sort of eyeball it. Sadly that is quite flawed in terms of objective accuracy, I imagine :( The reason may just come down to the way some games don't play nice with external FPS limiting, or maybe it has to do with the way they handle camera motion/input reading or some such, not sure.

    Not really related, but I wonder if the front edge sync mode for RTSS limits at both the frame start and the frame present, or just at the frame present, with the frame start left highly variable. I'll need to test that out. I had thought the frame start would be very jittery, sort of like how the frame present is very jittery when using the default async mode, but in Witcher 3 RT DX12, when using front edge sync, the frame start times were still very flat in the area I tested. Barely any deviation. I'll need to test again to make sure.
     
    Last edited: Apr 27, 2023
  17. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Sorry to trouble you again, but I went digging for this and haven't managed to locate it. For example, I checked the profiles here:

    C:\Program Files (x86)\RivaTuner Statistics Server\Profiles

    Then edited those in Notepad++. I do see the option "PassiveWait=1", but I do not see an option for "PassiveWaitThreshold". I'll keep searching through the install dir; I expect I'm looking in the wrong place.
     
  18. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,045
    Likes Received:
    7,381
    GPU:
    GTX 1080ti
    ProfileTemplates in Global
     
    BlindBison likes this.
  19. BlindBison

    BlindBison Ancient Guru

    Messages:
    2,419
    Likes Received:
    1,146
    GPU:
    RTX 3070
    Thanks a lot -- yup, I see it now under: C:\Program Files (x86)\RivaTuner Statistics Server\ProfileTemplates\Global file. Much appreciated!
     
  20. RealNC

    RealNC Ancient Guru

    Messages:
    5,105
    Likes Received:
    3,380
    GPU:
    4070 Ti Super
    I don't think you can.
     
    BlindBison likes this.
