True Double Buffering

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by BmB23, Jun 7, 2022.

  1. BmB23

    BmB23 Active Member

    Messages:
    93
    Likes Received:
    34
    GPU:
    GTX 1660 6GB
    You can see the game's framerate limiter is set to 144 and Special K's is disabled, and the present interval is 1, so it's not 1/2-rate sync. The only thing keeping it at 60 is double buffering, because my GPU is not powerful enough to run it at 120. If I increase the settings, it drops to 40, the next division of the refresh rate. This is the behaviour you are looking for to verify double buffering is active.

    Explicitly setting 2 buffers does not work in any other game, and I haven't set the buffer count in this one either; as you can see, it's still -1. It doesn't work in Left 4 Dead, a game that should only have double buffering in the first place. The really interesting thing is that if you enable the in-game vsync option, it's still triple buffered, but if you force vsync on with Special K, it's double buffered. Forcing it on in other games just gives triple buffering. Whatever makes it work is unique to Fallen Order.
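
    For reference, this is roughly what those two settings map to at the DXGI level -- a minimal sketch, not Special K's actual code, with the window handle and format as placeholders:

    #include <d3d11.h>   // links against d3d11.lib

    // Sketch of "buffer count = 2, present interval = 1" in plain D3D11/DXGI terms.
    IDXGISwapChain* CreateDoubleBufferedSwapChain(HWND hwnd,
                                                  ID3D11Device** device,
                                                  ID3D11DeviceContext** context)
    {
        DXGI_SWAP_CHAIN_DESC desc = {};        // width/height left at 0 = use window client size
        desc.BufferCount       = 2;            // flip model counts all buffers, so 2 ~ classic double buffering
        desc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
        desc.BufferUsage       = DXGI_USAGE_RENDER_TARGET_OUTPUT;
        desc.OutputWindow      = hwnd;
        desc.SampleDesc.Count  = 1;
        desc.Windowed          = TRUE;
        desc.SwapEffect        = DXGI_SWAP_EFFECT_FLIP_DISCARD;

        IDXGISwapChain* swapChain = nullptr;
        D3D11CreateDeviceAndSwapChain(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                                      nullptr, 0, D3D11_SDK_VERSION,
                                      &desc, &swapChain, device, nullptr, context);
        return swapChain;
    }

    // Per frame:
    //     swapChain->Present(1, 0);   // SyncInterval 1 = wait for the next vblank;
    //                                 // with only two buffers, missing a vblank drops
    //                                 // you to the next divisor of the refresh rate
    //                                 // (120 -> 60 -> 40), which is what I'm seeing.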

    I was only trying Special K here to see if the texture cache could do something about the notorious stuttering the game has.
     
    Last edited: Jun 12, 2022
  2. Sajittarius

    Sajittarius Master Guru

    Messages:
    490
    Likes Received:
    76
    GPU:
    Gigabyte RTX 4090
    Fallen Order is Unreal Engine 4, IIRC, so you may have the same luck with other games on that engine. Ironic, because Kaldaien (Special K's creator) has mentioned multiple times on other forums that Unreal is a pain to work with.

    Also, all Unreal Engine games have config files somewhere, and you will usually find the fps/frame-limit/buffering settings there, typically in Engine.ini or GameUserSettings.ini. That might give a clue as to which settings the engine is using to achieve this; a couple of the usual knobs are sketched below the paths.

    Fallen Order config locations:
    %LOCALAPPDATA%\SwGame\Saved\Config\WindowsNoEditor\GameUserSettings.ini
    %USERPROFILE%\Saved Games\Respawn\JediFallenOrder\GameUserSettings.sav (probably encrypted, can't just open it in Notepad)
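
    The usual UE4 knobs look something like this (generic engine settings, not verified against what Respawn's build actually reads):

    ; GameUserSettings.ini -- stock UE4 settings
    [/Script/Engine.GameUserSettings]
    bUseVSync=True
    FrameRateLimit=0.000000

    ; Engine.ini -- console variables can also be forced here
    [SystemSettings]
    r.VSync=1
    t.MaxFPS=0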
     
  3. BmB23

    BmB23 Active Member

    Messages:
    93
    Likes Received:
    34
    GPU:
    GTX 1660 6GB
    Alright, I tried it on Maneater, another UE4 game, and it worked: once again the in-game vsync option is triple buffered, but forcing it with Special K gives double buffering. I then tried a Unity game, Tiny Combat Arena, and there, even explicitly setting the buffer count to 2 still gives triple buffering.

    It seems that UE4 somehow allows it, but only in combination with Special K. If you know of another program that can force vsync on, it would be interesting to test with that.
     
  4. Martigen

    Martigen Master Guru

    Messages:
    535
    Likes Received:
    254
    GPU:
    GTX 1080Ti SLI
    I was reading the Discord today and saw this; lucky I thought of you :) --

    "for DXGI, yes,
    but that's double buffering there
    you need 3 in DXGI
    2 is for triple buffering in D3D9
    because naming mismatch"

    In short -- for DX11 games, set buffers to '2' for double buffering; for DX9-based games, set it to '1' for double buffering.
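
    If I'm reading that right, the mismatch comes down to which buffers each API's field counts. A sketch of just the relevant struct fields (assuming a flip-model DXGI swap chain):

    D3DPRESENT_PARAMETERS pp = {};   // D3D9
    pp.BackBufferCount = 1;          // counts BACK buffers only, the front buffer is
                                     // implicit: 1 = double, 2 = triple buffering

    DXGI_SWAP_CHAIN_DESC sd = {};    // DXGI (D3D10/11/12)
    sd.BufferCount = 2;              // flip-model chains count all buffers in the
                                     // chain: 2 = double, 3 = triple buffering
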
    Join the Discord! There's a wealth of information there and people more knowledgeable than I.

    Also, you're not using the latest version; updating may of course help with compatibility. Again, grab it from #Installers in the Discord, or from the link I provided on the previous page.

    Also also:
    Yeah I missed that before. So what happens if you enable Special K's Frame Limiter and set it to 60? And with buffers set to 1, as above. And possibly also try with Waitable Swap Chain off and on.

    Also note that not all frame limiters are made equal -- Special K's, RTSS's, and the driver's own limiter all work slightly differently, and Special K's is likely the best of them. So if you're going to get a smooth ride with frame limiting, Special K will probably be able to do it, depending on which of its various features need toggling. The Wiki and Discord will help here.
     

  5. BmB23

    BmB23 Active Member

    Messages:
    93
    Likes Received:
    34
    GPU:
    GTX 1660 6GB
    Tiny Combat Arena is a DX11 game, as you can see, and setting 2 buffers does not work.

    In my experience no framerate limiter is enough by itself; only with vsync is true smoothness achievable. But triple buffering can only lock to the full refresh rate, and 1/2-rate sync is for whatever reason not smooth, which leaves double buffering.

    You can see that when the framerate is limited by vsync, the frametime graph as measured by software is wiggly and inconsistent, but the actual presentation is ultra-smooth. When the framerate limiter is active, frametimes are perfect by its own measurement, but the presentation is not smooth. So there is an error in how frametimes are measured in software, which is probably why framerate limiters are not able to be smooth.

    It is not needed; double buffering works in the game when vsync is forced by Special K, so the framerate is limited to 60 by pure vsync alone. It works despite NOT setting the buffer count to 2, which is a mystery. And it does NOT work in other DX11 games, despite EXPLICITLY setting their buffer counts to 2. Truly maddening. Something is going on with the driver or Windows.
     
    Last edited: Jun 14, 2022
  6. aufkrawall2

    aufkrawall2 Ancient Guru

    Messages:
    4,500
    Likes Received:
    1,875
    GPU:
    7800 XT Hellhound
    That's not it; the RTSS/AB graph for whatever reason just doesn't necessarily show how frames are presented in reality.
    I can see that in Heroes of the Storm with DXVK's graph: it shows lots of noticeable stutter that the RTSS graph doesn't. But that doesn't mean fps limiters produce garbage presentation intervals in general. I can see very stable VRR refresh rates in my monitor's refresh-rate OSD with the Nvidia driver limiter, and there is zero noticeable jumping when panning the camera etc. (e.g. in Strange Brigade). It depends on the game.
     
    DaRkL3AD3R likes this.
  7. Unwinder

    Unwinder Ancient Guru Staff Member

    Messages:
    17,194
    Likes Received:
    6,865
    That's not a case of not showing how frames are presented in reality. Proper presentation timings alone NEVER promise smooth animation; to be smooth, the presented frames must also contain proper content, meaning input must be sampled and the world must be simulated with no "stutters". Also, for those who confuse the present-to-present and start-render-to-start-render frametime measurement approaches, it is useful to peek at the context help for the "Frametime calculation point" option in RTSS.
     
    Last edited: Jun 14, 2022
  8. Unwinder

    Unwinder Ancient Guru Staff Member

    Messages:
    17,194
    Likes Received:
    6,865
    There is an error in your knowledge causing the error in your statements.
     
  9. aufkrawall2

    aufkrawall2 Ancient Guru

    Messages:
    4,500
    Likes Received:
    1,875
    GPU:
    7800 XT Hellhound
    Indeed, switching to "frame presentation" as the frametime calculation point makes the RTSS graph show the same stutter as the DXVK graph. It also seems to better reflect the actual frametime variance (which is mirrored by my monitor's VRR refresh-rate OSD) of both the Nvidia driver limiter and the RTSS fps limiter. Which makes me wonder whether this would be the better default setting?
    I probably got tricked when checking this in the past by not noticing that this option also needs to be changed in the individual .exe profile, not just the "global" template.
     
  10. BmB23

    BmB23 Active Member

    Messages:
    93
    Likes Received:
    34
    GPU:
    GTX 1660 6GB
    If the presentation is ultra-smooth, and the framerate is limited purely by vsync, which is driven by the video clock and should be a trustworthy signal, then the frametimes must be perfectly even. And if the frametime measurement shows a variance which does not exist in reality, then it is an error. This is not a personal attack; I see this with all software measurement and limiting tools.
     

  11. Unwinder

    Unwinder Ancient Guru Staff Member

    Messages:
    17,194
    Likes Received:
    6,865
    Learn what presentation is, what the rendering pipeline is, what pipeline input and output are, and what frametime means from a game engine's POV and where and how the engine is supposed to measure it. You see this "error" with all software because there is a fundamental misunderstanding in your knowledge. The variance does exist in reality; you just don't understand what exactly you're measuring.
    Frametime measured as a Present-to-Present delta (which is the case for the SpecialK "error" you tried to "prove" with your screenshot) is absolutely expected to jitter slightly when VSync is enabled, because the physical wait for the vertical blank event does not take place at the pipeline input; it takes place later, at the pipeline output or inside a blocking Present call (when the queue is full). Timings of Present calls on the game side are never expected to be distributed evenly. Frametime measured as a delta of EndPresent-to-EndPresent timestamps (which is equal to FrameStart-to-FrameStart) will look more "even" in this case.
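
    Roughly, the two measurement points look like this in a render loop (a simplified sketch, not RTSS's or SpecialK's actual hook code):

    #include <dxgi.h>
    #include <chrono>

    using Clock = std::chrono::steady_clock;
    static Clock::time_point g_lastPresentCall;  // previous Present() call timestamp
    static Clock::time_point g_lastFrameStart;   // previous EndPresent / frame-start timestamp

    void FrameWithBothMeasurements(IDXGISwapChain* swapChain)
    {
        // ... game has already sampled input, simulated and queued rendering ...

        auto presentCall = Clock::now();
        double presentToPresentMs =                       // jitters under vsync: Present
            std::chrono::duration<double, std::milli>(    // call timings are not evenly
                presentCall - g_lastPresentCall).count(); // spaced by the game
        g_lastPresentCall = presentCall;

        swapChain->Present(1, 0);   // the blocking vblank wait happens in here
                                    // (pipeline output), not at pipeline input

        auto frameStart = Clock::now();                   // EndPresent == start of the
        double frameStartToFrameStartMs =                 // next frame; this delta looks
            std::chrono::duration<double, std::milli>(    // more "even" because the wait
                frameStart - g_lastFrameStart).count();   // is folded into every frame
        g_lastFrameStart = frameStart;

        (void)presentToPresentMs; (void)frameStartToFrameStartMs;  // would be logged/graphed
    }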
     
  12. Unwinder

    Unwinder Ancient Guru Staff Member

    Messages:
    17,194
    Likes Received:
    6,865
    I don't feel it is a good idea to make it the default. Most game engines and their internal (and external latency-focused) framerate limiters use the difference between frame rendering start timestamps to define a frametime. That's the point where a game normally samples input and starts simulating the world for the next frame to be displayed. So seeing frametimes as a distribution of deltas between these time points reflects framepacing quality better than Present-to-Present frametimes do. When you enable latency-oriented framerate limiting modes (for example the Async/back edge sync modes in RTSS and low latency mode in SpecialK), it is this point's timing that is adjusted with artificial delays rather than the Present call point's timing, so frametimes calculated this way reflect framerate limiting quality better than Present-to-Present frametimes.
    Switching to the "frame presentation" based measurement mode only makes sense for comparing and understanding the nature of different implementations. It can also be used to validate framepacing quality when the scanline sync or front edge sync framerate limiting modes are enabled (both adjust Present call point timings).
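
    A simplified sketch of where the artificial delay goes in each case (the helper functions are placeholders; this is not the actual RTSS or SpecialK implementation):

    #include <chrono>

    void SampleInput();      // placeholder hooks standing in for whatever
    void SimulateWorld();    // the game / limiter actually does
    void RenderScene();
    void Present();
    void WaitUntil(std::chrono::steady_clock::time_point deadline);

    // Scanline sync / front edge sync style: the delay is inserted at the
    // Present call, so it's the OUTPUT timing that gets paced.
    void FramePacedAtPresent(std::chrono::steady_clock::time_point deadline)
    {
        SampleInput();
        SimulateWorld();
        RenderScene();
        WaitUntil(deadline);   // hold the Present call back to the target time
        Present();
    }

    // Latency-oriented style (async/back edge sync, low latency mode): the delay
    // is inserted at frame START, before input sampling, so input is read as late
    // as possible and the frame-start-to-frame-start delta is what gets paced.
    void FramePacedAtFrameStart(std::chrono::steady_clock::time_point deadline)
    {
        WaitUntil(deadline);   // delay the start of the whole frame
        SampleInput();
        SimulateWorld();
        RenderScene();
        Present();
    }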
     
    aufkrawall2 likes this.
  13. Sajittarius

    Sajittarius Master Guru

    Messages:
    490
    Likes Received:
    76
    GPU:
    Gigabyte RTX 4090
    You know what's better than all this? G-Sync/VRR.

    (I know not everyone can or wants to buy a more expensive monitor, but as someone who is extremely sensitive to vsync stuff, I will say it was money well spent, personally.)
     
  14. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,038
    Likes Received:
    7,379
    GPU:
    GTX 1080ti
    Not all vsync is implemented the same,
    and not all frame limiters are implemented the same.

    FFXIV's own 72 fps limit has a frametime variance of 7 ms, from 13.8 to 20.8 ms.
    RTSS's 72 fps limit holds a constant 13.8 ms.
     
  15. BmB23

    BmB23 Active Member

    Messages:
    93
    Likes Received:
    34
    GPU:
    GTX 1660 6GB
    The fact remains that this is the case with all software limiting tools, and is nothing unique to RTSS: none of them are ever perfectly smooth the way double buffering is. When the video is perfectly smooth, all framerate measurement tools show a variance which is not there. When the tools show even frametimes, the visuals are not smooth.
     

  16. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,038
    Likes Received:
    7,379
    GPU:
    GTX 1080ti
    The variance is always there, because framerates are not solid integers.
     
  17. Unwinder

    Unwinder Ancient Guru Staff Member

    Messages:
    17,194
    Likes Received:
    6,865
    That's not the root reason for his wrong claims. The topic starter misunderstands the fundamental principles of presentation, sees no difference between rendering pipeline input and output, and seems to assume that presentation takes place immediately, which is never the case. Presentation is an asynchronous process: there is always an unpredictable and variable time delta between the moment the 3D application calls Present and the moment the new frame is actually displayed. The frame is not even ready to be displayed when Present is called by the 3D application; the 3D API and driver just save the command buffer contents, put them in a queue, and start rendering them when the GPU is ready to do so (which may also be delayed by finishing rendering of previously queued frames). Frametimes measured by any 3D application or in-game benchmark (or by the "erroneous" monitoring tools) show timings related to the pipeline input, and those timings are correct. Once he learns these basics, he'll understand that.
     
    BlindBison and Cave Waverider like this.
  18. BmB23

    BmB23 Active Member

    Messages:
    93
    Likes Received:
    34
    GPU:
    GTX 1660 6GB
    If there were a variance in frametime due to small differences in synchronization delays, it would show up as animation errors through the delta-time measurement, giving an unsmooth result, which is not the case. Typical game loops measure delta time from the beginning of one frame to the beginning of the next, which would typically be right after returning from the Present call; that is not identical to the common English term "presentation". So whatever variance is due to waiting on synchronization is included in this measurement, and it does not matter as long as the actual frametime is consistent, which vsync will ensure.
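
    Roughly the loop shape I mean (a generic sketch, no particular engine; GameIsRunning, UpdateSimulation and RenderScene are placeholders):

    #include <dxgi.h>
    #include <chrono>

    bool GameIsRunning();
    void UpdateSimulation(float dt);
    void RenderScene();

    void RunGameLoop(IDXGISwapChain* swapChain)
    {
        using Clock = std::chrono::steady_clock;
        auto lastFrameStart = Clock::now();

        while (GameIsRunning())
        {
            auto frameStart = Clock::now();   // right after the previous Present returned
            float dt = std::chrono::duration<float>(frameStart - lastFrameStart).count();
            lastFrameStart = frameStart;

            UpdateSimulation(dt);             // animation advances by the measured dt
            RenderScene();
            swapChain->Present(1, 0);         // any blocking vsync wait here lands in the
                                              // NEXT frame's dt, so dt stays ~ 1/refresh
        }
    }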

    What these software tools do is insert themselves somewhere with a hook, do their own measurement, and add their own delay based on that measurement. I don't know exactly how this works, and it's probably different from tool to tool; as you point out, RTSS even has different modes to configure. But I do know that I never see correct results from them, so there is an error in how they measure, which leads to errors in how long they wait, which leads to unsmooth presentation (again, not meant in the sense of a call to the Present API).

    Not that games are always perfect; some implementations just are not smooth, likely due to errors in the game loop itself or some bad implementation of framerate limiting with sleeping, as is the case in the Unity engine. But this can usually be overcome by unlocking the framerate and forcing vsync, which leads to the topic of this thread: double buffering is the only smooth way to do fractional vsync, such as 30 fps on a 60 Hz monitor, or 40 fps and 60 fps on a 120 Hz monitor, and double buffering cannot easily be enabled on recent drivers and Windows due to some unknown override of how vsync works.
     
  19. Unwinder

    Unwinder Ancient Guru Staff Member

    Messages:
    17,194
    Likes Received:
    6,865
    No need to tell me what these tools do. Learn the basics, then come back.
     
