pre-rendered frames in nvidia drivers?

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by alexander1986, Sep 12, 2016.

  1. Thanks for the links. I will test the games where I change the maximum pre-rendered frames setting, to rule out placebo. I tried using GPUView's cmd logging in the past, but I need to read up more on how to decipher all the information being recorded (30 seconds to a minute of GPUView capture can produce massive log files, because it records everything going on with your machine performance-wise).
     
  2. aufkrawall2

    aufkrawall2 Ancient Guru

    Messages:
    4,363
    Likes Received:
    1,822
    GPU:
    7800 XT Hellhound
    I once tested BF4 MP in 720p to see if there is a noteworthy difference between a forced pre-render limit of 1 and app-controlled in a CPU limit, and min fps hardly changed at all.
    When Heroes of the Storm was still DX9, setting the pre-render limit to 1 cost ~10% in GPU-limited scenarios (not CPU-limited). I've never seen such a difference in DX11 games.
     
  3. I have not done an analysis with Process Monitor yet, but I did use GPUView (with verbose logging to capture everything) to look at the frame-time analysis, and I think that without vsync enabled the difference between the default, 1 max pre-rendered frame ahead, and 6 max pre-rendered frames ahead is only 1-3 ms (milliseconds). In my case, running AC4:BF, the default (3) seems to offer the best performance, 6 is within 1 ms of it, and 1 frame ahead offers the worst performance (my CPU is probably the reason). If I missed something, please feel free to offer corrections, since I'm not an expert with GPUView.

    [GPUView screenshots: default, 1, and 6 max pre-rendered frames]
     
  4. alexander1986

    alexander1986 Master Guru

    Messages:
    201
    Likes Received:
    21
    GPU:
    RTX 2060


    Couldn't this vary greatly from game to game, depending on how their engine works and so on? At least that's what I've heard :S
     

  5. theahae

    theahae Active Member

    Messages:
    67
    Likes Received:
    12
    GPU:
    GTX 1060
    "Values from 1-8, the default is 3 by the driver and I would not recommend higher than that. A value of 1 or 2 will reduce Input Latency further at the cost of slightly higher CPU performance cost"
    if i use a value of 1 in all games, i get less fps compared to the default value of 3
     
  6. Ribix

    Ribix Guest

    Messages:
    4
    Likes Received:
    0
    GPU:
    Geforce GTX 560 Ti 1GB
    You have got this wrong. The present packet (orange) marks the end of a single frame, and the black packets are the information the GPU needs to render that frame. So your first image shows one complete frame in the CPU queue and the beginning of a new frame near the end. When the orange packet is removed from the queue is when the GPU would have flipped the front and back buffers and the frame would have been output to the monitor.

    If you are CPU-bottlenecked, the CPU queue cannot be filled, so maximum pre-rendered frames will have almost no effect on input lag. It has the biggest effect on input lag when you are either GPU-bottlenecked or running with vsync on.
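    To make the CPU-bound vs. GPU-bound point concrete, here is a minimal toy model of a bounded render-ahead queue in C++ (a sketch only - it is not how the driver actually works, and all frame times are made up). It just shows that latency grows with the pre-render limit when the GPU is the bottleneck and barely changes when the CPU is:

        // Toy model of a render-ahead queue limited to `maxQueued` frames.
        // Assumption: the CPU blocks before starting a frame while the queue is full.
        #include <algorithm>
        #include <cstdio>
        #include <vector>

        // Average time from the CPU starting a frame to the GPU finishing it (ms).
        static double avgLatencyMs(double cpuMs, double gpuMs, int maxQueued, int frames = 1000)
        {
            double cpuFree = 0.0, gpuFree = 0.0, total = 0.0;
            std::vector<double> gpuDone(frames, 0.0);   // completion time per frame
            for (int i = 0; i < frames; ++i) {
                double start = cpuFree;
                if (i >= maxQueued)                     // queue full -> CPU waits
                    start = std::max(start, gpuDone[i - maxQueued]);
                double submitted = start + cpuMs;       // CPU work done, frame queued
                double done = std::max(submitted, gpuFree) + gpuMs;
                gpuDone[i] = done;
                gpuFree = done;
                cpuFree = submitted;
                total += done - start;
            }
            return total / frames;
        }

        int main()
        {
            const int limits[] = {1, 3, 6};
            for (int limit : limits) {
                // GPU-bound: 5 ms CPU / 16 ms GPU. CPU-bound: 16 ms CPU / 5 ms GPU.
                std::printf("limit %d: GPU-bound %.1f ms, CPU-bound %.1f ms\n",
                            limit, avgLatencyMs(5.0, 16.0, limit), avgLatencyMs(16.0, 5.0, limit));
            }
            return 0;
        }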
     
  7. aufkrawall2

    aufkrawall2 Ancient Guru

    Messages:
    4,363
    Likes Received:
    1,822
    GPU:
    7800 XT Hellhound
    Please post concrete numbers and how to reproduce. Everything else is worthless.
     
    BlindBison likes this.
  8. theahae

    theahae Active Member

    Messages:
    67
    Likes Received:
    12
    GPU:
    GTX 1060
    Alright, I tried it in Oblivion with a 144 fps cap.
    With the default setting, it easily holds a steady 144 fps in the Bravil town square. But with pre-rendered frames set to 1, the game's fps becomes unstable, dropping to 100-130.
    With weaker CPUs the default value (or higher, if you can live with the increased input delay) is better; with stronger CPUs it's a different story.
     
  9. I rushed those results and have since read the GPUView manual further. You are correct that the orange present packet corresponds to a single frame and that the black DMA packets are associated with building that frame. The CPU context queue and the hardware 3D queue seem to be combined into an entry called the device context.

    The point I'm trying to make is that I believe adjusting maximum pre-rendered frames on a game-by-game basis can help increase GPU utilization: keeping the CPU context queue full makes sure the GPU is not waiting on the CPU, so it stays at around 100% usage (being GPU-bottlenecked is the best case).
     
  10. Sabbath

    Sabbath Maha Guru

    Messages:
    1,213
    Likes Received:
    350
    GPU:
    RTX 2080 Super
    Pre-rendered frames in the NVIDIA drivers = 2: played Doom (4) all day and everything was silky smooth. For me, anyway.
     

  11. aufkrawall2

    aufkrawall2 Ancient Guru

    Messages:
    4,363
    Likes Received:
    1,822
    GPU:
    7800 XT Hellhound
    Interesting. Could you test something with DX11 as well? I may give Far Cry Primal a shot.
     
  12. theahae

    theahae Active Member

    Messages:
    67
    Likes Received:
    12
    GPU:
    GTX 1060
    So I ran a benchmark in CS:GO comparing the results of maximum pre-rendered frames 1 and 3.

    Maximum pre-rendered frames 3:
    ==========================================================================
    # FPS Benchmark v1.01 - 01:43:140
    ==========================================================================
    - Test Results Below: Average

    Average framerate: 269.83

    Maximum pre-rendered frames 1:
    ==========================================================================
    # FPS Benchmark v1.01 - 01:43:140
    ==========================================================================
    - Test Results Below:

    Average framerate: 274.52

    tl;dr: the result above invalidates my theory, and I got a nice ~4 fps increase.

    The benchmark is here
     
    Last edited: Sep 17, 2016
  13. -Tj-

    -Tj- Ancient Guru

    Messages:
    18,097
    Likes Received:
    2,603
    GPU:
    3080TI iChill Black
    Some games work best with app-controlled, some with either 2 or 1. If 1 causes fps drops, then use 2 if app-controlled feels weird (inconsistent).

    I remember one issue with Deus Ex: HR: the game would crash unless it was set to app-controlled. I think they fixed it later, but it still has vsync input lag, and a 60 fps limit isn't enough.

    And as a few have said, in most cases a 60 fps limit + 60 Hz vsync and 2 frames is "ideal".

    I tested RoTR in DX12 once and noticed 2 or 3 gave the best min and avg fps..
    http://forums.guru3d.com/showpost.php?p=5244721&postcount=1020
     
    Last edited: Sep 18, 2016
  14. RealNC

    RealNC Ancient Guru

    Messages:
    4,959
    Likes Received:
    3,235
    GPU:
    4070 Ti Super
    This setting is only useful when using vsync. I keep it at 1 globally and have never observed any ill effects in any game I have ever played.

    In games where I can't maintain 60FPS, setting it to 1 gives an "even stutter". Setting it to 3 or auto gives "uneven stutter", where the game is smooth for 0.3 seconds, then a big stutter, then smooth for another split second, then a big stutter again.

    Both are bad and I can't see why the latter would be preferable to the former. So I'd recommend setting it to 1 globally to reduce input lag when using vsync. In combination with a 60FPS frame rate cap (using inspector frame limiter v2), this gives the best possible experience when using vsync.
     
    Last edited: Sep 19, 2016
  15. Mda400

    Mda400 Maha Guru

    Messages:
    1,089
    Likes Received:
    200
    GPU:
    4070Ti 3GHz/24GHz
    The driver's pre-rendered frames setting will take precedence over the application's pre-rendered frames setting. They are two different settings, so in my experience (since BF:BC2) it's best to keep the application's limit untouched and adjust only the control panel setting (see the sketch at the end of this post).

    And I can tell you are not fully realizing what the pre-rendered frames setting does if you claim you lost GPU performance in a non-CPU-limited scenario... That's more the application's fault, not the pre-render setting's, which controls how many frames the CPU can prepare before they go out to the GPU (to keep the GPU from waiting on the CPU, in exchange for latency when input ends up waiting on the CPU longer than usual).

    The only way I know of to lose GPU performance in a non-CPU-limited scenario is if you flood the graphics driver and API with too many calls from the CPU (for example, setting imax_threads greater than ~3 in Serious Sam 3: BFE, as it uses a multithreaded renderer).
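    For reference, in D3D11/DXGI titles the application-side setting referred to above is DXGI's "maximum frame latency". A minimal sketch of how a game would set it (assuming an already-created ID3D11Device, with error handling trimmed); how the control panel value interacts with it is driver behavior, as described above:

        // Sketch: setting the application-side pre-render limit via DXGI.
        // Assumes `device` was created elsewhere; error handling is trimmed.
        #include <d3d11.h>
        #include <dxgi.h>

        void SetAppFrameLatency(ID3D11Device* device, UINT frames)
        {
            IDXGIDevice1* dxgiDevice = nullptr;
            if (SUCCEEDED(device->QueryInterface(__uuidof(IDXGIDevice1),
                                                 reinterpret_cast<void**>(&dxgiDevice)))) {
                // Valid range is 1-16; a value of 0 resets it to the default of 3,
                // matching the driver default quoted earlier in the thread.
                dxgiDevice->SetMaximumFrameLatency(frames);
                dxgiDevice->Release();
            }
        }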
     
    Last edited: Sep 20, 2016

  16. aufkrawall2

    aufkrawall2 Ancient Guru

    Messages:
    4,363
    Likes Received:
    1,822
    GPU:
    7800 XT Hellhound
    Who said anything else? Why are you quoting me on this?

    I described what once reproducibly happened in reality; your reproach is just an inappropriate ****-move.

    Is it my fault that you didn't test the same game as me?
     
  17. AsiJu

    AsiJu Ancient Guru

    Messages:
    8,811
    Likes Received:
    3,369
    GPU:
    KFA2 4070Ti EXG.v2
    Prerender limit = 1
    Vsync = On (or Adaptive)
    Triple Buffering = On
    (FPS cap = Refresh Rate)

    => the best possible visual experience with the smallest input lag, I'd say. Of course vsync can be turned off too, but that will cause tearing even with fps capped.

    Why cap FPS with vsync, you ask? Well, even when the game is vsynced, more frames can be rendered internally (if triple buffering works correctly), so capping can further alleviate input lag, as no frames need to be discarded.

    Worth noting, though, that true triple buffering is almost never used anymore. What games often refer to as "triple buffering" is actually a double buffer plus a pre-render queue.
    However, the pre-render limit is just as meaningful with this implementation, as it affects the pre-render queue directly.

    Unless I'm very much mistaken.
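    For the "FPS cap = Refresh Rate" part of the recipe, an in-engine cap is conceptually just a sleep-until loop; external limiters such as the Inspector limiter mentioned earlier work differently under the hood, so this only shows the basic idea. A minimal sketch assuming a 60 Hz display, with update()/render()/present() as hypothetical placeholders for the real game work:

        // Sketch: capping the frame rate at ~60 fps with a fixed per-frame budget.
        // update()/render()/present() are placeholders for the real game work.
        #include <chrono>
        #include <thread>

        int main()
        {
            using clock = std::chrono::steady_clock;
            const auto frameBudget = std::chrono::microseconds(16667);   // ~1/60 s
            auto nextDeadline = clock::now() + frameBudget;

            for (int frame = 0; frame < 600; ++frame) {
                // update(); render(); present();          // hypothetical game work
                std::this_thread::sleep_until(nextDeadline);  // wait out the budget
                nextDeadline += frameBudget;
            }
            return 0;
        }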
     
