What are FrameView's PC latency stats measuring?

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by klunka, Dec 3, 2022.

  1. klunka

    klunka Member

    Messages:
    34
    Likes Received:
    6
    GPU:
    1080ti / 11gb
    FrameView 1.4 added PC latency (PCL) stats:
    https://www.nvidia.com/en-us/geforce/technologies/frameview/release-notes/

    How this latency is calculated is not described; however, they previously posted a picture like this:
    https://www.nvidia.com/content/dam/...reflex-end-to-end-systeme-latency-pipline.png

    To find out which part of the latency chain I'm measuring, I tried to reverse-calculate the value from the FrameView output.

    Here are the first five frames of a benchmark in Overwatch:
    https://i.imgur.com/TloxSWr.png

    The fifth frame has a PCL of 3.962 ms.
    https://github.com/GameTechDev/PresentMon
    Scroll down; there is a formula for latency: "msInputLatency = msBetweenPresents + msUntilDisplayed - previous(msInPresentAPI)"
    This formula should cover the bulk of PC latency:
    https://i.imgur.com/f9mYmTj.png

    Note that there is no composition latency in the example (MsRenderPresentLatency and MsUntilDisplayed are the same).
    Also note that less than one frame is queued at any time (Render Queue Depth is less than 1).
    Applying the PresentMon formula to frame 5, I get: msInputLatency = 1.514 + 0.348 - 0.256 = 1.606 ms
    There is quite a gap between the PCL of 3.962 ms and the PresentMon result of 1.606 ms.
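
    The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the quoted PresentMon formula, not anything FrameView itself is known to do; the function and argument names are made up for illustration, and the numbers are the frame 5 values quoted in this post.

```python
def ms_input_latency(ms_between_presents, ms_until_displayed, prev_ms_in_present_api):
    """PresentMon README formula:
    msInputLatency = msBetweenPresents + msUntilDisplayed - previous(msInPresentAPI)
    """
    return ms_between_presents + ms_until_displayed - prev_ms_in_present_api


# Frame 5 values from the post (milliseconds).
presentmon_latency = ms_input_latency(
    ms_between_presents=1.514,
    ms_until_displayed=0.348,
    prev_ms_in_present_api=0.256,
)

REPORTED_PCL = 3.962  # PCL reported by FrameView for the same frame

print(f"PresentMon estimate: {presentmon_latency:.3f} ms")
print(f"Gap to reported PCL: {REPORTED_PCL - presentmon_latency:.3f} ms")
```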

    The part of PCL not covered by this formula is "USB SW".
    The time from when the USB interrupt fires to when csrss and dwm have finished processing the input is always less than 0.1 ms (confirmed with an ETL trace).

    The only variable left is the time between when dwm finishes processing the input and when the game samples it.
    Let's assume the worst-case scenario:
    After frame 3 is presented, sampling starts, but dwm finishes processing the input just after that, so the input barely misses the train and has to wait for the next cycle.
    The input eventually gets sampled while the GPU is already rendering frame 4.
    Finally, the input is reflected in frame 5.
    Formula: MsBetweenPresents(frame4) - MsInPresent(frame3) + MsBetweenPresents(frame5) + MsUntilDisplayed(frame5)
    1.708 - 0.262 + 1.514 + 0.348 = 3.308 ms

    For fun, assume PCL starts at the time of the interrupt: 3.308 + 0.1 = 3.408 ms
    3.408 ms vs. 3.962 ms (PCL) might not seem like a big difference, but this was the worst-case scenario and the fps is very high.
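
    For anyone who wants to check the worst-case numbers, here is a minimal Python sketch of the same calculation. It only encodes the assumption made above (the input misses the sampling for frame 4 and is first shown in frame 5); the names are hypothetical, and the 0.1 ms USB SW figure is the upper bound from the ETL trace, not a measured value per frame.

```python
def worst_case_latency(next_between_presents, prev_in_present,
                       shown_between_presents, shown_until_displayed):
    """Worst case assumed in the post: the input arrives right after sampling
    started, waits out the rest of that frame, and appears one frame later."""
    wait_for_next_sample = next_between_presents - prev_in_present
    return wait_for_next_sample + shown_between_presents + shown_until_displayed


# Values quoted in the post (milliseconds).
worst_case = worst_case_latency(
    next_between_presents=1.708,   # MsBetweenPresents(frame4)
    prev_in_present=0.262,         # MsInPresent(frame3)
    shown_between_presents=1.514,  # MsBetweenPresents(frame5)
    shown_until_displayed=0.348,   # MsUntilDisplayed(frame5)
)

USB_SW_UPPER_BOUND = 0.1  # interrupt -> csrss/dwm done, upper bound from the trace
REPORTED_PCL = 3.962      # PCL reported by FrameView for frame 5

with_usb = worst_case + USB_SW_UPPER_BOUND
unexplained = REPORTED_PCL - with_usb

print(f"Worst case:          {worst_case:.3f} ms")
print(f"Worst case + USB SW: {with_usb:.3f} ms")
print(f"Still unexplained:   {unexplained:.3f} ms")
```

    So even with every assumption stacked in PCL's favor, roughly half a millisecond remains unaccounted for.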

    Where is the additional latency coming from?
    Is the game sampling the input and then caching the result, instead of using it for simulation immediately?
    Did I miss something obvious?
    Thanks for reading, and sorry for the long post.
     
  2. Smough

    Smough Master Guru

    Messages:
    937
    Likes Received:
    282
    GPU:
    GTX 1660
    I'd use LatencyMon if I were you. It gives a good overall idea of the driver-level latency in your system; if the results you get in it are low enough (less than 150), then your system is OK.
     
  3. klunka

    klunka Member

    Messages:
    34
    Likes Received:
    6
    GPU:
    1080ti / 11gb
    Thanks, but that's not what I'm looking for.
     
