# What are FrameView's PC latency stats measuring?

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by klunka, Dec 3, 2022.

1. ### klunka (Member)

Messages:
34
6
GPU:
1080ti / 11gb
FrameView 1.4 added PC latency (PCL) stats:
https://www.nvidia.com/en-us/geforce/technologies/frameview/release-notes/

How this latency is calculated is not described; however, NVIDIA previously posted a diagram like this:
https://www.nvidia.com/content/dam/...reflex-end-to-end-systeme-latency-pipline.png

To find out which part of the latency chain I'm measuring, I tried to reverse-calculate the value from the FrameView output.

Here are the first five frames of a benchmark in Overwatch:
https://i.imgur.com/TloxSWr.png

The fifth frame has a PCL of 3.962 ms.
https://github.com/GameTechDev/PresentMon
Scroll down on that page and there is a formula for latency: "msInputLatency = msBetweenPresents + msUntilDisplayed - previous(msInPresentAPI)"
This formula should cover the bulk of PC latency:
https://i.imgur.com/f9mYmTj.png

Note that there is no composition latency in the example (MsRenderPresentLatency and MsUntilDisplayed are the same).
Also note that less than one frame is queued at any time (Render Queue Depth is less than 1).
Applying the PresentMon formula to frame 5, I get: msInputLatency = 1.514 + 0.348 - 0.256 = 1.606 ms
There is quite a gap between the PCL of 3.962 ms and the PresentMon result of 1.606 ms.
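
For anyone who wants to redo the arithmetic above, here is a minimal Python sketch. The function name and argument names are made up for illustration; only the formula and the three frame values come from the post, and the sketch does not read a real FrameView or PresentMon log.

```python
def presentmon_input_latency(ms_between_presents: float,
                             ms_until_displayed: float,
                             prev_ms_in_present_api: float) -> float:
    """PresentMon README formula:
    msInputLatency = msBetweenPresents + msUntilDisplayed - previous(msInPresentAPI)
    """
    return ms_between_presents + ms_until_displayed - prev_ms_in_present_api


# Values for frame 5, copied by hand from the post (all in milliseconds).
frame5_latency = presentmon_input_latency(
    ms_between_presents=1.514,      # MsBetweenPresents of frame 5
    ms_until_displayed=0.348,       # MsUntilDisplayed of frame 5
    prev_ms_in_present_api=0.256,   # MsInPresentAPI of the previous frame
)

reported_pcl = 3.962  # PCL reported by FrameView for frame 5
gap = reported_pcl - frame5_latency

print(f"PresentMon formula: {frame5_latency:.3f} ms")
print(f"Gap to reported PCL: {gap:.3f} ms")
```

With these inputs the formula accounts for well under half of the reported PCL, which is the gap the rest of the post tries to explain.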

The part of PCL not covered by that formula is "USB SW".
The time from the USB interrupt firing to csrss and dwm finishing their input processing is always less than 0.1 ms (confirmed with an ETL trace).

The only variable left is the time between dwm finishing input processing and the start of input sampling.
Let's assume the worst-case scenario:
After frame 3 is presented, sampling starts, but right then dwm finishes processing the input, which barely misses the train and has to wait for the next cycle.
The input eventually gets sampled while the GPU is already rendering frame 4.
Finally, the input is reflected in frame 5.
Formula: MsBetweenPresents(frame4) - MsInPresent(frame3) + MsBetweenPresents(frame5) + MsUntilDisplayed(frame5)
1.708 - 0.262 + 1.514 + 0.348 = 3.308 ms

For fun, assume PCL starts at the time of the interrupt: 3.308 + 0.1 = 3.408 ms
3.408 ms vs. 3.962 ms (PCL) might not seem like a big difference, but this was the worst-case scenario and the fps is very high.
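
The worst-case estimate can be checked the same way. This is only a sketch of the arithmetic in this post: the function name, argument names, and the 0.1 ms "USB SW" upper bound are taken from or invented for this post, and none of it is FrameView's actual PCL calculation, which is not documented.

```python
def worst_case_latency(between_presents_f4: float,
                       in_present_f3: float,
                       between_presents_f5: float,
                       until_displayed_f5: float) -> float:
    """Worst case from the post: the input just misses sampling after frame 3
    is presented, is sampled during frame 4, and is reflected in frame 5."""
    return (between_presents_f4 - in_present_f3
            + between_presents_f5 + until_displayed_f5)


# Values copied by hand from the post (all in milliseconds).
worst_case = worst_case_latency(
    between_presents_f4=1.708,   # MsBetweenPresents(frame4)
    in_present_f3=0.262,         # MsInPresent(frame3)
    between_presents_f5=1.514,   # MsBetweenPresents(frame5)
    until_displayed_f5=0.348,    # MsUntilDisplayed(frame5)
)

usb_sw_upper_bound = 0.1  # assumed upper bound for interrupt -> csrss/dwm done
estimate = worst_case + usb_sw_upper_bound

reported_pcl = 3.962  # PCL reported by FrameView for frame 5
unexplained = reported_pcl - estimate

print(f"Worst case without USB SW: {worst_case:.3f} ms")
print(f"Worst case with USB SW:    {estimate:.3f} ms")
print(f"Still unexplained:         {unexplained:.3f} ms")
```

Even with every assumption pushed to the worst case, roughly half a millisecond of the reported PCL is left over.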

Where is the additional latency coming from?
Is the game sampling the input and then caching the result, instead of using it for simulation immediately?
Did I miss something obvious?
Thanks for reading, and sorry for the long post.

2. ### Smough (Master Guru)

Messages:
937
282
GPU:
GTX 1660
I'd use LatencyMon if I were you. It gives a good overall idea of the driver-level latency in your system; if the results you get there are low enough (less than 150), then your system is OK.

Messages:
34