FrameView 1.4 added PC latency (PCL) stats: https://www.nvidia.com/en-us/geforce/technologies/frameview/release-notes/

How this latency is calculated is not described, but NVIDIA previously posted a picture like this: https://www.nvidia.com/content/dam/...reflex-end-to-end-systeme-latency-pipline.png

To find out which part of the latency chain I'm measuring, I tried to reverse-calculate the value from the FrameView output. Here are the first five frames of a benchmark in Overwatch: https://i.imgur.com/TloxSWr.png

The fifth frame has a PCL of 3.962ms.

PresentMon (https://github.com/GameTechDev/PresentMon) gives a formula for latency if you scroll down the page:

"msInputLatency = msBetweenPresents + msUntilDisplayed - previous(msInPresentAPI)"

This formula should cover the bulk of PC latency: https://i.imgur.com/f9mYmTj.png

Note that there is no composition latency in the example (MsRenderPresentLatency and MsUntilDisplayed are the same). Also note that there is less than one frame queued at any time (Render Queue Depth is less than 1).

Applying the PresentMon formula to frame 5, I get:

msInputLatency = 1.514 + 0.348 - 0.256 = 1.606ms

There is quite a gap between the PCL of 3.962ms and the PresentMon result of 1.606ms. The part of PCL not covered by the formula is "USB SW". The time from the USB interrupt firing to csrss and dwm finishing input processing is always less than 0.1ms (confirmed with an ETL trace). The only variable left is the time between dwm finishing input processing and the time sampling starts.

Let's assume the worst-case scenario: after frame 3 is presented, sampling starts, but right then dwm finishes processing the input, which barely misses the train and has to wait for the next cycle. The input eventually gets sampled while the GPU is already rendering frame 4. Finally, the input is reflected in frame 5.

Formula:

MsBetweenPresents(frame4) - MsInPresent(frame3) + MsBetweenPresents(frame5) + MsUntilDisplayed(frame5)

1.708 - 0.262 + 1.514 + 0.348 = 3.308ms

For fun, assume PCL starts at the time of the interrupt:

3.308 + 0.1 = 3.408ms

3.408ms vs 3.962ms (PCL) might not seem like a big difference, but this was the worst-case scenario and the fps is very high, so the frames themselves are short. Where is the additional latency coming from? Is the game sampling the input and then caching the result instead of using it for simulation immediately? Did I miss something obvious?

Thanks for reading, and sorry for the long post.
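
For anyone who wants to rerun the arithmetic above, here is a minimal Python sketch of both estimates. The dictionary layout and the function names are made up for illustration and are not part of FrameView or PresentMon; the numbers are the ones quoted in this post. Treating 0.256 as frame 4's MsInPresentAPI is my reading of "previous(msInPresentAPI)" when the formula is applied to frame 5.

```python
# Per-frame values in ms, as quoted in the post (keys are hypothetical names).
frames = {
    3: {"in_present": 0.262},
    # Assumption: 0.256 is frame 4's MsInPresentAPI, i.e. the "previous" frame
    # relative to frame 5 in the PresentMon formula.
    4: {"between_presents": 1.708, "in_present": 0.256},
    5: {"between_presents": 1.514, "until_displayed": 0.348},
}

PCL_FRAME5_MS = 3.962          # PC latency reported by FrameView for frame 5
INPUT_PROCESSING_MAX_MS = 0.1  # upper bound for USB interrupt -> csrss/dwm done


def presentmon_input_latency(cur, prev):
    """msBetweenPresents + msUntilDisplayed - previous(msInPresentAPI)."""
    return cur["between_presents"] + cur["until_displayed"] - prev["in_present"]


def worst_case_latency(f3, f4, f5):
    """Input barely misses sampling after frame 3 and is shown with frame 5."""
    return (
        f4["between_presents"]
        - f3["in_present"]
        + f5["between_presents"]
        + f5["until_displayed"]
    )


estimate = presentmon_input_latency(frames[5], frames[4])
worst = worst_case_latency(frames[3], frames[4], frames[5])
worst_with_usb = worst + INPUT_PROCESSING_MAX_MS
unexplained = PCL_FRAME5_MS - worst_with_usb

print(f"PresentMon estimate:      {estimate:.3f} ms")
print(f"Worst case:               {worst:.3f} ms")
print(f"Worst case + USB SW:      {worst_with_usb:.3f} ms")
print(f"Unexplained vs. PCL:      {unexplained:.3f} ms")
```

With these inputs the worst case plus the 0.1ms input-processing bound still leaves roughly 0.55ms of the reported PCL unaccounted for, which is the gap the question is about.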

I'd use LatencyMon if I were you. It's driver based and gives a good overall idea of the latency in your system. If the results you get in it are low enough (less than 150), then your system is OK.