Unwinder
Moderator
 
10-29-2013, 11:16 | posts: 11,516 | Location: Taganrog, Russia

I’ve read a lot of reviews of NVIDIA ShadowPlay over the last couple of days, and many of them show the same typical misunderstandings, so I decided to write this short summary to give you a better understanding, guys.
Most users (and, surprisingly, reviewers) assume that the main performance hit of any video capture software comes from realtime video encoding. That is a quite typical and mostly wrong assumption. Modern video capture applications are multithreaded and modern CPUs have multiple cores able to process multiple tasks in parallel, so realtime video encoding is normally performed in background thread(s), often with lower priority. That is why, when you’re capturing video in a game that is not using close to 100% of CPU time on all CPU cores, background video encoding has enough system resources to run in parallel and hardly affects game performance. In other words, having a fast hardware encoder like NVENC or Intel QuickSync simply allows you to encode better quality video in realtime, at a higher encoding framerate or with a more efficient compression ratio, but it doesn’t seriously reduce the capture related performance hit like many reviewers say.
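To illustrate the background encoding idea, here is a minimal Python sketch of the producer/consumer structure. It is not how any real capture tool is implemented: `capture_with_background_encoder` and `max_pending` are hypothetical names, and `zlib.compress` is only a stand-in for a real video encoder. Lowering the worker's priority is left out, because Python's `threading` module has no portable way to do it.

```python
import queue
import threading
import zlib


def capture_with_background_encoder(frames, max_pending=4):
    """Hand raw frames to a background encoder thread and return the results.

    zlib.compress is only a placeholder for a real video encoder.
    """
    pending = queue.Queue(maxsize=max_pending)  # bounded hand-off buffer
    encoded = []

    def encoder_worker():
        # Runs in parallel with the "game" thread that produces frames.
        while True:
            frame = pending.get()
            if frame is None:  # sentinel: no more frames
                break
            encoded.append(zlib.compress(frame))

    worker = threading.Thread(target=encoder_worker, daemon=True)
    worker.start()

    for frame in frames:
        # The producer blocks here only if the encoder has fallen behind
        # by max_pending frames; otherwise it continues immediately.
        pending.put(frame)

    pending.put(None)
    worker.join()
    return encoded
```

The bounded queue is the important design choice: as long as the encoder keeps up, the thread that produces frames never waits for compression to finish.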
The main and the real performance penalty for a video capture application is … the simple process of capturing frames from a DirectX application. The DirectX rendering pipeline is asynchronous, so multiple frames are being processed in parallel by different stages of the pipeline. And that’s where our performance penalty comes from: the DirectX architecture is simply not built with the idea of efficient frame readback from the very end of the rendering pipeline. So each frame capture via DirectX causes the rendering pipeline to be flushed, the CPU may stall until the GPU finishes flushing the pipeline and provides the captured frame back to the CPU, and so on. This means that each simple frame capture iteration hurts the efficiency of prerendering, may seriously reduce the performance of multi-GPU systems and so on.
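A toy timing model makes the cost of the flush easier to see. This is a deliberately simplified sketch, not a measurement: the function names and the millisecond values are made up for illustration, and it assumes that a fully pipelined frame costs only as much as the slower of the CPU and GPU stages, while a flushed frame pays for both stages plus the readback, one after another.

```python
def frame_time_pipelined(cpu_ms, gpu_ms):
    """CPU and GPU work on different frames at once: the slower stage sets the pace."""
    return max(cpu_ms, gpu_ms)


def frame_time_with_flush(cpu_ms, gpu_ms, readback_ms):
    """A capture flushes the pipeline: CPU waits for the GPU, then for the readback."""
    return cpu_ms + gpu_ms + readback_ms


def fps(frame_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms


# Illustrative numbers only (hypothetical, not measured on any hardware).
cpu_ms, gpu_ms, readback_ms = 8, 10, 2
print(fps(frame_time_pipelined(cpu_ms, gpu_ms)))                # 100.0
print(fps(frame_time_with_flush(cpu_ms, gpu_ms, readback_ms)))  # 50.0
```

In this model the framerate is halved without any encoding work at all, which is the point: the loss comes from serializing the CPU and GPU, not from compression.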
And that’s where the NVIDIA ShadowPlay magic begins. I’m very surprised to see that all reviewers focus only on Kepler’s NVENC hardware H.264 encoder without even mentioning Kepler’s innovative and unique NVFBC (Frame Buffer Capture) and NVIFR (Inband Frame Readback), which also help NVIDIA ShadowPlay a LOT. Those two are probably even more important video capture oriented hardware technologies introduced in Kepler GPUs. Both are promoted by NVIDIA as ultra fast, low-latency, GPU/DMA accelerated framebuffer capture techniques, and both are targeted at cloud gaming / GRID systems (you can read more in this presentation). It is a very, very “light” kind of NVIDIA’s own “Mantle” in the frame capture area: NVFBC and NVIFR are available via NVAPI and give developers a very efficient way to capture frames via direct access to the hardware, bypassing the limitations of the DirectX API.
So ShadowPlay’s success is built on two key hardware technologies: NVFBC for very efficient frame capture and NVENC for very efficient frame compression. BTW, NVFBC is the direct reason why ShadowPlay is not capturing video in windowed mode. If you follow the link above, you’ll see that NVFBC is able to grab the whole frame buffer only.
And now some really good news, after one caveat: using NVENC in third party video capture applications is currently troublesome due to some strange licensing scheme applied to it by NVIDIA. I hope it will change in the future, because the competing Intel QuickSync H.264 encoding is freely available to any developer and it does a great job. But the NVFBC and NVIFR interfaces are available to any NVAPI developer, which means that very efficient frame capture can be easily added to ANY existing third party video capture application right now with minimum development cost. So at least the frame capture related bottleneck can be taken out of the equation on NVIDIA Kepler hardware. Personally, I’ve chosen NVIFR to power the video capture engine of RivaTunerStatisticsServer v5.5.0: even though NVFBC provides slightly faster capture, I wouldn’t like to sacrifice windowed video capture support. And I’m pretty sure that other video capture tools will also get support for those new NVIDIA technologies soon.


Alexey Nicolaychuk aka Unwinder, RivaTuner creator

Last edited by Unwinder; 10-29-2013 at 11:37.
   