It's just the reality of DX10/DX11. No matter what you do, there is only a single thread feeding the GPU. To get more efficiency, you have to use "tricks" to make that thread submit faster, or to feed it faster when it's idle. So-called "multi-threaded" rendering is just multiple threads feeding the single submission thread. In the case of Nvidia's increased efficiency, there are two ways to look at it: (1) they found a way to feed the submission thread faster, or (2) they found an inefficiency in their code that was slowing down submission and corrected it. Take your pick. A couple of years ago AMD did some research and found that tiled resources didn't have any benefit whatsoever on their hardware, so they didn't bother implementing it. This leads me to believe it's more likely choice number 2, and that AMD's code is already highly efficient.
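To make the "multiple threads feeding a single submission thread" point concrete, here is a rough toy model in Python. It is not Direct3D code and not how any driver is actually written; the names (`record_commands`, `submit_all`, `run_frame`) are made up for illustration. The worker threads loosely stand in for D3D11 deferred contexts recording command lists, and the one submitter thread stands in for the immediate context.

```python
import queue
import threading


def record_commands(worker_id, num_draws, pending):
    """Worker thread: build a command list and hand it to the submission thread.

    Hypothetical stand-in for recording on a deferred context.
    """
    command_list = [f"worker{worker_id}:draw{i}" for i in range(num_draws)]
    pending.put(command_list)


def submit_all(pending, num_lists, submitted, submitter_ids):
    """The single submission thread: the only place commands reach the 'GPU'."""
    for _ in range(num_lists):
        command_list = pending.get()  # blocks (sits idle) until a worker feeds it
        for command in command_list:
            submitted.append(command)  # pretend this is the actual GPU submission
            submitter_ids.add(threading.get_ident())


def run_frame(num_workers=4, draws_per_worker=3):
    """Run one 'frame': several recorders, exactly one submitter."""
    pending = queue.Queue()
    submitted = []
    submitter_ids = set()

    submitter = threading.Thread(
        target=submit_all,
        args=(pending, num_workers, submitted, submitter_ids),
    )
    workers = [
        threading.Thread(target=record_commands, args=(w, draws_per_worker, pending))
        for w in range(num_workers)
    ]

    submitter.start()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    submitter.join()
    return submitted, submitter_ids


if __name__ == "__main__":
    submitted, submitter_ids = run_frame()
    print(len(submitted), "commands submitted by", len(submitter_ids), "thread(s)")
```

However many workers you add, every command still funnels through the one loop in `submit_all`. In this model the only ways to go faster are to make that loop cheaper or to keep the queue from running dry, which are the two options described above.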