Dunno, I'm not a driver programmer. Just because the threads are busy doesn't mean they're doing any actual work. Kind of like government employees... lol. Only thing I can say is that it doesn't matter how much work you throw at the submission thread, it can only go as fast as it can go. Reducing the time (latency) it takes the batches to get across the driver layer is the only real option for increasing batch flow. Which I'm sure is what NVIDIA has done.
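
To put rough numbers on that, here's a toy sketch. It is not how any real driver works. It assumes a single submission thread that handles batches strictly one after another, and the function name, the 20 µs per-batch latency and the 16 ms frame budget are all made up for illustration.

```python
def batches_submitted(queued_batches, latency_us, frame_budget_us):
    """Batches one serial submission thread can push through in one frame.

    Hypothetical model: every batch costs the same fixed latency_us to
    cross the driver layer, and batches are handled one at a time.
    """
    capacity = frame_budget_us // latency_us  # hard ceiling set by latency
    return min(queued_batches, capacity)


# Throwing more work at the thread does not raise the ceiling.
for queued in (500, 5000, 50000):
    print(queued, batches_submitted(queued, latency_us=20, frame_budget_us=16000))

# Cutting the per-batch latency is what raises it.
print(batches_submitted(50000, latency_us=10, frame_budget_us=16000))
```

With these made-up numbers the thread tops out at 800 batches per frame no matter how many are queued, and halving the latency moves the ceiling to 1600.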