Asynchronous Compute

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by Carfax, Feb 25, 2016.

  1. Alessio1989

    Alessio1989 Maha Guru

    Messages:
    1,454
    Likes Received:
    249
    GPU:
    .
    I do not see why they should expose an implicit d3d12 API feature as a proprietary extension.
     
  2. Yxskaft

    Yxskaft Maha Guru

    Messages:
    1,365
    Likes Received:
    81
    GPU:
    GTX Titan Sli
    NVAPI has been publicly confirmed to be used in Frostbite 2 and 3, and one of its main uses was to let Nvidia's Tesla GPUs use the DX10.1 features they supported but couldn't expose without NVAPI.
     
  3. dr_rus

    dr_rus Ancient Guru

    Messages:
    2,984
    Likes Received:
    331
    GPU:
    RTX 2080 OC
    Yeah, and four of the top 10 cards are NV's, which is 40% in this single metric, which accounts for nothing but perf/$ in a VR benchmark that somewhat favors GCN cards. If you don't see how this proves you're completely wrong when you say that AMD's h/w is beating NV's h/w everywhere but at the top end, then I can't really help you any further.

    FL12_1 does matter, and it matters for DX12 on Maxwell specifically because it allows performance optimizations of algorithms which will run on FL12_0 h/w as well, but considerably slower. These happen to be the algorithms used in volumetric lighting shaders, for example, which are very popular right now in most engines.

    The "proprietary things" work and add value to NV cards right now. You may dance all you want around this fact but it won't change at all.

    Your interpretation of this is completely wrong, however. It was a conscious choice, but it wasn't made purely because they wanted to save power (and that is most definitely not the sole reason why Maxwell is destroying GCN in power consumption). It was only partially because of that, and partially because their SIMD architecture doesn't really need something running in parallel to graphics to achieve its peak performance. The second reason is the most important one, because for all we know spending transistors on something like ACEs for Maxwell could've been completely pointless, as Maxwell is maxed out by graphics tasks alone and doesn't need any compute tasks running in the background to show its peak performance figures.

    No matter how much bold and italic you use, this doesn't prove at all that Maxwell's scheduling solution is in any way inferior to GCN's. This choice actually made Maxwell a much better option for DX11, for low-to-high power systems, and essentially for DX12 as well, because from the benchmarks we have at the moment we can safely assume that a Maxwell based chip will likely be on par in DX12 with a GCN chip of the same complexity.

    So as I've already said we have a Maxwell arch which is good in all APIs against a GCN arch which is good only in D3D12/Vulkan. That's the result of choices made by both companies when it comes to task scheduling right now. And we're still waiting on NV's drivers with concurrent execution support and for titles which will actually use Maxwell's FL12_1 features to speed up the rendering of some effects. This isn't a clear win for AMD at all at the moment.

    This is about the internal scheduling inside the multiprocessor unit and it has nothing to do with concurrent async compute at all. Could you please stop spreading meaningless nonsense?

    No, they don't admit this here at all. The DX12 specs do not require concurrent compute execution from DX12 capable h/w. Also, they are talking about high priority compute jobs, which is something you don't really want to run asynchronously with graphics at all, so this doesn't really say anything about the ability of concurrent compute execution on Maxwell either.

    As I've said it would be really cool if you'd stop spreading nonsense about things you don't fully understand.
     
    Last edited: Feb 27, 2016
  4. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    7,005
    Likes Received:
    139
    GPU:
    Sapphire 7970 Quadrobake
    Just tell me that you are a Trump voter now, so that I can explain the complete inability to read text that you obviously have. I haven't seen more twists in meaning in this forum since forever.

    I quote NVIDIA themselves saying that their scheduling hardware is inferior, but it is a conscious choice, the data we have up to this point confirms it, NVIDIA THEMSELVES SAY NOTHING ABOUT ASYNC COMPUTE, and yet you insist on... what exactly?

    What are you saying? That DX12 should give an equal performance uplift to GCN and Maxwell? That there is scheduling hardware where there isn't? What's your point exactly? And furthermore, how do you support it apart from simply telling me "this is not like that", without quoting a SINGLE source?

    This quote of yours:

    After the presentation that NVIDIA gave to Anandtech, where they say that they made the choice CONSCIOUSLY and with regard to performance regressions, you say this. You're either a fanboi, a troll, or paid. Unfortunately there are no roads in between left. What is the "internal scheduling inside the multiprocessor", dude? If you understand the words you write, how can it NOT have anything to do with scheduling?
     
    Last edited: Feb 27, 2016

  5. VAlbomb

    VAlbomb Member Guru

    Messages:
    146
    Likes Received:
    4
    GPU:
    Nvidia G1 Gaming GTX 970
    I love that Trump remark.
     
  6. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    7,005
    Likes Received:
    139
    GPU:
    Sapphire 7970 Quadrobake
    I actually like what he's doing to the political scene, even though I don't agree on policy. The judgment on most of his voters stays, though.
     
  7. VAlbomb

    VAlbomb Member Guru

    Messages:
    146
    Likes Received:
    4
    GPU:
    Nvidia G1 Gaming GTX 970
    It could be worse, like a Bernie voter for example.
     
  8. Alessio1989

    Alessio1989 Maha Guru

    Messages:
    1,454
    Likes Received:
    249
    GPU:
    .
    Feature level does not coincide with performance. Feature levels are simply a well-defined set of hardware capabilities. Creating a device with a higher feature level instead of a lower one does not magically improve performance on the same rendering path. FL 12_1 guarantees ROVs and CR tier 1. The lack of ROV support is the most annoying thing on GCN. CR tier 1 is nice, but the big deal with conservative rasterization starts at tier 2. Currently only Skylake supports CR tier 2 (it goes up to tier 3, which is really nice).
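
    To make that distinction concrete, here is a minimal Python sketch of the idea. It is not the D3D12 API: `Caps`, `meets_fl_12_1` and `pick_render_path` are names made up for illustration, and only the two capabilities mentioned above (ROVs and the conservative rasterization tier) are modelled. The point is that the feature level is a guaranteed minimum, while the rendering path an engine picks depends on the individual caps it queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caps:
    """Hypothetical capability record; only two caps are modelled."""
    rovs: bool     # rasterizer ordered views supported
    cr_tier: int   # conservative rasterization tier, 0 = unsupported, 1..3


def meets_fl_12_1(caps: Caps) -> bool:
    # FL 12_1 guarantees ROVs and at least CR tier 1 (per the post above).
    return caps.rovs and caps.cr_tier >= 1


def pick_render_path(caps: Caps) -> str:
    # The path is chosen from individual caps, not from the feature level:
    # two devices that both report FL 12_1 can end up on different paths.
    if caps.cr_tier >= 2:
        return "cr_tier2_path"
    if caps.cr_tier == 1:
        return "cr_tier1_path"
    return "fallback_path"
```

    Two devices can both pass `meets_fl_12_1` and still land on different paths, which is why "higher feature level" and "faster" are not the same statement.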
     
    Last edited: Feb 27, 2016
  9. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    7,005
    Likes Received:
    139
    GPU:
    Sapphire 7970 Quadrobake
    I love how you are probably the only person in this thread that knows what he's talking about, and we're all ignoring you. (Like Trump and Bernie voters). :p
     
  10. Alessio1989

    Alessio1989 Maha Guru

    Messages:
    1,454
    Likes Received:
    249
    GPU:
    .

  11. -Tj-

    -Tj- Ancient Guru

    Messages:
    16,505
    Likes Received:
    1,537
    GPU:
    Zotac GTX980Ti OC
    PrMinisterGR

    dr_rus is right in everything he said; you are just bashing and nitpicking random facts for the sake of it and trying to make Maxwell look bad. I saw your thread "mocking" NV over the 390X and 980 Ti being listed for the new GoW. Yes, hilarious, but laugh at MS, not at the recommended spec: they named the 390X because it has 8GB of VRAM. I would laugh at the Fury X in that case, since it's not OK for 4K.
    Just shows how much you know about each side.

    And Anandtech is no know-it-all. I've seen better articles lately, and yet you keep mentioning them over and over again as if they were the makers of GPU h/w.


    IMO you haven't read enough Nvidia whitepapers to see the bigger picture. I suggest you start with GF110 and move up to GM200.

    Then you will see what's under the hood and how it works. And no, GM200 is fully DX12; actually it's feature level 12_1, with more features, which in the end will speed up calculations vs. 12_0.
     
    Last edited: Feb 27, 2016
  12. CrazyBaldhead

    CrazyBaldhead Master Guru

    Messages:
    300
    Likes Received:
    14
    GPU:
    GTX 1070
    Can we all have a nice, informative conversation and can you not bring politics into tech threads? We're already getting threads locked left right and center.
    And it kinda reflects badly on your argument when you have to stray completely off course.
     
  13. Denial

    Denial Ancient Guru

    Messages:
    12,489
    Likes Received:
    1,728
    GPU:
    EVGA 1080Ti
    Where do they admit this? The thing you keep quoting with Nvidia in the title is a guy's personal blog. I imagine he knows what he's talking about, but that quote isn't from Nvidia.

    He says right at the top that it's his best practices guide:

    "I provide a best practice guide to tune specifically to each vendors hardware, based on the hardware and driver capabilities found. Further I present an hybrid approach which caters both vendors hardware equally despite the differences."

    And further down the page he even admits that he doesn't know how to mix and match CUDA/DX12, he says ask an engineer. So basically that entire line that states he recommends using CUDA for multi-compute is just him guessing.
     
  14. dr_rus

    dr_rus Ancient Guru

    Messages:
    2,984
    Likes Received:
    331
    GPU:
    RTX 2080 OC
    The only person who's posting constant twists in this thread is you, my friend. And to answer your question: unfortunately or not, I don't know, I can't be a Trump voter because I'm a Russian citizen and live in Moscow. But just for the fun of it, not too long ago I took a test on some website according to which I should've been a Bernie voter. Hope this helps you somehow.

    You're quoting stuff which you don't understand and all you're doing is just showing everyone how stupid you are. NV is saying a lot about async compute to those who know how to listen. You are obviously not one of them.

    I've elaborated on all my points rather clearly in this thread. If you still can't understand them - that's not my problem.

    Yeah, this was a conscious choice on how thread warps are scheduled to execution units inside multiprocessors. It was made after analyzing the flow of different code on Kepler SM schedulers and seeing that their flexibility in what execution units they can launch a warp on is unnecessary.

    What you clearly don't understand at all, since you are quoting this in this thread, is that warps inside the SM are scheduled from a thread queue which is fed into the SMs from the GMU/driver. It doesn't matter at all how the scheduling inside SMs is done, because all the thread level scheduling happens in the global scheduler (GMU/ACEs/what-have-you), and no changes in the SM level scheduling will affect the ability to perform async compute. I mean, they can affect it in the sense that there will be fewer/more execution units available in each SM, but this is a global performance metric which affects everything, not specifically async compute.

    So do us all a favor - stop talking about things which you don't understand. And stop trying to combat those who are talking to you with US politics - some of us seriously don't give any **** about it for obvious reasons.
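
    The two-level picture argued here can be sketched as a toy Python model. This is only an illustration of the argument as stated in this post, not a description of how Maxwell or GCN hardware actually works; `global_dispatch`, `sm_local_schedule`, the round-robin assignment and the policy names are all assumptions made up for the example.

```python
from collections import deque


def global_dispatch(graphics, compute, num_sms, interleave):
    """Global stage: decides which work items reach which SM.

    interleave=True alternates graphics and compute items (a stand-in for
    concurrent execution); interleave=False runs all graphics first.
    """
    g, c = deque(graphics), deque(compute)
    order = []
    if interleave:
        while g or c:
            if g:
                order.append(g.popleft())
            if c:
                order.append(c.popleft())
    else:
        order = list(g) + list(c)
    sms = [[] for _ in range(num_sms)]
    for i, item in enumerate(order):
        sms[i % num_sms].append(item)  # simple round-robin, an assumption
    return sms


def sm_local_schedule(items, policy):
    """SM stage: only reorders items that were already assigned to this SM."""
    if policy == "reversed":
        return list(reversed(items))
    return list(items)  # "fifo"
```

    In this model, whether graphics and compute items end up mixed on an SM is decided entirely in `global_dispatch`; changing the policy passed to `sm_local_schedule` changes the order within an SM but never which items it received.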
     
  15. CrazyGenio

    CrazyGenio Master Guru

    Messages:
    434
    Likes Received:
    29
    GPU:
    Msi rtx 2080ti trio
    have you heard of cross-vendor "crosssli" setups? they run better than sli only or crossfire only

    if you people are worried about dx12 features you can keep your nvidia card and buy an amd and have fun with both lol.

    apparently the api gets the best of both cards and make them get along.
     

  16. Keesberenburg

    Keesberenburg Master Guru

    Messages:
    851
    Likes Received:
    24
    GPU:
    EVGA GTX 980 TI sc
    Sorry wrong topic
     
    Last edited: Feb 27, 2016
  17. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    7,005
    Likes Received:
    139
    GPU:
    Sapphire 7970 Quadrobake
    The Trump remark was a joke, sorry if you took it seriously. I'm not a US citizen either.


    It's me and the sum of the technical press apparently. Can you please link me to a single reputable source showing that they know the reason that Maxwell doesn't have effective async compute? One? Please?


    All these words carefully avoid the one sentence from Anandtech (which was obviously given by NVIDIA, since last time I checked Anandtech don't break apart chips and study them with microscopes). The one that says that the scheduler is not as flexible any more.

    After that, you have NVIDIA saying about scheduling and preemption, that you shouldn't be switching between workloads in between draw calls, since it slows things down. Which is the hardware that does that? The scheduler. It seriously doesn't take much to see that. I'm not even "blaming" them for that.

    You're basically telling me that the high level scheduling is being done in the CPU for Maxwell (since by NVIDIA's admission it's not done in the GMU), and then you wonder why Maxwell cards see no benefit from an API that is using CPU resources better. If the NVIDIA driver schedules effectively already, you are getting 100% of your card JUST with DX11, which (I'll say again), is a MIRACLE.

    The difference between us is that I'm not belittling you (unless calling a Russian a Trump voter can be taken seriously), and I'm not invested in ANY manufacturer. My card is ancient and I'll get one once both of them have their 16/14nm stuff out.
     
  18. fantaskarsef

    fantaskarsef Ancient Guru

    Messages:
    11,138
    Likes Received:
    3,221
    GPU:
    2080Ti @h2o
    Yeah, noticed WORKING cross vendor multi GPUs too in the other thread. Nobody seemed to care... sadly.
     
  19. CrazyGenio

    CrazyGenio Master Guru

    Messages:
    434
    Likes Received:
    29
    GPU:
    Msi rtx 2080ti trio
    because they only want to worship a single hardware company, that's the level of fanboys we have on pc master race.
     
  20. EdKiefer

    EdKiefer Ancient Guru

    Messages:
    2,362
    Likes Received:
    194
    GPU:
    MSI 970 Gaming 4G
    Well, IMO there are a few issues with multi-vendor GPU usage.
    1) The two cards need to be the same speed, so that will limit what people have around. If you have to buy 2 cards, you might as well get ones that support SLI/CF.

    2) You now need both vendors' drivers installed on one system, and video drivers don't really work great alongside other vendors' drivers, unless you go back to the 3Dfx days (Voodoo2 with a primary 2D card).
     
