NVIDIA Working on Tile-based Multi-GPU Rendering Technique Called CFR - Checkered Frame Rendering

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, Nov 21, 2019.

  1. Hilbert Hagedoorn

    Hilbert Hagedoorn Don Vito Corleone Staff Member

    Messages:
    48,398
    Likes Received:
    18,592
    GPU:
    AMD | NVIDIA
    Dragam1337 likes this.
  2. Spets

    Spets Guest

    Messages:
    3,500
    Likes Received:
    670
    GPU:
    RTX 4090
    Sounds like it could help SLI work on engines that use information from previous frames, like temporal effects.
     
  3. Dragam1337

    Dragam1337 Ancient Guru

    Messages:
    5,535
    Likes Received:
    3,581
    GPU:
    RTX 4090 Gaming OC
    With generational GPU improvements becoming smaller and smaller, we need SLI more than ever... so this makes me very happy!
     
  4. cryohellinc

    cryohellinc Ancient Guru

    Messages:
    3,535
    Likes Received:
    2,974
    GPU:
    RX 6750XT/ MAC M1
    I think this is all heading towards chiplet-design GPUs. In several years, most likely all of us will be using SLI or a similar technology in one way or another.
     
    Aekold and Dragam1337 like this.

  5. asturur

    asturur Maha Guru

    Messages:
    1,372
    Likes Received:
    503
    GPU:
    Geforce Gtx 1080TI
    I read that picture as each GPU rendering half the pixels of every frame... unsure if that's an improvement.

    Otherwise, why label them frame N and N+1?

    I disagree that we need SLI more than ever. We need a way to go back to playing games for a decent amount of money... and if SLI is only for the top cards, that's not gonna happen.
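
    To illustrate what the slide seems to show, here's a toy sketch of a checkered split, assuming each GPU takes alternating fixed-size tiles of the same frame (the tile size and layout are guesses, nothing NVIDIA has documented):

    #include <stdio.h>

    #define TILE 64  /* assumed tile size in pixels; the real value isn't public */

    /* Toy checkerboard assignment: GPU 0 and GPU 1 alternate tiles within
       the SAME frame (unlike AFR, where whole frames alternate), so both
       GPUs touch frame N and frame N+1. */
    static int gpu_for_tile(int tile_x, int tile_y)
    {
        return (tile_x + tile_y) & 1;  /* like the colours on a checkerboard */
    }

    int main(void)
    {
        const int width = 3840, height = 2160;
        const int tiles_x = (width  + TILE - 1) / TILE;
        const int tiles_y = (height + TILE - 1) / TILE;
        int count[2] = {0, 0};

        for (int ty = 0; ty < tiles_y; ty++)
            for (int tx = 0; tx < tiles_x; tx++)
                count[gpu_for_tile(tx, ty)]++;

        /* Each GPU ends up with roughly half the tiles of every frame. */
        printf("GPU0: %d tiles, GPU1: %d tiles\n", count[0], count[1]);
        return 0;
    }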
     
  6. geogan

    geogan Maha Guru

    Messages:
    1,267
    Likes Received:
    468
    GPU:
    4080 Gaming OC
    I wonder how this gets around the problem that most modern game engines aren't fully compatible with multi-GPU rendering (because of the way the engines work). That was the reason SLI support died off over the last few years: it was a nightmare for developers and NVIDIA to shoehorn in support in a hacky way, and it ended up no better than a single card and causing more trouble than it was worth.
    As far as I can tell, the only way this will work in future with real multi-GPU "chiplet" type designs is either a game engine designed from the beginning to work with multiple GPUs that each have their own RAM, or, more likely, some form of multi-GPU with *shared* RAM, which would make the engine-side problems easier. I think the main problem is that generating the current frame needs access to previous frames, but that information sits in a different GPU's VRAM, so it has to be copied across continuously, which is hugely inefficient.
    So yes, I think a multi-GPU chiplet design would have to have shared VRAM and cache amongst all the GPUs, which is exactly what SLI lacks today on separate cards.
     
  7. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,016
    Likes Received:
    7,355
    GPU:
    GTX 1080ti
    Interesting that people are only noticing this now... it's been in the NVAPI reference since August.

    The reports are wrong as usual: it has a full OpenGL implementation, including NVAPI reference values, and the preference is exposed to Inspector (it's just not grouped with the rest of the SLI settings).

    enum EValues_OGL_SLI_CFR_MODE {
        OGL_SLI_CFR_MODE_DISABLE,
        OGL_SLI_CFR_MODE_ENABLE,
        OGL_SLI_CFR_MODE_CLASSIC_SFR,
        OGL_SLI_CFR_MODE_NUM_VALUES,
        OGL_SLI_CFR_MODE_DEFAULT
    };
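
    If you want to poke at it before Inspector groups it properly, here's a minimal sketch of flipping the setting through the NVAPI driver-settings (DRS) API. It assumes the header pairs the enum above with an OGL_SLI_CFR_MODE_ID setting ID, following the usual NvApiDriverSettings.h pattern; that ID name is my assumption, not something I've checked:

    #include "nvapi.h"
    #include "NvApiDriverSettings.h"  /* assumed to define OGL_SLI_CFR_MODE_ID */

    int main(void)
    {
        NvDRSSessionHandle hSession;
        NvDRSProfileHandle hProfile;
        NVDRS_SETTING setting = {0};

        if (NvAPI_Initialize() != NVAPI_OK)
            return 1;

        NvAPI_DRS_CreateSession(&hSession);
        NvAPI_DRS_LoadSettings(hSession);
        NvAPI_DRS_GetBaseProfile(hSession, &hProfile);  /* the global profile */

        setting.version         = NVDRS_SETTING_VER;
        setting.settingId       = OGL_SLI_CFR_MODE_ID;   /* assumed ID name */
        setting.settingType     = NVDRS_DWORD_TYPE;
        setting.u32CurrentValue = OGL_SLI_CFR_MODE_ENABLE;

        if (NvAPI_DRS_SetSetting(hSession, hProfile, &setting) == NVAPI_OK)
            NvAPI_DRS_SaveSettings(hSession);

        NvAPI_DRS_DestroySession(hSession);
        NvAPI_Unload();
        return 0;
    }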

     
    Last edited: Nov 21, 2019
  8. H83

    H83 Ancient Guru

    Messages:
    5,467
    Likes Received:
    3,003
    GPU:
    XFX Black 6950XT
    Maybe I'm writing something really stupid because I have no expertise in this area, but wouldn't it be better to divide the workload differently between GPUs? In Crysis, for example, the GPU had to render everything, including large portions of the island in the distance. Would it be possible to have one GPU render just the background and the physics, and the other one render the rest of the scene? That way the workload would be divided and performance would increase. Of course this is a very simplistic approach and there are problems to solve, like keeping everything in sync, but I still wonder if it would work in real life.

    If my suggestion is really stupid, don't be afraid to say it, guys!
     
  9. DeskStar

    DeskStar Guest

    Messages:
    1,307
    Likes Received:
    229
    GPU:
    EVGA 3080Ti/3090FTW
    I really beg to differ... only lazy development from lazy developers made it look like it wasn't worth it.

    Anything from DICE scales like a dream. Valve games are amazing with it. Play a Crytek game and it's a dream with more than one card.

    Lazy development is what got multi-card setups dismissed as a hack.

    I can still game at 6880x2440, maxed out, in some games on my quad-SLI Titans.

    But SLI has been dead because of other garbage.

    A brand new system with the most powerful single card cannot run what I run at 120+ fps at 6880x2440. I know, because I just built one.

    A five-year-old computer can run "supported" games faster than a system built today. Facts.
     
    Last edited: Nov 21, 2019
    screwtech02, Mesab67 and Dragam1337 like this.
  10. DeskStar

    DeskStar Guest

    Messages:
    1,307
    Likes Received:
    229
    GPU:
    EVGA 3080Ti/3090FTW
    Kind of makes sense to me. It's the same kind of split that's already done between the CPU and the GPU. In theory...
     

  11. Netherwind

    Netherwind Ancient Guru

    Messages:
    8,821
    Likes Received:
    2,402
    GPU:
    GB 4090 Gaming OC
    I welcome this with open arms. Back in the day you bought one card and then another one when the next generation hit the shelves. Both cards combined were more powerful than the new generation flagship card and cheaper too.
     
  12. Denial

    Denial Ancient Guru

    Messages:
    14,206
    Likes Received:
    4,118
    GPU:
    EVGA RTX 3080
    This is basically what Lucid's Hydra Engine was (https://en.wikipedia.org/wiki/Hydra_Engine).

    There are a bunch of issues with it. For starters, a ton of modern shaders in games use interframe data to improve performance; if that data is sitting on another graphics card, then either the optimization can't be used or there's a massive performance penalty in getting it from one GPU to the other. Similarly, managing all these different elements as the scene shifts, and recombining them into a single framebuffer for output, takes time and thus affects performance. Managing the CPU threads that feed both GPUs is a nightmare too, because you're essentially spending time before the scene even starts rendering figuring out how to divide it to avoid stalls across both GPUs. Then, if the GPUs have different feature sets, it becomes even more complicated... and it's all for what? So that the 25 people with SLI/Crossfire can benefit slightly at the expense of everyone else, because all the interframe optimization is now gone?
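
    As a back-of-envelope illustration of that penalty (all figures assumed, not measured), a single 4K RGBA16F history buffer is roughly 66 MB, and hauling it across PCIe 3.0 x16 at a best-case ~16 GB/s costs about 4 ms, a quarter of a 60 fps frame budget, before any rendering starts:

    #include <stdio.h>

    int main(void)
    {
        /* Illustrative assumptions, not measured figures. */
        const double width = 3840.0, height = 2160.0;
        const double bytes_per_pixel = 8.0;   /* RGBA16F history buffer */
        const double pcie_bps = 16.0e9;       /* ~PCIe 3.0 x16, best case */

        const double buffer_bytes = width * height * bytes_per_pixel; /* ~66 MB */
        const double copy_ms = buffer_bytes / pcie_bps * 1000.0;

        /* ~4.1 ms just to move one previous-frame buffer between GPUs,
           out of a 16.7 ms budget at 60 fps. NVLink narrows this, but
           doesn't make it free. */
        printf("copy cost: %.1f ms per frame\n", copy_ms);
        return 0;
    }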

    I think most devs, the ones who are even capable of doing this kind of low-level hardware work, look at it, go "it's not even close to being worth it", and leave it there.
     
    geogan and schmidtbag like this.
  13. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    7,976
    Likes Received:
    4,342
    GPU:
    Asrock 7700XT
    As far as I'm aware (and maybe I'm wrong), one of the main problems is that GPUs can only render one frame at a time, while each "stage" of the rendering process takes up an unequal share of resources. So, although I think Nvidia's CFR idea is a good one, what if things were taken a step further, and any idle cores were used to calculate the next frame in parallel?

    Although I don't know how GPUs work at the driver level, here's my very crude approximation of how each frame is rendered, where each "stage" may take dozens of clock cycles:
    1. The GPU receives, compiles, and parses new frame data to calculate
    2. Calculate physics (if necessary)
    3. Set up the mesh geometry to fit the viewport
    4. Apply textures
    5. Apply lighting effects
    6. Perform ray tracing or calculate reflections
    7. Run post-processing effects
    8. Return any data the program may be expecting
    Obviously, not all of these stages need the same amount of compute power. Some need fewer GPU cores than others; some could get by with half-precision floats.

    So, what if idle cores were always used to render the next frame? It's a similar idea to AFR, except instead of splitting the entire frame-rendering process per die, you split the individual stages of frame rendering between individual cores (see the toy model below). This should help reduce latency and maximize untapped resources.
    EDIT: And this is different from H83's idea, which, to my understanding, designates certain regions of the frame to be rendered on separate cores.

    Of course, there must be some fundamental flaw in this idea (or my assumption that only one frame is rendered at a time is wrong), or else it would have been done already.
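
    As a toy model of the frame-rate side of that idea (stage times entirely made up), treating the eight stages above as a pipeline shows the best case, and also the catch: pipelining lifts throughput to the rate of the slowest stage, but each individual frame still takes the full serial time to get through, so latency doesn't actually improve:

    #include <stdio.h>

    #define STAGES 8

    int main(void)
    {
        /* Made-up per-stage costs in ms for the eight stages listed above. */
        const double t[STAGES] = {1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 1.5, 0.5};
        double serial = 0.0, bottleneck = 0.0;

        for (int i = 0; i < STAGES; i++) {
            serial += t[i];
            if (t[i] > bottleneck)
                bottleneck = t[i];
        }

        /* Strictly serial: a new frame every `serial` ms.
           Perfectly pipelined (frame N+1 enters a stage on otherwise-idle
           cores as soon as frame N leaves it): a new frame every
           `bottleneck` ms, but each frame is still `serial` ms old when
           it appears on screen. */
        printf("serial:    %.1f ms/frame (%.0f fps)\n", serial, 1000.0 / serial);
        printf("pipelined: %.1f ms/frame (%.0f fps)\n", bottleneck, 1000.0 / bottleneck);
        return 0;
    }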
     
    Last edited: Nov 21, 2019
  14. H83

    H83 Ancient Guru

    Messages:
    5,467
    Likes Received:
    3,003
    GPU:
    XFX Black 6950XT
    Well, that's it then. Thanks for explaining it!
     
    Last edited: Nov 21, 2019
  15. fry178

    fry178 Ancient Guru

    Messages:
    2,067
    Likes Received:
    377
    GPU:
    Aorus 2080S WB
    @Netherwind
    I don't remember which market you were in or which cards you were referring to, but in my time working in shops in Germany and the US, I never saw two cards that were both faster than the top single card and cheaper (e.g. two of the second-biggest chip), and most games would still not run as smoothly as on a single card (micro-stutter). You also needed a bigger CPU and, most of the time, a bigger PSU (versus the biggest chip).
    And when a new generation dropped, virtually all chips of the previous gen dropped in price, not just the smaller ones, so that's not an argument.
     

  16. wavetrex

    wavetrex Ancient Guru

    Messages:
    2,450
    Likes Received:
    2,547
    GPU:
    TUF 6800XT OC
    I wonder which software does tile-based rendering best?...
    Something like Cinema 4D, something like Blender...

    Apply the same concept to multiple GPUs and RTX will get a LOT faster!

    One GPU is pretty damn fast these days for classic raster, but it's down on its knees with RTX ON...
    Answer: RTX ON x2 (and of course $$$ x2, because why not)
     
  17. Netherwind

    Netherwind Ancient Guru

    Messages:
    8,821
    Likes Received:
    2,402
    GPU:
    GB 4090 Gaming OC
    As always, my memory doesn't serve me well, so I don't remember exactly which cards I had, but I think I rocked two GTX 970s, which with perfect scaling were faster than the 1080 (not sure about the 1080 Ti). Then something similar with older NVIDIA cards, and even ATi cards before that.
     
  18. Aekold

    Aekold Active Member

    Messages:
    68
    Likes Received:
    17
    GPU:
    MSI GTX 1080
    Agreed. On the surface, a move to chiplet designs seems to be what they're going for. Games have been moving away from fully supporting traditional multi-GPU setups (SLI and CrossFire) for years, and I can't imagine SLI as we know it today being the end goal. It will probably benefit from this regardless, though. :)
     
    Last edited: Nov 21, 2019
  19. Dragam1337

    Dragam1337 Ancient Guru

    Messages:
    5,535
    Likes Received:
    3,581
    GPU:
    RTX 4090 Gaming OC
    There are plenty of games with near-perfect SLI scaling - Frostbite titles scale at around 97% - and G-Sync completely eliminates the micro-stutter associated with SLI.
     
  20. A M D BugBear

    A M D BugBear Ancient Guru

    Messages:
    4,394
    Likes Received:
    631
    GPU:
    4 GTX 970/4-Way Sli
    Correct me if I'm wrong, but didn't ATI use a similar method in CrossFire mode in the past?

    I thought they used this once before.
     

Share This Page