Discussion in 'Videocards - NVIDIA GeForce' started by NeoEnigma, Oct 16, 2011.
How can people see a lost post? Maybe you should find it.
Once you've finished playing spot the typo, can you at least agree I have managed to prove my point beyond doubt?
I can't believe this is still going. The memory is mirrored, it's as simple as that. The cards cannot load more memory until both cards need it, and as they take turns rendering the scene, the amount of time it takes to use that memory (read: memory bandwidth) does not change either.
It seems that no matter how much common sense, illustrative examples and basic reasoning is used, some people will not get it, as they simply want to prove they are right, no matter if it means confusing the hell out of every other person reading the thread. The philosophy seems to be: if you can't win, cheat...
As for anyone wondering why there isn't much information regarding this very matter on the net: simple. Anyone delving deep enough into this would have realised, after finding that memory is mirrored, that memory bandwidth must also fall under this limitation. I would also think card manufacturers don't like to advertise this either, as the well-known and documented fact that memory is mirrored is also seldom mentioned.
As for the texture fillrate test, it is run at a low resolution in Vantage's performance test, something like 720p, so the textures are small enough that a single card's memory bandwidth can keep up, and the data being displayed on screen is rendered twice as fast, just like fps increases in games with SLI. But if you were to increase the size of the textures by running it at a very high resolution, then the memory bandwidth would become saturated too.
PS. The texture size may even be fixed to a very small size anyway, according to Vantage's test description.
I thought we had already covered this? Detention for you, I think.
He gets it! QFT.
and here's how the texture fillrate test works:
The gurus are right. When using SLI in games and benches, the memory is mirrored; whatever 'extra bandwidth' there is, is just the same information mirrored in the VRAM of the 2nd card. You can actually see this behavior by simply watching GPU and memory usage in an app like Afterburner. The memory values are always the same using SLI. When using SLI with AFR or AFR2 the scenes are identical, or it wouldn't work; it would be impossible.
Your example of Folding@home is not relevant. If you're folding on GPU2, that does not mean you are using SLI, it means you are folding on GPU2 lol. You don't need SLI to have multiple GPUs or video cards. SLI is just a method of utilizing multiple video cards to increase performance in applications that can benefit.
I've used SLI since the 7800GTX, so that is where my info comes from, along with much research into SLI.
Thanks for telling me I'm awesome.
And no, you are wrong. Get over it and move on.
Jesus Christ, this is like happy hour at the asylum.
So you admit there is extra bandwidth?
I know the RAM is mirrored, but bandwidth is how fast that RAM can be read and written to complete the job, which in this case is completing the frame on your PC's display. Bandwidth is measured in GB/s. That means how much data 1 or 2 cards can read from and write to their RAM per second.
The RAM is mirrored: both cards must have exactly the same textures, geometry etc. in their RAM, and all the files MUST be at the same address on each card so the CPU can address the GPU drivers when telling them how to build a scene. The vast majority of these textures etc. will be loaded at the start of a level and remain static. They will never be changed and they will stay at the same address until the next level, where new stuff will be loaded. A minority of these assets will be dynamic and need to be changed on the fly, which means both GPUs will have to communicate with each other. That may introduce a bit of overhead, but it can vary from next to nothing to a lot depending on the title.
Now here's where everybody is struggling:
The CPU tells the graphics driver to draw frames 1, 2, 3, 4 etc. In a single-GPU system the GPU draws each frame in order, pulling only the data that's needed (not all of the data will be needed) from video RAM to draw the frame it's working on, and then stores it in a buffer ready to be displayed. Once that is completed it will process the next frame.
In a multi-GPU (SLI / Crossfire) system the CPU tells the driver to draw frames 1, 2, 3, 4 etc. as before, but this time the driver tells GPU 1 to draw frame 1, and whilst that is being worked on the driver tells GPU 2 to draw frame 2. GPU 1 then does frame 3 whilst frame 2 is being worked on, GPU 2 does frame 4, and so forth.
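The alternating hand-out described in this post can be sketched in a few lines of Python. This is only an illustration of the round-robin idea; the function name and the dictionary layout are made up for the example and do not reflect how any real driver is written.

```python
def assign_frames_afr(num_frames, num_gpus=2):
    """Hand out frames to GPUs in turn, as in alternate frame rendering.

    Hypothetical helper for illustration only: returns a dict mapping
    GPU number (starting at 1) to the list of frame numbers it renders.
    """
    schedule = {gpu: [] for gpu in range(1, num_gpus + 1)}
    for frame in range(1, num_frames + 1):
        # Frame 1 -> GPU 1, frame 2 -> GPU 2, frame 3 -> GPU 1, ...
        gpu = (frame - 1) % num_gpus + 1
        schedule[gpu].append(frame)
    return schedule


if __name__ == "__main__":
    print(assign_frames_afr(6))  # {1: [1, 3, 5], 2: [2, 4, 6]}
```

With two GPUs, each one ends up with every other frame, which is the pattern the post relies on.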
So each GPU will need different data from video RAM to complete each frame, as every frame will be different and a smart developer will program the game to render only what can be seen, not the entire level every time, for performance. Therefore an SLI / Crossfire system can read and write twice the amount of video RAM in the same time as a single GPU, and therefore has twice the bandwidth.
For whatever reason the guys seem to think the GPUs load ALL the data in VRAM every frame and at the same time, so therefore the amount that can be read and written to VRAM stays the same.
I'm saying each GPU will only read and write the data required to complete the current frame (not all the data stored in VRAM, just what's required to complete the current frame) and they will do this independently of each other, so therefore twice the amount can effectively be read and written to VRAM. (I realise there will be varying amounts of overhead depending on how much inter-GPU communication there has to be.)
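The arithmetic behind this claim can be written out as a rough Python sketch. It only models the argument as stated in this post: each GPU reads from its own local VRAM, memory traffic is the only limit, and some fraction of bandwidth is lost to inter-GPU communication. The function name, the 192 GB/s figure, the 2 GB read per frame and the 10% overhead are all hypothetical values chosen for illustration, not measurements.

```python
def max_fps_bandwidth_bound(bandwidth_gb_s, gb_read_per_frame,
                            num_gpus=1, overhead=0.0):
    """Upper bound on fps if memory traffic were the only limit.

    Assumptions (illustrative only):
      - every GPU has its own VRAM with `bandwidth_gb_s` of bandwidth
      - each frame needs `gb_read_per_frame` of reads/writes on the GPU
        that renders it
      - `overhead` is the fraction of bandwidth lost to inter-GPU sync
    """
    per_gpu_fps = bandwidth_gb_s * (1.0 - overhead) / gb_read_per_frame
    # GPUs work on different frames at the same time, so rates add up.
    return per_gpu_fps * num_gpus


if __name__ == "__main__":
    single = max_fps_bandwidth_bound(192.0, 2.0)
    dual = max_fps_bandwidth_bound(192.0, 2.0, num_gpus=2)
    dual_sync = max_fps_bandwidth_bound(192.0, 2.0, num_gpus=2, overhead=0.1)
    print(single, dual, dual_sync)
```

Under these assumptions the two-card figure is double the single-card one, minus whatever the overhead term takes away. Whether real titles behave this way is exactly what the thread is arguing about.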
I know how it works, you clearly don't.
A simple screen-sized rectangle is filled by the GPU(s) millions of times a second with tiny, simple FP16 textures from video RAM, and is moved every frame to avoid driver-level optimisations.
Geez, I wonder what that tests?
Likely that is because that is exactly how Nvidia describes it. I even cited the official source for that. Apparently you can't read or you didn't read it, and you cannot admit you're wrong. 'You don't know how it works, clearly you don't'. I'm sure you'll try another round of BS that doesn't apply to SLI.
No you haven't. You provided a doc explaining how writing to dynamic assets can introduce a performance penalty, and the same doc then details best practices to avoid that penalty.
Have you posted something else I've missed?
You seem to have missed a lot...
Like what? Point me to one provided bit of evidence that confirms what the guys are saying.
Ha, the irony...
There is no irony in this thread? Do you know what it means?
OK, I would love to hear the explanation from you guys as to why a 4870 X2 manages nearly twice the fillrate of a single 4870 whilst only having the same bandwidth as the single card.
Ha, I wouldn't expect you to know what it is, just like everything else.
Save yourself the embarrassment and stop posting, this is getting sad.
More on topic:
Yeah, a 580 would improve your system as long as your CPU is clocked to 4GHz, and it's probably the last upgrade you can reasonably do before your CPU becomes a bottleneck. (You may even check for a used 480 SOC or the like, as it would do just as well for a lower price.)
You can expect above 8K in 3DMark 11 with that setup. Take it for what it's worth, but it could give you an idea of the upgrade over your current graphics card and whether it's worth it.
Now back to the fight!!! (Who wants some popcorn??)
Ironically, the only thing ironic in this thread is you posting to say there was irony when there wasn't........
So how about an answer to my previous question from the self-proclaimed grammatical genius?
How does 580 SLI score double that of a single 580 in 3DMark's colour fill test in the link below:
Bearing in mind the author states:
How wider the bus is, the more bandwidth you have. Narrower bus will result is less bandwidth and bottlenecks depending on the data throughput. A buffer also hold the data then release as soon as the next component can handle it. This is a fact and rule to remember.
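The relationship between bus width and bandwidth in that quote can be put into a short Python sketch. The formula is the usual one (bytes per transfer times transfers per second). The function name is made up for the example, and the 384-bit / 4008 MT/s inputs are assumed figures in the region commonly quoted for a GTX 580, used only for illustration.

```python
def memory_bandwidth_gb_s(bus_width_bits, effective_rate_mt_s):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_width_bits      : width of the memory bus in bits
    effective_rate_mt_s : effective transfer rate in millions of
                          transfers per second
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_rate_mt_s * 1e6 / 1e9


if __name__ == "__main__":
    # Assumed example inputs, not taken from the thread.
    wide = memory_bandwidth_gb_s(384, 4008)
    narrow = memory_bandwidth_gb_s(256, 4008)
    print(round(wide, 1), round(narrow, 1))
```

At the same transfer rate, the wider bus gives proportionally more bandwidth, which is all the quoted rule says. It is a per-card figure and does not by itself settle how two cards combine.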