Why is it that ATi's new 3870 series completely whips Nvidia in specifications, but just can't seem to beat Nvidia's 8800 Ultra? Is it a driver issue with ATi, or is it their hardware design? Nvidia has 128 unified processors; ATi has 320. Nvidia's core clock is 612MHz; ATi's is in the 700s. Nvidia's memory clock is 2000MHz; ATi's is 2400MHz. Why isn't ATi winning with these cards? They should be.
It's a little of both. Another issue is that AMD/ATI still lags behind Nvidia in developer relations; Nvidia simply has better DevRel. Don't let the LISTED specs fool you. The two architectures are vastly different. The main core clock of Nvidia's cards may be lower than that of their AMD/ATI counterparts, but their shader clocks are much, much higher. The current AMD/ATI architecture is also far more sensitive to shader instruction dependencies than Nvidia's simpler scalar architecture. This means that AMD/ATI cards require more optimization in their drivers, compilers, and even the shaders themselves to perform near peak. The fact that they're also behind their Nvidia counterparts in texturing/filtering performance doesn't help matters much.
The first thing to note is that, in theory, a Radeon HD 3870 cannot outperform a GeForce 8800 Ultra. It can, however, compete with the GTX cards in the best case. Now, as to why the performance of the HD 2900 and HD 3800 cards is lacking:

1) VLIW (Very Long Instruction Word) stream processor architecture. ATI's design has 64 shader processors that can each perform up to 5 operations per clock, which gives the advertised total of 320 "stream processors". NVIDIA's G80 has a scalar design with 128 processors which do exactly as advertised in most cases (= 128 operations, 256 in some very specific cases). If a shader is not well optimized and does not map well to the Radeon architecture, the 5 sub-units within each shader processor are not fully utilized and performance suffers. The driver's shader compiler therefore has to work very hard to re-order and re-arrange the shader instructions so that all 5 sub-units (or at least as many as possible) are kept busy (see the packing sketch at the end of this post).

2) Texture operations. The Radeon HD 3800 and 2900 lack ROP power significantly compared to their NVIDIA counterparts. This hurts them in situations that are fillrate limited or texture bound. I have a feeling this also hurts them in scenarios where lots of AA is needed, but I cannot confirm this.

3) The R6xx family lacks hardware AA resolve, which its NVIDIA counterparts have. This functionality is instead done on the shader processors, which costs performance. This, combined with the texture operations deficiency, is probably the biggest problem of the entire R6xx family.

4) NVIDIA, having brought out DX10-compliant hardware earlier and having a competent developer relations team, has the benefit of most developers optimizing for its hardware. With proper optimization of shader code, a Radeon HD 3870 can perform on par with a GeForce 8800 GTX in shader-limited scenarios. However, this just isn't happening. Better drivers with a better shader compiler are the only way one can hope to see improvements. Hopefully improvements will be coming...
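To make the VLIW packing point concrete, here is a toy sketch of how a greedy scheduler might fill 5-wide bundles. The instruction streams, the dependency model, and the scheduler itself are illustrative assumptions only; the real driver compiler is far more sophisticated.

```python
# Toy illustration of the VLIW packing problem described above.
# Each R6xx shader processor is treated as a 5-slot bundle; the "compiler"
# below greedily packs independent scalar ops into bundles. The instruction
# stream and dependency model are made up for illustration.

def pack_vliw(ops, slots=5):
    """Greedily pack a list of (op, depends_on) pairs into VLIW bundles.

    An op that depends on a result produced in the current bundle must wait
    for the next bundle, which is what leaves ALU slots empty in practice.
    """
    bundles = []
    current, produced = [], set()
    for name, deps in ops:
        # Start a new bundle if this op needs a value computed in the
        # current bundle, or if the current bundle is already full.
        if len(current) == slots or any(d in produced for d in deps):
            bundles.append(current)
            current, produced = [], set()
        current.append(name)
        produced.add(name)
    if current:
        bundles.append(current)
    return bundles

# A dependent chain packs badly (1 op per bundle -> 20% utilization),
# while fully independent ops pack perfectly (5 ops per bundle -> 100%).
dependent = [("a", []), ("b", ["a"]), ("c", ["b"]), ("d", ["c"]), ("e", ["d"])]
independent = [("a", []), ("b", []), ("c", []), ("d", []), ("e", [])]

for label, stream in [("dependent", dependent), ("independent", independent)]:
    bundles = pack_vliw(stream)
    used = sum(len(b) for b in bundles)
    util = used / (len(bundles) * 5)
    print(f"{label}: {len(bundles)} bundles, {util:.0%} of ALU slots used")
```

Same five operations in both cases, but the dependent chain needs five bundles instead of one, which is roughly the effect a poorly-mapping shader has on the R6xx ALUs.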
Yeah, with better drivers... I doubt it. I don't know, but the drivers ATI has released aren't doing their job. This last one was supposed to bring a 15% improvement in DX10 Crysis... well, it doesn't, and 35% in DX9 Lost Planet... it doesn't, and those two improvements were supposed to apply to all Radeon graphics cards. I feel sorry for them, considering the drivers for the X19xx were outstanding and those cards performed at full potential when they appeared. And I doubt they will even bother to improve performance on the HD 29xx and HD 38xx. I just want their next release, the HD 4xxx, to bring what these two series didn't.
I think it's the architecture's fault. It's the same way the X1600 series had only 4 pixel pipelines but 3 shader ALUs per pipeline, making it count as 12, while nVidia had 12 physical pixel pipelines. That meant in some games ATI could only utilize 4 while nVidia could utilize all 12. I think the same thing is happening now: there are only 64 shader units, each with 5 ALUs, so they get counted as 320. So there aren't 320 physical units, just 64 units with 5 ALUs each, treated like 320. Like AcceleratorX said, I think ATI should halve the number of ALUs and double the number of shader units. Then it would kick ass.
As such, performance improvements are indeed coming with new drivers; it's just not as much as expected. I still have hope ATI will be able to squeeze a little more out of newer drivers, but let's see.
The HD 3xxx was a die-shrink generation, and as such it was surprisingly good. Let's see if the HD 4xxx brings some other changes as well.
The HD 3800 series is DirectX 10.1 compliant all the way, so driver updates might indeed bring out better performance: the new standard should let developers optimize for either vendor's cards in much the same way, since both will be strapped into a pretty strict feature format.
Don't forget that nVidia's shaders run at roughly 1200-1600MHz, whereas ATI's shaders share the 775MHz core clock. That alone makes nVidia's per-ALU shader power pretty high.
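For reference, here is some rough peak-throughput arithmetic using the clocks discussed in this thread. The 2 flops per clock (one MADD) assumption and the 70% slot-utilization figure are illustrative, not measured numbers.

```python
# Back-of-the-envelope peak shader throughput using the clocks quoted in
# this thread. Assumes 2 flops per ALU per clock (one MADD); G80 can
# sometimes dual-issue an extra MUL, which is ignored here, and none of
# these peaks are reachable in real workloads.

def peak_gflops(alus: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    return alus * clock_mhz * flops_per_clock / 1000.0

# GeForce 8800 Ultra: 128 scalar ALUs at a 1512MHz shader clock.
# Radeon HD 3870: 320 ALUs (64 x 5-wide VLIW) at the 775MHz core clock.
ultra = peak_gflops(128, 1512)
hd3870 = peak_gflops(320, 775)
print(f"8800 Ultra peak: {ultra:.0f} GFLOPS")
print(f"HD 3870 peak:    {hd3870:.0f} GFLOPS")

# If the compiler only keeps ~70% of the 5 VLIW slots busy on average
# (an assumed figure), the effective HD 3870 number drops below the Ultra's:
print(f"HD 3870 at 70% slot utilization: {hd3870 * 0.7:.0f} GFLOPS")
```

On paper the HD 3870 comes out ahead (roughly 496 vs 387 GFLOPS with these assumptions), which is exactly why the listed specs are misleading once real-world slot utilization is factored in.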
Let's not look that far ahead, but rather work with what we have now. ATI needs to work on the current tech; they have no business jumping ahead without making what we have now work. If they don't optimize the 900 bucks I dropped on these two cards, I am never using anything from ATI/AMD again. I am angry at their trend of releasing incomplete products. I really regret having a mobo that only supports CrossFire, too. I currently feel like I've been burned, with the exception that this Intel processor and board are just awesome.
I wonder if this is something we could change in BIOS, or if it would require a separate clock crystal for that. hmmmm..... I have not been able to make much sense of the BIOS files as most of them are garbled. If anyone knows what could be used to read what's in the files, let me know. I would definitely appreciate that!
You disproved your own theory there, in 2) with 3). Since the R600 lacks hardware resolve anti-aliasing in the ROPs and does AA using the shader ALUs instead, there shouldn't be any ROP bottlenecking at higher anti-aliasing levels, because the ROP hardware isn't being used to do the AA.

Unfortunately you're only taking the basic specs at face value there. The shaders on the nVidia cards are clocked independently of the core (or at least as a multiple of it), so they actually run a lot higher than the core frequency. 8800 cards range from about 1100-1600MHz on the shader clock depending on the card, and if the 9800GTX rumour is right, its shader clock sits at about 2150MHz. In short, the ATi cards' problem is their hardware design, which is hard to fully utilise. Look up the R600 architecture on Google and have a read.

I have some experience modifying the BIOS file to set different clocks and voltages. Unfortunately what you speak of is a bit of an impossibility. Realistically the shader cores can only handle around 800MHz on stock volts and will never really reach 1300MHz like the 8800s. It's just a different architecture that can't handle those sorts of clocks, but instead has a lot more workers to do the job. The hex values you see in the BIOS file represent characters, and even with the file translated to characters it's still rather unreadable, especially since the BIOS files are in little-endian format, meaning a lot of the useful numbers you'd be looking for are backwards, or in some hex editors, mis-translated to characters. My advice is to stay clear, and trust me: if it were possible to ramp the shader clocks up to 1300MHz easily, it would have been done a long time ago.
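On the little-endian point, here is a minimal sketch of pulling raw values out of a BIOS dump with Python's struct module. The fake dump, the offset, and the field width are made up for illustration; the real AtomBIOS image is table-driven and has to be walked from its headers, not read at fixed offsets.

```python
# Minimal sketch of decoding little-endian values from a raw video BIOS
# dump. The in-memory "dump" and the 0x94 offset are invented for
# illustration; on a real card you would read the actual ROM file instead.
import struct

# Stand-in for: data = open("bios_dump.rom", "rb").read()
data = bytes(0x94) + struct.pack("<I", 0x12345678) + bytes(8)

# "<" = little-endian, "H" = unsigned 16-bit, "I" = unsigned 32-bit.
value = struct.unpack_from("<I", data, 0x94)[0]
print(f"raw bytes {data[0x94:0x98].hex()} decode (little-endian) to {value:#010x}")

# This is why a plain hex editor looks "backwards": the byte pair 0x07 0x03
# stored in the file reads as 0x0307 = 775 when interpreted as little-endian.
print(struct.unpack("<H", bytes([0x07, 0x03]))[0])   # prints 775
```

The first print shows the bytes 78 56 34 12 on disk decoding back to 0x12345678, which is exactly the byte reversal that makes the dump hard to eyeball in a hex editor.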
No, just something to read it with. Right now I have nothing that can effectively read it. I have tried a few things, but nothing with any real success. If I were to change something without knowing what it was, I could very well turn my cards into excellent paperweight conversation pieces. I'm not down with that.
lol, true! Thus the logic behind my not screwing with what I cannot understand. Yes, I have also messed with the BIOS files through an editor, but that is the full extent of it. hehe. Is it okay to say I am not satisfied with that? :nerd: What I want to do is find out who created this editor, so I can find out how they knew what to do and what this code is. I am not easily discouraged, but through all of this I am never at a loss for better judgement. Expensive cards that work are a good thing; expensive cards that don't work because an idiot screwed up the BIOS are a crime.

Your analogy pertaining to the shaders is a good one, and it is true that the ATI shader architecture does not necessarily need to run at a higher speed than the rest of the chip to do the same thing... "with many more workers to do the job". Either way, I am not really concerned with the inefficiencies of the shader control, but rather with the inferior load sharing, power delivery, and thermal management of the card external to the GPUs. It's the equivalent of putting $3000 worth of wheels on a $200 1979 Oldsmobile. The voltage regulation circuitry, BIOS, and cooling are absolutely terrible, and if done properly they could have really served to compensate for, or even hide, this software inefficiency with better speeds and stability. Have you taken a 3870 X2 apart yet? OMG, what a joke! I am not giving up on these cards just yet though. My second 3870 X2 arrives tomorrow to replace the one I had with a bad fan, and I plan to do a lot of tinkering.
You can NOT get any R(V)6xx card to run or report independent shader clocks. That's a hardware design decision (ATI chose not to go down that route), and it also takes more transistors to clock the shader core higher than the rest of the GPU. I would rather have everything run at the same speed, but that's just my old orthodox view.