ATI (the graphics division of AMD) first released its new RV770 graphics processor back in June with the introduction of the Radeon HD 4850 and Radeon HD 4870. Both of these are excellent products—they’re some of the best graphics cards we’ve seen in a long time, and deliver so much bang-for-the-buck that Nvidia has had to drastically reduce prices on its products to remain competitive, a move which Wall Street didn’t like very much.
Whatever you may say about the Radeon 4800 series, this much is true: ATI was competing in the mid-range and performance graphics segments, with no product for the truly high end (defined as “$400 and up”). At the time of the 4800 series launch, ATI laid out its new strategy: no more really big GPUs designed only to be used in those high-end products. Instead, it would put two modestly sized GPUs on a single card, similar to the Radeon HD 3870 X2, to address that market.
There’s nothing inherently good or bad about this strategy. It is a series of tradeoffs like any other. ATI has been working hard on its multi-GPU scaling and compatibility, with nice improvements in both areas delivered in drivers earlier this year. Now that the Radeon HD 4870 X2 is finally here, it’s time to put ATI’s new strategy to the test. Are two RV770 GPUs on a single card the best way to cater to the high-end market?
Speeds and Feeds, Dual-GPU Tradeoffs
Here’s a look at some of the specs of current high-end graphics cards. These are Nvidia’s top two cards, and the top two from ATI.
|Specification||GeForce GTX 280||GeForce GTX 260||Radeon HD 4870 X2||Radeon HD 4870|
|GPU||GT 200||GT 200||RV770 x2||RV770|
|Transistor Count||1.4 B||1.4 B||1.9 B||956 M|
|Core Clock||602 MHz||576 MHz||750 MHz||750 MHz|
|Stream Processor Clock||1.29 GHz||1.24 GHz||750 MHz||750 MHz|
|Memory Clock||2.2 GHz DDR||2.0 GHz DDR||3.6 GHz DDR||3.6 GHz DDR|
|Render back end (ROPs)||32||28||32||16|
|Frame Buffer||1024 MB||896 MB||2 GB||512 MB|
|Memory Interface||512 bits||448 bits||2x 256 bits||256 bits|
|Memory Bandwidth||141.7 GB/sec||111.9 GB/sec||230.4 GB/sec||115.2 GB/sec|
Those are some mighty impressive numbers in the 4870 X2 column, but that’s just because it’s really two 4870s crammed onto a single board. In fact, it’s two 1GB 4870s, for a grand total of 2 gigs of graphics memory. Note ATI’s aggressive pricing: The 4870 X2 is more expensive than the GeForce GTX 280, but it’s less than twice the price of a Radeon HD 4870, despite having four times the GDDR5 memory.
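The memory bandwidth row in the table can be roughly checked from the two rows above it. The sketch below assumes the usual peak-bandwidth arithmetic (interface width in bytes multiplied by the effective memory clock); the function name is ours, and the inputs are the rounded figures from the table.

```python
def memory_bandwidth_gb_s(bus_width_bits, effective_clock_ghz, gpus=1):
    """Peak bandwidth in GB/sec: bytes moved per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_ghz * gpus


# (interface width in bits, effective memory clock in GHz, number of GPUs),
# all taken from the spec table above.
cards = {
    "GeForce GTX 280": (512, 2.2, 1),
    "GeForce GTX 260": (448, 2.0, 1),
    "Radeon HD 4870 X2": (256, 3.6, 2),
    "Radeon HD 4870": (256, 3.6, 1),
}

for name, (width, clock, gpus) in cards.items():
    print(f"{name}: {memory_bandwidth_gb_s(width, clock, gpus):.1f} GB/sec")
```

The Radeon results match the table. The GeForce results land within about 1 GB/sec of it, most likely because the memory clocks listed in the table are rounded. Keep in mind that the X2 figure is the sum of two separate 256-bit interfaces, not one pool that either GPU can draw on.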
Addressing the high end of the market with a dual-GPU board carries with it a series of tradeoffs, positive and negative, that are inherent to the process in today’s graphics landscape. Here’s a short list of pros and cons of the “two GPUs on a stick” approach:
Pros:
- It’s cheaper to put two small GPUs on a card than one big one. Defects increase at a non-linear rate when chip size increases, and economies of scale kick in when you can address a bigger range of products with the same chip.
- It’s easier to cool two smaller chips than one big one. Heat production is spread out to two areas.
- Whenever your driver team makes one of your single-GPU cards go faster, odds are that it’ll make the dual-GPU card faster. Work to improve dual-GPU scaling also pays off. In short, there are more avenues of attack for software optimization.
- Two chips with their own memory controllers are an economical way to achieve really high memory bandwidth.
- You get the performance of two GPUs, but don’t need a CrossFire/SLI capable motherboard. Just one PCIe x16 slot is all you need.
Cons:
- Not every game scales well to multiple GPUs, so in some titles you pay a lot more but don’t get much more performance.
- Twice as much memory is needed on the card because most graphics data has to be duplicated in the memory attached to each GPU.
- Multi-GPU rendering doesn’t work in a window, so players who prefer windowed mode get the performance of a single GPU even though they paid for two.
- Multi-monitor setups don’t like multi-GPU rendering, either. ATI manages this better than Nvidia by shutting off the secondary display and running dual-GPU on the primary, but it’s still less than ideal, especially if you want to run the game on both monitors at once.
- Dual-GPU cards tend to have higher power consumption, both at idle and under load.
- Dual-GPU cards are almost always long (10.5 inches for the 4870 X2) and won’t fit in smaller PC cases.
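The claim in the list above that defects grow non-linearly with chip size can be illustrated with a simple Poisson yield model, a common textbook approximation rather than anything ATI has described. The die areas, defect density, and cost per square millimeter below are hypothetical values chosen only to show the shape of the effect.

```python
import math


def poisson_yield(die_area_mm2, defects_per_mm2):
    """Fraction of dies expected to have zero defects under a Poisson model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


def cost_per_good_die(die_area_mm2, defects_per_mm2, cost_per_mm2=1.0):
    """Silicon cost of one working die, counting the dies that must be scrapped."""
    raw_cost = die_area_mm2 * cost_per_mm2
    return raw_cost / poisson_yield(die_area_mm2, defects_per_mm2)


# Hypothetical numbers: a big die exactly twice the area of a small one.
SMALL_DIE_MM2 = 250.0
BIG_DIE_MM2 = 500.0
DEFECTS_PER_MM2 = 0.002

two_small = 2 * cost_per_good_die(SMALL_DIE_MM2, DEFECTS_PER_MM2)
one_big = cost_per_good_die(BIG_DIE_MM2, DEFECTS_PER_MM2)

print(f"Yield, small die: {poisson_yield(SMALL_DIE_MM2, DEFECTS_PER_MM2):.1%}")
print(f"Yield, big die:   {poisson_yield(BIG_DIE_MM2, DEFECTS_PER_MM2):.1%}")
print(f"Big die costs {one_big / two_small:.2f}x as much as two small dies")
```

Under these assumptions the same total silicon area costs noticeably more as one big chip than as two small ones. The model ignores packaging, board complexity, the second set of memory chips, and the ability to sell partly defective dies as lower-end parts, so treat it as a sketch of the trend and not as a cost estimate.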
The above tradeoffs are true for both Nvidia and ATI dual-GPU products, and they’re true of two-card solutions, too. The RV770 GPU includes a “CrossFire X Sideport” that makes for more efficient communication between GPUs, and which ATI says should help improve scaling from one GPU to two. But neither ATI nor Nvidia has really come up with solutions to the common dual-GPU problems.
If “two GPUs on a stick” is going to be the way ATI addresses the high end of the market from now on (and the company confirms that is the case), it needs to work hard on three major areas in addition to the ongoing battle to improve performance scaling:
- Make multi-GPU rendering work just as well as a single GPU in windowed mode and on multi-monitor setups.
- Allow each GPU to read from the other’s memory efficiently, so graphics data doesn’t have to be duplicated in each memory bank and twice as much memory isn’t needed on a dual-GPU card.
- Reduce power consumption, especially at idle, to be in line with single large GPUs.
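To put a number on the memory duplication problem, here is a rough model of how much unique graphics data fits on a dual-GPU card. It assumes each GPU must hold a full copy of the duplicated fraction of the data plus an even share of the rest; the function name and that even split are our assumptions, not a description of how ATI’s drivers actually manage memory.

```python
def max_working_set_mb(total_mb, gpus, duplicated_fraction):
    """Largest data set (in MB) that fits when part of it is copied to every GPU."""
    if not 0.0 <= duplicated_fraction <= 1.0:
        raise ValueError("duplicated_fraction must be between 0 and 1")
    per_gpu_mb = total_mb / gpus
    # Share of the data set each GPU stores: all of the duplicated part,
    # plus 1/gpus of the part that could be split between them.
    share_per_gpu = duplicated_fraction + (1.0 - duplicated_fraction) / gpus
    return per_gpu_mb / share_per_gpu


# The 4870 X2 carries 2 GB in total, 1 GB attached to each GPU.
for fraction in (1.0, 0.5, 0.0):
    usable = max_working_set_mb(2048, 2, fraction)
    print(f"{fraction:.0%} duplicated: about {usable:.0f} MB of unique data")
```

With everything duplicated, the 2 GB card behaves like a 1 GB card. Only if the GPUs could share data freely would the full 2 GB be usable, which is why efficient access to the other GPU’s memory matters so much.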
Now that we know what the Radeon HD 4870 X2 is and what it can and can’t do, it’s time to see where the rubber meets the road: performance.
We’ll compare the Radeon HD 4870 X2 with Nvidia’s top-of-the-line graphics card, the GeForce GTX 280, as well as with a single Radeon HD 4870 to determine scaling from one GPU to two. Since the clock speeds are the same as those of the single-GPU card, we’ll get a very good idea of how well ATI’s multi-GPU scaling work has paid off. Note that the GTX 280 has been reduced in price greatly since its launch, down $200 from $650 to $450. That should make the price/performance comparison quite interesting.
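For readers who want to run the same comparison on their own numbers, scaling and price/performance come down to a few divisions. The frame rates in this sketch are hypothetical placeholders, not benchmark results. The $450 price is the GTX 280 figure quoted above, and $550 for the 4870 X2 is inferred from the price points mentioned elsewhere in this review.

```python
def scaling_factor(single_gpu_fps, dual_gpu_fps):
    """How many times faster the dual-GPU card is than the single-GPU card."""
    return dual_gpu_fps / single_gpu_fps


def scaling_efficiency(single_gpu_fps, dual_gpu_fps, gpus=2):
    """Share of the ideal gain from the extra GPUs that was actually achieved."""
    return (scaling_factor(single_gpu_fps, dual_gpu_fps) - 1.0) / (gpus - 1)


def fps_per_dollar(fps, price_usd):
    """Simple price/performance metric: frames per second per dollar spent."""
    return fps / price_usd


# Hypothetical frame rates, for illustration only.
hd4870_fps = 40.0
hd4870_x2_fps = 70.0
gtx280_fps = 50.0

print(f"Scaling factor: {scaling_factor(hd4870_fps, hd4870_x2_fps):.2f}x")
print(f"Scaling efficiency: {scaling_efficiency(hd4870_fps, hd4870_x2_fps):.0%}")
print(f"4870 X2: {fps_per_dollar(hd4870_x2_fps, 550):.3f} fps per dollar")
print(f"GTX 280: {fps_per_dollar(gtx280_fps, 450):.3f} fps per dollar")
```

A scaling factor of 2.0x (100% efficiency) would mean the second GPU doubles performance. A factor near 1.0x means the game is effectively running on one GPU.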
We’ll be testing the EAH4870X2 from ASUS, which basically adheres to the ATI reference design for the Radeon HD 4870 X2.
|Processor||Intel Core 2 Extreme QX9650 (3 GHz)|
|Motherboard and chipset||ASUS P5E3 Premium (Intel X38 chipset)|
|Memory||2 x 1GB DDR3 1333MHz|
|Hard drive||Seagate 7200.10 160GB SATA Drive|
|Optical drive||ATAPI DVD-ROM Drive|
|Audio||Integrated HD Audio|
|Operating system||Windows Vista Ultimate with SP1|
We used ForceWare 177.41 drivers from Nvidia to test the GeForce GTX 280 card, and Catalyst 8.7 to test the Radeon HD 4870. The 4870 X2 isn’t supported by any public driver just yet, so we used prerelease drivers supplied by ATI. We’re primarily concerned with really high resolutions here—you don’t buy a $450 or $550 graphics card to run at 1440×900.
Company of Heroes—Relic’s superb World War II RTS gets a refresh with the release of the Opposing Fronts expansion. The benchmark hasn’t really changed, however, and consists of an in-engine cut scene. We ran tests using only the DX10 mode.
Supreme Commander—This DX9-only RTS offers up huge explosions, hundreds of units and large maps. The benchmark features the game actually playing itself as three factions vie for control of a large island.
World in Conflict—Massive’s new RTS offers up some of the best graphics of any RTS game, including some incredible explosion effects. The benchmark features two factions taking on each other for the control of a small town.
Enemy Territory: Quake Wars—The latest multiplayer frenzy from id and Splash Damage features humans and Strogg vying for control of Earth, and is rare among popular PC games in that it uses OpenGL. We use a custom recorded timedemo for our benchmark.
Unreal Tournament 3—This latest update to Unreal Tournament features better single player, but the action is focused on multiplayer. We use a botmatch based on the Serenity map, which runs the new Warfare game type to test performance.
Crysis. Crytek’s new first-person shooter is one of the most strenuous games around. We’ll test in DX10 mode with the 1.2 patch installed, using the island flyby benchmark.