NVIDIA Announces 24GB Quadro M6000
by Ryan Smith on March 22, 2016 9:00 AM EST
With NVIDIA currently between GPU generations, things have been relatively quiet on the professional graphics front for the company. On the high end, NVIDIA released the Quadro M6000 back in 2015, bringing their fully enabled GM200 GPU into the professional market. Now, just over a year later, they are giving the Quadro a refresh with a newer, higher capacity model.
NVIDIA Quadro Specification Comparison
| M6000 (24GB) | M6000 (12GB) | K6000 | 6000
CUDA Cores | 3072 | 3072 | 2880 | 448
Texture Units | 192 | 192 | 240 | 56
ROPs | 96 | 96 | 48 | 48
Core Clock | N/A | N/A | 900MHz | 574MHz
Boost Clock | ~1140MHz | ~1140MHz | N/A | N/A
Memory Clock | 6.6Gbps GDDR5 | 6.6Gbps GDDR5 | 6Gbps GDDR5 | 3Gbps GDDR5
Memory Bus Width | 384-bit | 384-bit | 384-bit | 384-bit
VRAM | 24GB | 12GB | 12GB | 6GB
FP64 | 1/32 FP32 | 1/32 FP32 | 1/3 FP32 | 1/2 FP32
TDP | 250W | 250W | 225W | 204W
GPU | GM200 | GM200 | GK110 | GF110
Architecture | Maxwell 2 | Maxwell 2 | Kepler | Fermi
Transistor Count | 8B | 8B | 7.1B | 3B
Manufacturing Process | TSMC 28nm | TSMC 28nm | TSMC 28nm | TSMC 40nm
Launch Date | 03/22/2016 | 03/19/2015 | 07/23/2013 | N/A
Launch Price (MSRP) | $5000 | $5000 | $5000 | $5000
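As a rough illustration of what the memory rows in the table imply, peak theoretical memory bandwidth can be estimated from the per-pin data rate and the bus width. The sketch below is not from NVIDIA's materials; the function name `peak_bandwidth_gb_s` is made up for this example, and the result is a theoretical ceiling rather than a measured figure.

```python
def peak_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    data_rate_gbps: per-pin data rate in gigabits per second (the table's "Memory Clock")
    bus_width_bits: memory bus width in bits (the table's "Memory Bus Width")
    """
    # Every pin moves data_rate_gbps gigabits per second; divide by 8 for bytes.
    return data_rate_gbps * bus_width_bits / 8


# Values taken from the specification table above.
cards = {
    "M6000 (24GB)": (6.6, 384),
    "M6000 (12GB)": (6.6, 384),
    "K6000": (6.0, 384),
    "6000": (3.0, 384),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, width):.1f} GB/s")
```

Since both M6000 models share the same data rate and bus width, this estimate comes out identical for them. On these numbers, the refresh changes capacity and leaves bandwidth alone.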
When the original Quadro M6000 was launched, NVIDIA outfitted it with 12GB of VRAM in a 24x4Gb configuration, a large amount of memory for the time but not the full amount a GM200 card could be equipped with. Now this week the company is giving the card a mid-cycle upgrade by increasing its VRAM capacity, replacing the 12GB model with a 24GB model utilizing higher density 8Gb GDDR5 memory chips.
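The capacity arithmetic here is easy to trip over because chip density is quoted in gigabits (Gb) while card capacity is quoted in gigabytes (GB). A minimal sketch of the conversion follows. The helper name `vram_capacity_gb` is illustrative, and the 24-chip count for the new card is an assumption that follows from reaching 24GB with 8Gb chips.

```python
def vram_capacity_gb(chip_count: int, chip_density_gbit: int) -> float:
    """Total VRAM in gigabytes for chip_count chips of chip_density_gbit gigabits each."""
    # 8 bits per byte, so gigabits / 8 = gigabytes.
    return chip_count * chip_density_gbit / 8


# Original M6000: 24 chips x 4Gb, as stated in the article.
original = vram_capacity_gb(chip_count=24, chip_density_gbit=4)

# Refreshed M6000: assuming the same 24-chip layout with 8Gb chips.
refreshed = vram_capacity_gb(chip_count=24, chip_density_gbit=8)

print(f"Original M6000:  {original:.0f} GB")
print(f"Refreshed M6000: {refreshed:.0f} GB")
```

Doubling the per-chip density while keeping the chip count fixed is what doubles the capacity. Under that assumption the board layout and bus width do not need to change.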
The target market for the 24GB M6000 is relatively straightforward: certain segments of the professional visualization market need all of the VRAM they can get, so for NVIDIA ecosystem users this should be a welcome upgrade. At the same time, since 8Gb GDDR5 has been on the market for some time now, I’m surprised it has taken NVIDIA this long to bring GM200 to its maximum 24GB capacity. Nonetheless, this does give NVIDIA bragging rights as the highest capacity professional graphics card – surpassing the 16GB FirePro W9100 – though it’s worth noting that AMD should have the capability to push that to 32GB if they want final bragging rights.
Meanwhile, NVIDIA’s press materials also briefly note that the updated Quadro M6000 ships with some new temperature & clockspeed management options – presumably via a newer firmware – though details are limited. The new M6000 features "More discrete GPU clock options for a better customer experience when running their application" and "Greater software temperature control to keep the GPU temperature below the hardware slowdown threshold for the best user experience." NVIDIA’s professional cards (Quadro & Tesla) feature more performance controls than we see on consumer cards (which just run as fast as they can), and from the description I expect that NVIDIA has put in some new, finer grained options to better control automatic throttling behavior by manually setting both the maximum clockspeed and temperature. For single card workstations this is rarely an issue, but for large arrays of cards (e.g. Quadro VCA), keeping all of the cards in lockstep with regard to performance is a desired feature.
Finally, since this is a mid-cycle refresh, the new 24GB Quadro M6000 will be launching this week. It will be a drop-in replacement in NVIDIA’s product stack, and will occupy the previous M6000’s spot at $5000.
16 Comments
carnachion - Tuesday, March 22, 2016
Where are the news Teslas!!
ImSpartacus - Tuesday, March 22, 2016
Apparently arriving for a while (in meaningful quantities), else this update wouldn't be necessary.
damianrobertjones - Tuesday, March 22, 2016
Does Tesla work for Anandtech?
nathanddrews - Tuesday, March 22, 2016
Love seeing GPUs with tons of RAM, even if I likely won't be using it anytime soon. The Sony quote... 10x performance boost... compared to what?
zoxo - Tuesday, March 22, 2016
If you are memory size limited, it's reasonable to assume a very large performance gain
yannigr2 - Tuesday, March 22, 2016
Looking at the Angry Birds movie trailer, I would say they are more "good scenario" limited.
Ian Cutress - Tuesday, March 22, 2016
If it's the difference between finding a memory element in VRAM compared to spinning out to disk/DRAM and doing a PCIe transfer while that warp is idle, then 10x is conservative. Ideally warps with information at hand would jump ahead, but you still end with some async kernel waiting on data at one point. Depending on how an algorithm is run.
I think the standard taught methodology with CUDA is that for every memory access you need 24-30 FLOPs per DRAM read/write access to get peak performance. If you have to go outside VRAM for that cache line, then the algorithm better iterate over it's own data to keep on spinning to maintain high perf.
nathanddrews - Tuesday, March 22, 2016
That makes sense. Assuming a 4K render path for movies (Sony has been pushing 4K for a while now), I can see them benefiting from the increase. Even in my own amateur experience with doing 4K effects in AE, it consumes all 32GB of my system memory in a heartbeat.
kefkiroth - Tuesday, March 22, 2016
This graphics card has more memory than some phones have storage.
Eden-K121D - Tuesday, March 22, 2016
I saw what you did there *cough* Apple *cough*