Building a workstation for deep learning is not an easy task and requires considerable know-how. Before getting into the details, our suggestion is to work with an experienced partner if you can. Getting the fans, the case, the placement of the components, and the cable management right is not easy. For the initial assembly we worked with Brentford AG in Switzerland. They are great and deliver really well-built AI workstations. Note that what we describe here is not a cheap deep learning workstation but a high-end one, just below server level. Let’s start with the graphics card.
NVIDIA RTX A6000
If you plan to use deep learning frameworks under Linux, your choice is basically limited to NVIDIA. The NVIDIA RTX A6000 is NVIDIA's most powerful workstation GPU, offering high-performance real-time ray tracing, AI-accelerated compute, and professional graphics rendering. Its NVIDIA Ampere architecture-based CUDA cores bring up to 2X the single-precision floating point (FP32) throughput of the previous generation. Some of the features that make this card so great are:
- Tensor Cores purpose-built for deep learning matrix arithmetic
- 48 GB of GDDR6 memory (needed for large datasets)
- Deep learning frameworks such as Caffe2, MXNet, CNTK, TensorFlow, and others deliver dramatically faster training times
- New Ampere Architecture
- 309.7 TFLOPS Tensor Performance
- 10752 CUDA cores
- New Tensor Float 32 (TF32) precision provides up to 5X the training throughput
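To put the core count in the list above into perspective, here is a rough back-of-the-envelope sketch of the theoretical peak FP32 throughput. It assumes each CUDA core performs one fused multiply-add (2 floating point operations) per clock cycle and a boost clock of about 1.8 GHz. The clock value is our assumption and is not taken from the list above, so treat the result as an estimate, not a measured number.

```python
# Rough estimate of theoretical peak FP32 throughput.
# CUDA_CORES comes from the feature list above.
# BOOST_CLOCK_GHZ is an ASSUMPTION (not stated in this post).
CUDA_CORES = 10752
BOOST_CLOCK_GHZ = 1.8          # assumed boost clock
FLOPS_PER_CORE_PER_CYCLE = 2   # one fused multiply-add = 2 operations


def peak_fp32_tflops(cores, clock_ghz, flops_per_cycle=FLOPS_PER_CORE_PER_CYCLE):
    """Return the theoretical peak FP32 throughput in TFLOPS."""
    gflops = cores * flops_per_cycle * clock_ghz  # cores * ops/cycle * Gcycles/s
    return gflops / 1000.0


if __name__ == "__main__":
    tflops = peak_fp32_tflops(CUDA_CORES, BOOST_CLOCK_GHZ)
    print(f"Estimated peak FP32 throughput: {tflops:.1f} TFLOPS")
```

Real training workloads will not reach this peak, since memory bandwidth and data loading usually limit throughput first.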
Note that this card is not cheap: it may cost on the order of 5000 USD. To drive a card with this much memory, the rest of the workstation must be powerful enough, and that makes the costs go up. It does not make sense to buy this card and use it in a slow system. At the time of writing, the card is still available in small numbers (1-2) from several providers here in Switzerland. In other countries this may be different. Note also that the card takes two PCIe slots, which needs to be considered when choosing the motherboard.
The NVIDIA RTX A6000 is like a luxury product. The packaging is beautiful.
Another important aspect is that this card supports PCI Express 4.0, which provides 2X the bandwidth of PCIe Gen 3.0 and makes transferring data from CPU to GPU much faster, provided the motherboard and CPU support it as well. This is something else to keep in mind when choosing the motherboard.
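The 2X figure can be checked with a small calculation. The sketch below uses the per-lane transfer rates from the PCIe specifications (8 GT/s for Gen 3, 16 GT/s for Gen 4, both with 128b/130b encoding) and assumes a x16 slot. The function and constant names are ours, and the numbers are theoretical maxima per direction, not measured transfer speeds.

```python
# Theoretical PCIe bandwidth per direction for a given generation.
# Transfer rates are in GT/s per lane; Gen 3 and Gen 4 both use 128b/130b encoding.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation, lanes=16):
    """Return the theoretical bandwidth in GB/s (one direction)."""
    gbit_per_lane = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY
    return gbit_per_lane * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    gen3 = pcie_bandwidth_gb_s(3)
    gen4 = pcie_bandwidth_gb_s(4)
    print(f"PCIe Gen 3 x16: {gen3:.2f} GB/s")
    print(f"PCIe Gen 4 x16: {gen4:.2f} GB/s")
    print(f"Ratio: {gen4 / gen3:.1f}x")
```

A link always runs at the highest generation that both ends support, so a Gen 4 card in a Gen 3 slot gets the Gen 3 number.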
Another very cool feature is the support for a new data type, Tensor Float 32 (TF32, something for another blog post), which can deliver much faster matrix multiplication in many cases. We have tested this and were able to reproduce an impressive 20% performance improvement in very simple cases. The latest TensorFlow release (2.5 at the time of writing) supports TF32 on the NVIDIA A6000 and A100 GPUs.
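To get a feeling for what TF32 gives up in exchange for speed, here is a minimal pure-Python sketch. TF32 keeps the 8 exponent bits of a regular float32 but only 10 of its 23 mantissa bits. The helper `truncate_to_tf32` is our own illustrative function, not part of any library, and it simply drops the extra bits; the rounding done by the actual hardware may differ. In TensorFlow 2.4 and later, TF32 itself is switched on or off with `tf.config.experimental.enable_tensor_float_32_execution(...)`.

```python
import struct


def truncate_to_tf32(x):
    """Illustrative helper (hypothetical, not a library function):
    keep the float32 sign and exponent, and only the top 10 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    bits &= 0xFFFFE000  # clear the 13 lowest mantissa bits (23 - 10 = 13)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 0.1, 3.14159265, 12345.678):
        as_f32 = struct.unpack("<f", struct.pack("<f", value))[0]
        as_tf32 = truncate_to_tf32(value)
        rel_err = abs(as_tf32 - as_f32) / abs(as_f32)
        print(f"{value!r}: float32={as_f32!r} tf32~{as_tf32!r} rel. error={rel_err:.2e}")
```

The relative error stays below 2^-10 (roughly three decimal digits), which is usually enough for training but is worth keeping in mind when comparing results with full FP32 runs.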
We already have one A6000 in our workstation, and two more will arrive after the summer. The goal is to have four of them by next year, depending on availability. The motherboard we chose is the ASUS WS X299 Sage/10G. Its most important feature for a deep learning workstation is support for four dual-slot graphics cards, which you need if your goal is to use four GPUs. This motherboard has lots of other great features, including support for up to 256 GB of memory. For our workstation we decided to go with 128 GB of DDR4 RAM at 3200 MHz, in four modules of 32 GB each.
The motherboard, in particular the PCIe slots for the graphics cards. Although it may seem that the CPU cooler gets in the way, its placement is perfect and the cards fit without problems.
You can see the CPU cooler (a big one) and, on the right, two RAM modules. The other two are on the other side of the CPU.
Case and Power Supply
The case matters for cooling, and a big one helps in getting good airflow. The one we use is the Thermaltake Core X71 Full Tower, black. It is a big case that we suggest NOT putting under a table (it fits under a table by just a couple of cm), since the top has a grille that is best left unobstructed for airflow.
As you can see, the case is quite big, but this is extremely important, especially with 4 GPUs in it, as the heat needs to be dissipated and airflow is fundamental.
The power supply is important, as each A6000 may reach 300 W at peak, so with four of them you need 1200 W just for the GPUs (a lot). You therefore need a 1500 W power supply for the system, to be sure that the graphics cards get enough power. The one we use is a 1500 W unit from BeQuiet.
The power supply has a lot of space and is placed at the bottom of the case, where there is room for additional HDs.
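The power budget behind the choice of a 1500 W supply can be sketched as a small calculation. It uses the 300 W peak per GPU mentioned above and the 165 W TDP of the CPU in this build. Note that TDP is a thermal rating and not a hard limit on power draw, and that drives, fans, RAM, and the motherboard are not counted, so the headroom shown is an optimistic upper bound.

```python
# Simple power budget for the workstation (values taken from this post).
GPU_PEAK_W = 300   # peak power per A6000
CPU_TDP_W = 165    # TDP of the Core i9-10980XE (thermal rating, not a hard cap)
PSU_W = 1500       # rated output of the power supply


def power_budget(num_gpus):
    """Return GPU total, known load (GPUs + CPU) and remaining headroom in watts."""
    gpu_total = num_gpus * GPU_PEAK_W
    known_load = gpu_total + CPU_TDP_W
    return {
        "gpu_total_w": gpu_total,
        "known_load_w": known_load,
        "headroom_w": PSU_W - known_load,  # left for drives, fans, RAM, board
    }


if __name__ == "__main__":
    for n in (1, 2, 4):
        b = power_budget(n)
        print(f"{n} GPU(s): GPUs {b['gpu_total_w']} W, "
              f"GPUs + CPU {b['known_load_w']} W, headroom {b['headroom_w']} W")
```

With four GPUs the known load already takes up most of the 1500 W, which is why a smaller power supply is not an option for this configuration.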
Now for the CPU: we settled on the Intel Core i9 10980XE, 3.0 – 4.6 GHz, 18 cores. Note that this CPU is quite powerful but also expensive, as it can cost around 1000 USD. It has 18 cores, 36 threads, and a thermal design power (TDP) rating of 165 watts. The Core i9-10980XE is built on Intel’s Cascade Lake microarchitecture (here dubbed “Cascade Lake-X”) with a 14-nanometer production process. It supports up to 256 GB of RAM and up to 48 PCI Express lanes!
List of Components
OK, enough talk: here is the list of components of our deep learning workstation, in case you find it useful. Some parts, such as the fans (we have a few in the case), are not included in the list.
– Intel Core i9 10980XE, 3.0 – 4.6 GHz, 18 Core
– ASUS WS X299 Sage/10G
– 128 GB DDR4-RAM, 3200 MHz, 4 x 32 GB
– 2 TB Samsung 970 EVO PLUS M.2 SSD, PCIe Gen 3
– 1 TB Samsung 980 M.2 SSD, PCIe Gen 3
– Thermaltake Core X71 Full Tower, black
– 1500 Watt power supply from BeQuiet
– NVIDIA RTX A6000, 48 GB memory