Introduction

The Ampere architecture on which the new NVIDIA RTX A6000 is based is incredibly fast. With 10,752 CUDA cores and 48 GB of memory, the card runs TensorFlow models extremely efficiently. In this post we look at how fast the GPU really is, using benchmarking code that you can run on your own machine to compare it with the A6000 (link to GitHub below). The system on which we tested the RTX A6000 has the following components:

  • Intel Core i9 10980XE, 3.0 – 4.6 GHz, 18 cores
  • ASUS WS X299 Sage/10G
  • 128 GB DDR4-RAM, 3200 MHz, 4 x 32 GB
  • 2 TB Samsung 970 EVO PLUS M.2 SSD, PCIe Gen 3
  • 1 TB Samsung 980 M.2 SSD, PCIe Gen 3
  • Thermaltake Core X71 Full Tower, black
  • 1500 W power supply from BeQuiet
  • NVIDIA RTX A6000 with 48 GB memory

The code resides on the main 2 TB SSD. For the benchmark we loaded two well-known large networks, VGG19 and ResNet50, initialised them with random weights and trained them for one epoch on the CIFAR-10 data. We used TensorFlow 2.5, CUDA 11.3 and NVIDIA driver 465.19.01 on Ubuntu 20.04.

To load the networks we used the following TF Keras code:

VGG19(include_top=True, weights=None, input_shape=(32, 32, 3), classes=y_train.shape[1])
ResNet50(include_top=True, weights=None, input_shape=(32, 32, 3), classes=y_train.shape[1])

Both constructors were imported with from tensorflow.keras.applications import VGG19, ResNet50.
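
For reference, here is a minimal sketch of the complete benchmark loop as described above: it loads CIFAR-10, one-hot encodes the labels (so that y_train.shape[1] equals 10), builds each network with random weights and times one epoch of training. The optimizer and batch size shown here are assumptions for illustration and not necessarily the settings used in the tfrabbit repository.

import time

from tensorflow.keras.applications import VGG19, ResNet50
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 and one-hot encode the labels (so y_train.shape[1] == 10)
(x_train, y_train), _ = cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
y_train = to_categorical(y_train, 10)

for name, constructor in [("VGG19", VGG19), ("ResNet50", ResNet50)]:
    # Random weights (weights=None) and the full classification head (include_top=True)
    model = constructor(include_top=True, weights=None,
                        input_shape=(32, 32, 3), classes=y_train.shape[1])
    # Optimizer and batch size are our assumptions, not the repository's settings
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    start = time.time()
    model.fit(x_train, y_train, batch_size=256, epochs=1, verbose=1)
    print(f"{name}: one epoch took {time.time() - start:.1f} s")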

All the code used can be found on GitHub and is free to download and use (at your own risk, your mileage may vary): https://github.com/toelt-llc/tfrabbit.

Running the benchmark on different systems gave the results summarised in the figure below.

Benchmarking results for different systems. From the bottom: a standard Windows desktop, a MacBook Pro 16 (2020) with 32 GB RAM, Google Colab (run in June 2021) and the system described at the beginning with the RTX A6000.

The results are quite stunning: the system is really fast. Note that with TensorFlow 2.5 no code modifications were needed to run on the GPUs.
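
A quick way to confirm that TensorFlow actually sees the card (this snippet is our addition, not part of the repository) is:

import tensorflow as tf

# Lists the GPUs visible to TensorFlow; on the machine above this should
# show a single entry corresponding to the RTX A6000.
print(tf.config.list_physical_devices("GPU"))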