DIY GPU server: Build your own PC for deep learning

Building your own GPU server isn't hard, and it can easily beat the cost of training deep learning models in the cloud

There comes a time in the life of many deep learning practitioners when they get the urge to build their own deep learning machine and escape the clutches of the cloud. The cloud is ideal for getting started with deep learning, and it is often the best answer for training large-scale deep learning models. But there is a vast area in between where having your own deep learning box can be significantly more cost-effective.

Not that it’s cheap. You will spend from $1,500 to $2,000 or more on a computer and high-end GPU capable of chewing through deep learning models. But if you’re doing extensive model training for days at a time, then having your own dedicated machine could pay for itself in three or four months—especially when you factor in cloud storage and ingress costs alongside compute time.
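The break-even claim is easy to sanity-check with a few lines of Python. The sketch below is illustrative only: the hourly cloud rate and the hours of training per day are assumed placeholder values, not quotes from any provider, and it ignores electricity, storage, and data transfer charges.

```python
def breakeven_months(machine_cost, cloud_rate_per_hour, hours_per_day, days_per_month=30):
    """Months of cloud GPU rental that would cost as much as buying the machine."""
    monthly_cloud_cost = cloud_rate_per_hour * hours_per_day * days_per_month
    if monthly_cloud_cost <= 0:
        raise ValueError("cloud usage must cost something for a break-even point to exist")
    return machine_cost / monthly_cloud_cost


# Assumed inputs: a $2,000 build, a hypothetical $0.90/hour GPU instance,
# and training around the clock.
months = breakeven_months(2000, 0.90, 24)
print(f"Break-even after about {months:.1f} months")
```

Plugging in your own cloud bill and realistic usage hours gives a better estimate than any rule of thumb; lighter usage pushes the break-even point out quickly.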

In this article, I’ll walk you through the deep learning machine I built earlier in the year, describing some of the choices you’ll encounter when building such a machine and the costs you’ll likely incur. Prices are direct quotes from Amazon as of December 2017.

If you want to get deeper into deep learning—whether that means research on larger datasets or entering Kaggle competitions, or both—building your own deep learning box makes a lot of sense. Running models on your own machine will likely be the best approach until you start doing work on huge datasets and require tens of GPUs for your training.

Once you reach that point the cloud becomes your friend again. You may also need a friend with deep pockets to pay for all that compute time.

Deep learning PC build: GPU and CPU

The decision of which graphics card, and hence which GPU you buy, is likely to be the most important (and expensive) decision you’ll make when building your own machine. But right now, that choice is a fairly easy one. Provided you don’t recoil at the $700 to $900 price tag, you should buy the Nvidia GeForce GTX 1080 Ti. The 1080 Ti is not as capable as the new Nvidia Volta generation of GPUs, which are just now appearing at the major cloud providers, but it will likely be enough for all your Kaggle needs.

The 1080 Ti is based on the same Pascal architecture as Nvidia’s bleeding-edge Titan X card, but it is faster and cheaper. Lots of cores (3,584) and lots of memory (11GB) mean that you can run larger neural networks and train them faster on the 1080 Ti than on any of the other consumer-class cards on the market today. If you want to push the envelope on any part of your deep learning box, it makes sense to go with the best sub-$1,000 GPU you can get.

The 1080 Ti is available in a Founders Edition (where the card is the Nvidia reference design) or a custom design from a graphics card manufacturer. The custom designs will normally have multiple fans and (betraying their gaming origins) a facility for overclocking the GPU. Personally, I’m paranoid about overheating so I plumped for an $895 card with three fans, but you can likely save yourself $100 by going for a Founders Edition card instead.

If you can’t justify the high price of the 1080 Ti, then the Nvidia GeForce GTX 1080 is a decent fallback. It is slower than the 1080 Ti and has less memory (8GB vs. 11GB), but it will save you about $200. I’d like to be able to recommend AMD’s new line of GPUs, but the support for major libraries like PyTorch and TensorFlow isn’t quite there yet. This will likely change during 2018 as AMD continues its work on ROCm, at which point the Radeon RX Vega Instinct and Frontier will look very appealing, especially if Nvidia neglects to bring its Tensor Cores from the Volta platform into its consumer line.

Although the GPU is going to be the workhorse of your system, you’re also going to need a capable CPU for running applications and handling data engineering tasks (e.g., augmentation). If you have an eye on expanding your machine to include multiple GPUs in the future, then you should get a CPU that can handle up to 40 PCIe lanes. You don’t have to go out on the bleeding edge here. A 7th Generation Intel Core i7-7700K is a decent choice. You’ll also need a cooler, but again, you won’t need esoteric water cooling or anything like that. A $30 Hyper 212 EVO from Cooler Master is just fine.

Deep learning PC build: Storage and memory

Now, you could throw caution to the wind and spend $500 for a 1TB SSD M.2 card. SSDs are great, right? But think about it. You’re not going to be using all that fast storage when you’re training a model on data. A more efficient use of your money would be to split storage between “hot” (SSD) for current training and “cold” (platters) for inactive projects. A 2TB HDD will cost you around $70 and a 250GB SSD around $130, giving you more than double the storage for less than half the price.
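The arithmetic behind that comparison can be checked with a short Python sketch. It uses only the approximate prices quoted in the paragraph above; the variable and function names are made up for illustration.

```python
def cost_per_gb(price_usd, capacity_gb):
    """Dollars paid per gigabyte of storage."""
    return price_usd / capacity_gb


# Approximate prices quoted above (December 2017).
single_ssd = {"price": 500, "capacity_gb": 1000}
split = [
    {"name": "250GB SSD (hot)", "price": 130, "capacity_gb": 250},
    {"name": "2TB HDD (cold)", "price": 70, "capacity_gb": 2000},
]

split_price = sum(drive["price"] for drive in split)
split_capacity = sum(drive["capacity_gb"] for drive in split)

print(f"1TB SSD only: ${single_ssd['price']} for {single_ssd['capacity_gb']} GB, "
      f"${cost_per_gb(single_ssd['price'], single_ssd['capacity_gb']):.2f}/GB")
print(f"SSD + HDD split: ${split_price} for {split_capacity} GB, "
      f"${cost_per_gb(split_price, split_capacity):.2f}/GB")
```

The split comes to $200 for 2,250GB, against $500 for 1,000GB with the single large SSD.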

As for memory, I’d recommend 32GB—with a possible further expansion in the future to 64GB, so buy 16GB sticks. Most motherboards you’ll be looking at will require DDR4 memory. Two 16GB DIMMs will set you back $300 to $400.

Deep learning PC build: Motherboard, power supply, and case

Now that you have all of these fancy bits of kit, you’ll additionally need some relatively boring parts to bring everything together. If you followed my advice and bought a 7th Generation Intel CPU, then you should go for a Z270-based motherboard. If you bought an 8th Generation CPU because you like shiny new things, then you will need a Z370-based motherboard. Don’t get them mixed up, as a 7th Generation CPU won’t work in a Z370 board (make sure to double-check that your motherboard and CPU will work together). If you’re thinking you might get another GPU in the future, check that your choice of motherboard can support it. I didn’t follow my own advice here, so I’ll need to upgrade my motherboard if I ever decide to add a second GeForce GTX 1080 Ti.

Power supplies are perhaps the least exciting part of building your own box, but one thing to remember here is that you’re going to need quite a bit of power. An 850W power supply can be yours for $85 and will give you some room for expansion.
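A rough power budget shows where that room for expansion comes from. This is a back-of-the-envelope sketch, not a sizing tool: the GPU and CPU figures are the manufacturers' rated values for the GTX 1080 Ti and Core i7-7700K, the 100W allowance for everything else is an assumption, and real draw will vary with workload and overclocking.

```python
GPU_TDP_W = 250   # GTX 1080 Ti rated board power
CPU_TDP_W = 91    # Core i7-7700K rated TDP
OTHER_W = 100     # assumed allowance for motherboard, RAM, drives, and fans


def estimated_load_watts(num_gpus):
    """Estimated peak draw for a build with the given number of GPUs."""
    return num_gpus * GPU_TDP_W + CPU_TDP_W + OTHER_W


def psu_headroom(psu_watts, num_gpus):
    """Fraction of the power supply's capacity left unused at the estimated peak."""
    return 1 - estimated_load_watts(num_gpus) / psu_watts


for gpus in (1, 2):
    load = estimated_load_watts(gpus)
    print(f"{gpus} GPU(s): ~{load} W load, "
          f"{psu_headroom(850, gpus):.0%} headroom on an 850 W supply")
```

Under these assumptions a second 1080 Ti still fits within 850W, though with a much thinner margin than a single card leaves.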

Now, the last time I built a desktop PC was over 10 years ago, so I was completely unprepared for the brave new world of cases. I was not expecting the glittering LEDs, the racing curves, or the fans as far as the eye can see. Being reserved and British, I went for a rather boring white tower design, but it does have a glass window in the side. Make sure that you’ll have room to fit all of the pieces in the case. Both the cooler and the graphics card will stick out higher than you might think. If you stick with a standard ATX tower or mid-tower case you’ll be fine.

Deep learning PC build: Assembly and software installation

Because I hadn’t put a PC together in years, I was a touch worried that I’d get lost when it came to assembly. However, all you need is a decent screwdriver, an anti-static strap, and YouTube. For almost every part of your system, there will be an extensive walkthrough video online that will show you exactly how to proceed. The videos really helped me through the SSD card and memory installations, where I needed reassurance that I wouldn’t break the memory chips by pushing down rather hard to get them into the motherboard.

Once you have everything in your case and all the lights come on, you’re going to need an OS. Feel free to dual-boot with Windows for games, but you should probably go with Linux for your deep learning work. I recommend Ubuntu 17.04 at the time of writing for its easy Nvidia driver setup (and the Nvidia download page hasn’t updated to Ubuntu 17.10 yet). One thing to remember before installing Linux is to go into your machine’s BIOS and make sure that it’s using the integrated graphics rather than the 1080 Ti, or else you’ll run into problems during the install.

Follow the Ubuntu guide for setting up the Nvidia drivers. Also, because TensorFlow won’t support CUDA 9 until TensorFlow 1.5, grab the CUDA 8 libraries from the CUDA website.

For Python, I recommend using Anaconda. Getting TensorFlow and Keras installed with Anaconda is as simple as these commands:

conda install -c anaconda tensorflow-gpu
conda install -c anaconda keras

And PyTorch is almost as simple:

conda install pytorch torchvision -c pytorch

Plus Anaconda comes with Jupyter installed by default, so at this point you have everything you’re likely to need to get started using your custom deep learning machine.

Deep learning PC build: Total cost

Finally, let’s take a look at how much all of this costs. I bought all of my equipment from Amazon. You may be able to find somewhat better deals using sites like Newegg or TigerDirect, or by scouring eBay for used parts. And again, you could save $100 or $200 on the GPU by choosing an Nvidia Founders Edition or plain GTX 1080 card.

At just under $2,100 (before any sales tax kicks in, remember) a deep learning PC is definitely a major investment. But if you’re starting to wince at the rising costs in your monthly cloud billing statements, it’s worth investigating whether building your own machine is a better use of your time and money than spinning up GPU instances on AWS, Google Cloud, or Microsoft Azure. It certainly paid off for me.

At a Glance
  • EVGA GeForce GTX 1080 Ti SC Black Edition GAMING, 11GB GDDR5X, iCX Cooler & LED, Optimized Airflow Design, Interlaced Pin Fin Graphics Card 11G-P4-6393-KR

  • Intel Core i7-7700K

  • Cooler Master Hyper 212 EVO - CPU Cooler

  • Samsung 960 EVO Series - 500GB NVMe - M.2 Internal SSD (MZ-V6E500BW)

  • Western Digital WD Blue 1TB SATA 6 Gb/s 7200 RPM 64MB Cache 3.5 Inch Desktop Hard Drive (WD10EZEX)

  • Ballistix Sport LT 32GB Kit (16GBx2) DDR4 2400 MT/s (PC4-19200) DIMM 288-Pin - BLS2K16G4D240FSB (Gray)

  • MSI Pro Series Intel Z270 DDR4 USB 3 CrossFire ATX Motherboard (Z270-A PRO)

  • Rosewill Glacier Series Continuous 80 Plus Bronze Certified Semi-Modular Design ATX12V/EPS12V 850W Power Supply Glacier 850M

  • NZXT S340 Mid Tower Computer Case, White (CA-S340W-W1)

Copyright © 2018 IDG Communications, Inc.
