PyTorch review: A deep learning framework built for speed | Computing

On Jul 18, 2018

Google Chrome adds new AI features to boost productivity and…

Jan 25, 2024

Copilot Pro Brings AI Features to Your Microsoft Office Apps

Jan 16, 2024

Deep learning is an important part of the business of Google, Amazon, Microsoft, and Facebook, as well as countless smaller companies. It has been responsible for many of the recent advances in areas such as automatic language translation, image classification, and conversational interfaces.

We haven’t gotten to the point where there is a single dominant deep learning framework. TensorFlow (Google) is very good, but has been hard to learn and use. Also TensorFlow’s dataflow graphs have been difficult to debug, which is why the TensorFlow project has been working on eager execution and the TensorFlow debugger. TensorFlow used to lack a decent high-level API for creating models; now it has three of them, including a bespoke version of Keras.

CNTK (Microsoft) and Apache MXNet (Amazon) have been the principal competitors to TensorFlow, but there are other framework lineages to consider. Caffe (Berkeley Artificial Intelligence Research Lab), originally for image classification, was expanded and updated to Caffe2 (Facebook and others) and given strong production capabilities. Torch (Facebook, Twitter, Google, and others) uses Lua scripting and a CUDA (Compute Unified Device Architecture) C/C++ back end to efficiently solve problems in machine learning, computer vision, signal processing, and other fields. Despite its strengths as a scripting language, Lua became a liability to Torch when the bulk of the deep learning community adopted Python.

CUDA is Nvidia’s API for its general purpose GPUs. GPUs are much faster than CPUs for training and making predictions from deep neural networks; so are Google’s TPUs (tensor processing units) and FPGAs (field programmable gate arrays), which are available for use on AWS, Microsoft Azure, and elsewhere. In some cases, the use of advanced chips (GPUs, TPUs, or FPGAs) can speed up computations over CPUs by 50x per chip used, reducing training times from weeks to hours or from hours to minutes.