nGraph Compiler Stack: Beta Release

Deep Learning (DL) computational performance is critical for scientists and engineers applying deep learning techniques to many challenges in healthcare, commerce, autonomous driving, and other domains. This is why in March, we released an early version of Intel nGraph Library and Compiler as an open-source project available on GitHub. It was clear to us early on that open standards and lateral collaboration for interoperability would be essential to help those scientists and engineers achieve the next wave of breakthroughs in their respective fields. Some of our researchers have already started using nGraph to explore next-generation AI topics, such as enabling inference on private data using homomorphic encryption.

Which brings us to today’s announcement: our Beta release of the nGraph Compiler stack. This release focuses on accelerating deep learning inference workloads on Intel® Xeon® Scalable processors and has the following key features:

  • Streamlined out-of-box installation experience for TensorFlow*, MXNet*, and ONNX* (see the minimal install sketch after this list).
  • Validated optimization for 20 common workloads available in TensorFlow, 18 in MXNet, and 14 in ONNX.
  • Support for Ubuntu* 16.04 (TensorFlow, MXNet, and ONNX) and MacOS* X 13.x (buildable for TensorFlow and MXNet).
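
As a taste of that out-of-box experience, below is a minimal sketch of enabling nGraph from TensorFlow. The package name (ngraph-tensorflow-bridge), the ngraph_bridge import, and the TF 1.x session style follow the bridge's documentation for this release; treat the exact names as assumptions for your own environment.

```python
# First: pip install ngraph-tensorflow-bridge   (package name assumed from the bridge docs)
import tensorflow as tf
import ngraph_bridge   # importing the bridge routes TensorFlow graphs through nGraph

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
c = tf.matmul(a, b)

with tf.Session() as sess:     # TF 1.x style, matching the frameworks validated in this Beta
    print(sess.run(c))         # executed via nGraph-optimized CPU kernels
```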

This release includes optimizations for popular workloads already widely deployed in production. These workloads cover various genres of deep learning, including:

  • Image recognition and segmentation
  • Object detection
  • Language translation
  • Speech generation and recognition
  • Recommender systems
  • Generative adversarial networks (GANs)
  • Reinforcement learning

Figure 1: nGraph increases MXNet inference performance1

In our tests, the optimized workloads can perform up to 45X faster than on the native frameworks1, and we expect performance gains for other workloads as well from the powerful pattern-matching feature described below.

Traditionally, to get deep learning performance out of hardware, users had to wait for hardware manufacturers to create and update kernel libraries that expose (sometimes hand-tuned) individual operations through an “immediate mode” execution interface. While these kernel optimizations often bring impressive performance gains, they tend to be hardware-specific, which rules out any opportunity to optimize at the non-device-specific (graph) level. By pairing non-device-specific and device-specific optimizations, we can unlock even more performance, which is why we built nGraph Compiler.
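
To make the distinction concrete, here is a small, purely conceptual Python sketch (not nGraph's actual API or intermediate representation) of a non-device-specific optimization: pattern-matching a MatMul → Add → ReLU chain and rewriting it into a single fused node that a device-specific kernel can then execute without materializing intermediate tensors.

```python
# Conceptual sketch of a graph-level fusion pass; the toy Node IR and the
# "FusedDense" op are illustrative only and are not part of nGraph's API.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    op: str                                   # e.g. "MatMul", "Add", "Relu", "FusedDense"
    inputs: List["Node"] = field(default_factory=list)

def fuse_dense_chains(node: Node) -> Node:
    """Rewrite Relu(Add(MatMul(x, w), b)) into a single FusedDense(x, w, b) node."""
    node.inputs = [fuse_dense_chains(i) for i in node.inputs]   # fuse nested chains first
    if (node.op == "Relu"
            and node.inputs and node.inputs[0].op == "Add"
            and node.inputs[0].inputs and node.inputs[0].inputs[0].op == "MatMul"):
        add, matmul = node.inputs[0], node.inputs[0].inputs[0]
        # One fused node means one kernel invocation and no intermediate tensors.
        return Node("FusedDense", inputs=matmul.inputs + add.inputs[1:])
    return node

# Toy graph: Relu(Add(MatMul(x, w), b))
x, w, b = Node("Parameter"), Node("Parameter"), Node("Parameter")
graph = Node("Relu", [Node("Add", [Node("MatMul", [x, w]), b])])
print(fuse_dense_chains(graph).op)            # FusedDense
```

The same rewritten graph can then be lowered to whichever kernel library the target hardware provides, which is the pairing of graph-level and device-level optimization described above.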

Our Beta release has several key features: nGraph was the first graph compiler to enable both training and inference while also supporting multiple frameworks, and it gives developers the freedom to completely change the hardware backend underneath the same conceptual model or algorithmic design. Any one of these features on its own might be good enough; taken together, they give developers assurance that their neural network (NN) design can not only grow but also adapt to myriad changing factors. Adaptability will become increasingly important as it becomes harder for developers to guess ahead of time the bounds for which they might need to optimize a very large or complex machine learning problem.
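
Swapping backends in practice can be as small a change as one environment variable. The sketch below assumes the nGraph TensorFlow bridge and its NGRAPH_TF_BACKEND variable; which backend names are actually available depends on how nGraph was built, so treat the specific value as an assumption.

```python
import os
os.environ["NGRAPH_TF_BACKEND"] = "CPU"   # backend name assumed; set before building the graph

import tensorflow as tf
import ngraph_bridge                      # the model code below is unchanged; only the backend differs

x = tf.placeholder(tf.float32, shape=(1, 4))
y = tf.nn.relu(tf.matmul(x, tf.ones((4, 2))))

with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [[1.0, -2.0, 3.0, -4.0]]}))
```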

In our upcoming Gold release, tentatively planned for early in the second quarter of 2019, we’ll announce further expanded workload coverage for more frameworks, including additional support for quantized graphs and Int8. Since we designed the nGraph Compiler to support an ever-growing list of AI hardware, early adopters of the Intel® Nervana™ Neural Network Processor and other accelerators will be able to test them using nGraph Compiler throughout 2019. See our ecosystem documentation for further details about what we have in the works. We encourage you to get started by accessing our quick start guide or downloading the latest release. And as always, we welcome any feedback or comments via GitHub.