Four Superpowers of Deep Learning

Last week, I had the pleasure of taking part in MIT’s 2019 EmTech conference in Cambridge, Massachusetts. I joined Elizabeth Bramson-Boudreau, CEO of the MIT Technology Review, onstage to discuss the evolution of artificial intelligence (AI) applications toward deployment at scale. (If you missed the event, you can still catch a replay of my talk.) It is my firm belief that nearly all technologies and compute workloads can be improved and enriched by AI and deep learning (DL). In fact, there are specific characteristics of DL that organizations can leverage today as they deploy AI to transform their workflows and develop new innovations. These four “superpowers” of DL are pattern spotting, universal approximation, sequence mapping, and similarity-based generation.

Pattern Spotting

Pattern spotting is the ability to identify patterns in a sea of “noisy data.” This capability was first demonstrated with superior performance in analyzing images for object identification, but it can be deployed on other types of data. While traditional machine learning techniques have been used for years in fraud detection, for example, deep learning is very powerful at identifying remote instances of a pattern. In the last few years, deep learning techniques and advances in compute have pushed pattern recognition to levels that are on par with or better than human capabilities in some applications, creating new ways to address problems that have long challenged data scientists.

One such problem is affect/emotion recognition, or the real-time identification of emotions based on facial expression. This capability is being applied in many different ways in the industry today. One example is a kit for motorized wheelchairs that lets users with severe mobility restrictions operate their wheelchair through expressions on their face. Users can pick from 10 facial expressions to control their motorized wheelchair. This project, developed by robotics company Hoobox, uses a 3D camera to stream data that AI algorithms, aided by pattern spotting, can process in real time to control the chair. Advanced compute at the edge enables immediate responsiveness, which is key for the usability of this product.

If you have large amounts of data of any kind (images, documents, medical records, agricultural land surveys, etc.), DL can be applied to identify faint patterns and extract hidden insights.

Universal Approximation

The second superpower of deep learning is universal approximation, or the ability to learn a complex system and replace computationally intensive and time-consuming calculations with estimations that are much more efficient (by a factor of up to 10,000), yet maintain acceptable levels of precision. By learning the relationship between input and output, this technique enables predictions about results and can be used for applications as diverse as sorting compounds in drug discovery or minimizing delays in flight routes, while using a fraction of the computation time and power. Whatever you can accelerate by 10,000x might change your business.

CERN, the European Organization for Nuclear Research, has applied this technique at its Large Hadron Collider (LHC) particle accelerator. Today, the LHC generates data at a staggering rate of 25 GB per second. Modeling, filtering, and analyzing this enormous amount of data is ideally suited to deep learning algorithms. One application is the simulation of collision events. Particle physics analysis happens in multiple phases, and decisions have to be made at each of these phases as to which data to analyze further. The decision of whether to save or throw away an event has to be made in a microsecond, as a tremendous number of events needs to be simulated. Deep learning has the potential to learn the properties of the reconstructed particles and bypass the complicated simulation process, potentially leading to simulations that are orders of magnitude faster than those currently available – and more scientific discoveries in less time.

If you have a demanding, complex calculation as part of your application, let a DL system observe the inputs turn into outputs. Once learning is complete, DL inference can approximate the result for new inputs, providing fairly accurate estimates in up to 10,000x less time.
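To make the idea of "observing inputs turn into outputs" concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `slow_simulation` is a hypothetical stand-in for an expensive calculation (a many-step numerical integral whose exact answer is sin(x)), and the surrogate is a deliberately tiny one-hidden-layer network trained with stochastic gradient descent. It is not the method used at CERN or in any production system, and it makes no claim about real-world speedups.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def slow_simulation(x, steps=5_000):
    """Hypothetical expensive calculation: integrate cos(t) from 0 to x
    with a many-step midpoint rule. The exact answer is sin(x)."""
    dt = x / steps
    return sum(math.cos((i + 0.5) * dt) for i in range(steps)) * dt


# Tiny surrogate network: 1 input -> HIDDEN tanh units -> 1 output.
HIDDEN = 16
w1 = [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]  # input-to-hidden weights
b1 = [0.0] * HIDDEN                                   # hidden biases
w2 = [random.gauss(0.0, 0.1) for _ in range(HIDDEN)]  # hidden-to-output weights
b2 = [0.0]                                            # output bias (in a list so it can be updated in place)


def surrogate(x):
    """Return the network's estimate and its hidden activations."""
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(HIDDEN)]
    return sum(w2[j] * hidden[j] for j in range(HIDDEN)) + b2[0], hidden


def mean_squared_error(samples):
    return sum((surrogate(x)[0] - y) ** 2 for x, y in samples) / len(samples)


# Step 1: observe the slow calculation on a grid of inputs.
inputs = [-math.pi + 2 * math.pi * i / 63 for i in range(64)]
train = [(x, slow_simulation(x)) for x in inputs]

mse_before = mean_squared_error(train)

# Step 2: fit the surrogate to the observed input/output pairs.
LEARNING_RATE = 0.02
for epoch in range(300):
    random.shuffle(train)
    for x, y in train:
        pred, hidden = surrogate(x)
        err = pred - y
        for j in range(HIDDEN):
            # Gradient through tanh, computed before w2[j] is updated.
            grad_hidden = err * w2[j] * (1.0 - hidden[j] ** 2)
            w2[j] -= LEARNING_RATE * err * hidden[j]
            w1[j] -= LEARNING_RATE * grad_hidden * x
            b1[j] -= LEARNING_RATE * grad_hidden
        b2[0] -= LEARNING_RATE * err

mse_after = mean_squared_error(train)

# Step 3: use the cheap surrogate in place of the slow calculation.
estimate, _ = surrogate(1.0)
print(f"MSE before training: {mse_before:.4f}, after: {mse_after:.4f}")
print(f"surrogate(1.0) = {estimate:.4f}, slow_simulation(1.0) = {slow_simulation(1.0):.4f}")
```

Once trained, each call to `surrogate` costs a handful of multiplications, while `slow_simulation` loops thousands of times. How large the gap is, and whether the accuracy is acceptable, depends entirely on the real calculation being replaced.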

Sequence Mapping

We’ve now come to the third superpower of deep learning, sequence mapping. The most prominent example of this capability is computer-based translation. Today, we are able to translate across many languages with good levels of accuracy, and in real time. Sequence mapping works by interpreting a series of words, images, or other tokens and converting it to a new sequence, while taking into account which output series is most likely. In the case of language, this seems obvious, since words are produced (either in written form or verbally) one at a time. On an intuitive level, you will implement two neural networks: one for the original language and the other for the target language. A mapping is built across sequences of the original and target language (sentences), as opposed to individual samples (words).
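The two-network idea can be sketched in a few lines of plain Python. This is only a structural illustration under stated assumptions: the vocabularies, the hidden size, and the helper names (`rnn_step`, `encode`, `decode`) are hypothetical, the weights are random and untrained, and the decoder picks the highest-scoring token at each step (greedy decoding). The output is therefore meaningless as a translation. What the sketch shows is the flow: one network reads the whole source sentence into a state, and a second network unrolls that state into a new sequence.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical toy vocabularies; index 0 marks the end of a sequence.
SRC_VOCAB = ["<eos>", "the", "cat", "sleeps"]
TGT_VOCAB = ["<eos>", "le", "chat", "dort"]
HIDDEN = 8


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_step(w_in, w_h, token_id, state):
    """One recurrent step: new_state = tanh(w_in[:, token] + w_h @ state)."""
    return [
        math.tanh(w_in[i][token_id] + sum(w_h[i][j] * state[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]


# Network 1: the encoder for the original language (untrained, random weights).
enc_in, enc_h = rand_matrix(HIDDEN, len(SRC_VOCAB)), rand_matrix(HIDDEN, HIDDEN)
# Network 2: the decoder for the target language (untrained, random weights).
dec_in, dec_h = rand_matrix(HIDDEN, len(TGT_VOCAB)), rand_matrix(HIDDEN, HIDDEN)
dec_out = rand_matrix(len(TGT_VOCAB), HIDDEN)


def encode(src_tokens):
    """Read the whole source sentence and summarize it as one state vector."""
    state = [0.0] * HIDDEN
    for token in src_tokens:
        state = rnn_step(enc_in, enc_h, SRC_VOCAB.index(token), state)
    return state


def decode(state, max_len=10):
    """Emit target tokens one at a time, feeding each choice back in."""
    output, prev_id = [], 0  # token 0 doubles as the start symbol here
    for _ in range(max_len):
        state = rnn_step(dec_in, dec_h, prev_id, state)
        scores = [
            sum(dec_out[k][j] * state[j] for j in range(HIDDEN))
            for k in range(len(TGT_VOCAB))
        ]
        prev_id = max(range(len(TGT_VOCAB)), key=scores.__getitem__)
        if prev_id == 0:  # the decoder chose end-of-sequence
            break
        output.append(TGT_VOCAB[prev_id])
    return output


translation = decode(encode(["the", "cat", "sleeps"]))
print(translation)
```

Because the decoder's state starts from the encoder's summary of the entire sentence, each output token depends on the full input sequence rather than on a single word. In a real system both networks would be trained together on paired sentences.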

This can also be used to “clean up” noisy sequences that include potential errors and to generate appropriate, noise-free sequences. An example would be reading DNA sequences, which sometimes include some level of noise, and producing clean DNA sequences.

If you need real-time translation of speech in context, or need to interpret financial time series data or unstructured sensor input, DL can be applied.

Similarity-Based Generation

Our final DL superpower is the creation of new images, speech or other data that has never existed but is remarkably similar to real occurrences. An example is the use of AI to make photorealistic videos. Similar techniques can be applied to much more useful purposes.

In healthcare, the MGH & BWH Center for Clinical Data Science recently used generative adversarial networks (GANs) to create new MRIs of brain tumors as a low-cost solution to the issue of imbalanced data sets that lack rare pathologic findings. GANs are able to generate new data with the same statistics as the training set. By training a GAN to generate synthetic abnormal MRIs, data scientists were able to improve tumor segmentation and save doctors hours of time.

Another example is Google’s high-fidelity speech synthesis, which converts text into human-like speech in more than 180 voices across 30+ languages and variants. It applies WaveNet, groundbreaking research in speech synthesis, to generate lifelike interactions that can transform customer service, device interaction, and other applications.

Creative businesses can find use cases for similarity-based generation of images, speech, or training data to improve their customer interaction or product representation.

These four superpowers of deep learning are exciting – but they don’t exist in a vacuum. They need the right organizational elements to reach their full potential. Among these is a highly committed management team that supports the adoption of AI-enriched practices and can work through the initial challenges and experimentation. Another critical element is quality data. In many cases, data needs curating, cleaning, and labeling to make it suitable for DL. Finally, these superpowers need superheroes: empowered data scientists who have the skills and support to keep pace with the latest advancements in AI. I look forward to seeing what’s next when countless applications and usages across all industries are enriched with AI and improve everyday life as they take full advantage of the superpowers of deep learning.