What’s New: This week at the annual Neural Information Processing Systems (NeurIPS) conference in Vancouver, British Columbia, Intel is contributing almost three dozen conference, workshop and spotlight papers covering deep equilibrium models, imitation learning, machine programming and more.
“Intel continues to push the frontiers in fundamental and applied research as we work to infuse AI everywhere, from low-power devices to data center accelerators. This year at NeurIPS, Intel will present almost three dozen conference and workshop papers. We are fortunate to collaborate with excellent academic communities from around the world on this research, reflecting Intel’s commitment to collaboratively advance machine learning.”
–Hanlin Tang, senior director, AI Lab at Intel
Research topics span the breadth of artificial intelligence (AI), from a fundamental understanding of neural networks to applications of machine learning in software programming and particle physics. A few highlights are shown below:
Automating Software Testing
Title: A Zero-Positive Learning Approach for Diagnosing Software Performance Regression by Mejbah Alam (Intel Labs), Justin Gottschlich (Intel Labs), Nesime Tatbul (Intel Labs and MIT), Javier Turek (Intel Labs), Timothy Mattson (Intel), Abdullah Muzahid (Texas A&M University)
Software development automated with machine learning (ML) is an emerging field. The long-term vision is to augment programmers with ML-driven tools to test code, write new code and diagnose errors. The paper proposes AutoPerf, an approach to automate regression testing, which catches errors introduced by new code check-ins, in high-performance computing code. Leveraging only nominal training data (data from runs without performance defects) and utilizing hardware performance counters while running code, we illustrate that AutoPerf can detect some of the most complex performance bugs found in parallel programming.
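The zero-positive idea can be illustrated with a minimal sketch: fit a model to nominal runs only, then flag any new run that deviates too far from it. The code below is not AutoPerf's model; it substitutes simple per-counter statistics for the learned model, and the counter names, readings and the `margin` parameter are hypothetical values chosen for illustration.

```python
import math
import statistics

def fit_nominal(profiles):
    """Learn per-counter mean and standard deviation from nominal runs only."""
    columns = list(zip(*profiles))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 if a counter never varies, to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds

def anomaly_score(profile, means, stds):
    """Root-mean-square z-score of one run against the nominal model."""
    z2 = [((x - m) / s) ** 2 for x, m, s in zip(profile, means, stds)]
    return math.sqrt(sum(z2) / len(z2))

def fit_threshold(profiles, means, stds, margin=1.5):
    """Threshold = worst score seen on nominal data, times a safety margin."""
    return margin * max(anomaly_score(p, means, stds) for p in profiles)

# Hypothetical counter readings per run: (cache misses, branch misses, HITM events)
nominal_runs = [
    (1000, 50, 3),
    (1020, 48, 4),
    (980, 52, 2),
    (1010, 51, 3),
    (990, 49, 3),
]
means, stds = fit_nominal(nominal_runs)
threshold = fit_threshold(nominal_runs, means, stds)

def is_regression(profile):
    """Flag a run whose counters deviate more than any nominal run did."""
    return anomaly_score(profile, means, stds) > threshold

new_run_ok = (1005, 50, 3)    # close to the nominal profile
new_run_bad = (1015, 50, 40)  # spike in one counter
print(is_regression(new_run_ok), is_regression(new_run_bad))
```

No example of a buggy run is needed to set the threshold, which is the point of training on nominal data alone.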
See the presentation: A Zero-Positive Learning Approach for Diagnosing Software Performance Regression
“Intel is making significant strides in advancing and scaling neural network technologies to handle increasingly complex and dynamic workloads – from tackling challenges with memory to researching new adaptive learning techniques,” said Dr. Rich Uhlig, Intel senior fellow and managing director of Intel Labs. “The developments we’re showcasing at NeurIPS will help reduce memory footprints, better measure how neural networks process information and reshape how machines learn in real time, opening up the potential for new deep learning applications that can change everything from manufacturing to healthcare.”
Teaching Robots Through Imitation Learning
Title: Goal-Conditioned Imitation Learning by Yiming Ding (University of California, Berkeley), Carlos Florensa (University of California, Berkeley), Pieter Abbeel (University of California, Berkeley and Covariant.ai), Mariano Phielipp (Intel AI Lab)
The long-term goal of this research effort is to build robotic algorithms that can learn quickly and easily from human demonstrations. Although learning by human demonstration is a well-studied topic in robotics, current work cannot surpass the human expert, is susceptible to imperfect human teachers, and cannot adapt to unseen situations. The paper introduces a newly developed algorithm, goalGAIL. Using goalGAIL, the robot demonstrates the ability to learn better than the expert and can even perform in situations with non-expert actions. This will broaden robotic applications across practical robotics, where the demonstrator does not need to be an expert; industrial settings, where algorithms may need to adapt quickly to new parts; and personalized robotics, where the algorithm must adapt through demonstration to personal preference.
See the presentation: Goal-Conditioned Imitation Learning
New Approach to Sequence Models
Title: Deep Equilibrium Models by Shaojie Bai (Carnegie Mellon), J. Zico Kolter (Carnegie Mellon), Vladlen Koltun (Intel Labs)
In this spotlight paper at NeurIPS (2% acceptance rate), we develop a radically different approach to machine learning on sequence data. We are able to replace deep recurrent layers with a single-layer model. Instead of iterating through a sequence of layers, we solve directly for the final representation via root-finding. This new type of model can match state-of-the-art performance on language benchmarks with only a single layer, reducing the memory footprint by 88%. This opens the door to building larger and more powerful models.
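A toy scalar sketch can make the root-finding idea concrete. It assumes a single weight-tied layer z = tanh(W*z + U*x + B) with hypothetical weights, and uses Newton's method as a simple stand-in for the root-finder; it is not the model or solver from the paper. Applying the layer many times and solving for the fixed point directly should land on the same value.

```python
import math

W, U, B = 0.5, 1.0, 0.1  # hypothetical scalar weights, chosen for illustration

def layer(z, x):
    """One weight-tied layer: z_next = tanh(W*z + U*x + B)."""
    return math.tanh(W * z + U * x + B)

def deep_stack(x, depth=100):
    """Conventional view: apply the same layer `depth` times in sequence."""
    z = 0.0
    for _ in range(depth):
        z = layer(z, x)
    return z

def equilibrium(x, tol=1e-12, max_iter=50):
    """Equilibrium view: find the root of g(z) = layer(z, x) - z by Newton's method."""
    z = 0.0
    for _ in range(max_iter):
        f = layer(z, x)
        g = f - z
        if abs(g) < tol:
            break
        dg = W * (1.0 - f * f) - 1.0  # derivative of g with respect to z
        z -= g / dg
    return z

x = 0.3
z_stack = deep_stack(x)
z_star = equilibrium(x)
print(z_stack, z_star)
```

The stacked version passes through 100 intermediate values to reach the answer, while the solver only ever tracks the current estimate of the fixed point.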
See the presentation: Deep Equilibrium Models
4-bit Quantization Without Retraining
Title: Post-Training 4-bit Quantization of Convolutional Networks for Rapid-Deployment by Ron Banner (Intel AI Lab), Yury Nahshan (Intel AI Lab), Daniel Soudry (Technion)
Convolutional neural networks are a class of deep neural networks most commonly applied to analyzing visual imagery. They require substantial computing resources, memory bandwidth and storage capacity. To speed up analysis, the models are often quantized to lower bit widths. However, such methods often require full datasets and time-consuming fine-tuning to recover the accuracy lost after quantization. This paper introduces the first practical 4-bit post-training quantization approach that does not involve training the quantized model (fine-tuning) or require the availability of the full dataset. The approach achieves accuracy that is just a few percent less than the state-of-the-art baseline across a wide range of convolutional models.
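As a rough illustration of what quantizing without fine-tuning can look like, the sketch below maps a set of values to signed 4-bit integers and picks a clipping threshold by a simple search over the values themselves. This is a generic example, not the method from the paper; the synthetic weights, the `search_clip` helper and its `steps` parameter are assumptions made for illustration.

```python
import random

def quantize_4bit(values, clip):
    """Symmetric uniform quantization to signed 4-bit integer codes in [-8, 7]."""
    scale = clip / 7.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def quant_error(values, clip):
    """Error introduced by quantizing and then dequantizing at a given clip."""
    return mse(values, dequantize(*quantize_4bit(values, clip)))

def search_clip(values, steps=20):
    """Pick the clipping threshold (a fraction of max |v|) with the lowest error."""
    peak = max(abs(v) for v in values)
    candidates = [peak * (k / steps) for k in range(1, steps + 1)]
    return min(candidates, key=lambda c: quant_error(values, c))

# Synthetic stand-in for a layer's weights (hypothetical data, fixed seed).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.1) for _ in range(256)]

peak = max(abs(w) for w in weights)
err_full = quant_error(weights, peak)   # clip at the full range
best_clip = search_clip(weights)
err_best = quant_error(weights, best_clip)
codes, scale = quantize_4bit(weights, best_clip)
print(best_clip, err_best, err_full)
```

Nothing here updates the weights or touches a training set; the only tuning is a choice of threshold made from the values being quantized.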
See the presentation: Post-Training 4-bit Quantization of Convolutional Networks for Rapid-Deployment | Presentation Slides
Understanding Neural Networks
Title: Untangling in Invariant Speech Recognition by Cory Stephenson (Intel AI Lab), Suchismita Padhy (Intel AI Lab), Hanlin Tang (Intel AI Lab), Oguz Elibol (Intel AI Lab), Jenelle Feather (MIT), Josh McDermott (MIT), SueYeon Chung (MIT)
A neural network is often referred to as a “black box” because parts of its decision-making are famously opaque. There has been a plethora of approaches to try to peer into the box, but the challenge has been that many of the measures are not theoretically grounded. In collaboration with MIT, we’ve applied theoretically grounded measures of manifold capacity to better understand the geometry of speech recognition models. Theoretically grounded measurements are rare in deep learning, and the work seeks to provide a unique view on how neural networks process information.
See the presentation: Untangling in Invariant Speech Recognition