What’s New: Intel is presenting a series of research papers that further the development of computer vision and pattern recognition software and have the potential to radically transform future technologies across industries, from industrial applications to healthcare to education. The Intel team is presenting its research, which leverages artificial intelligence (AI) and ecosystem-sensing technologies to build complete digital pictures of physical spaces, at the Conference on Computer Vision and Pattern Recognition (CVPR), the premier annual computer vision event, held June 16-20 in Long Beach, California.
“Intel believes that technology – including the applications we’re showcasing at CVPR – can unlock new experiences that can transform the way we tackle problems across industries, from education to medicine to manufacturing. With advancements in computer vision technology, we can program our devices to help us identify hidden objects or even enable our machines to teach human behavioral norms.”
–Dr. Rich Uhlig, Intel Senior Fellow and managing director of Intel Labs
Some of the Intel research presented this week includes:
Seeing Around Corners with Sound
Title: Acoustic Non-Line-of-Sight Imaging by David B. Lindell (Intel Labs), Gordon Wetzstein (Stanford University), Vladlen Koltun (Intel Labs)
Why It Matters: In this paper, Intel demonstrates the ability to construct digital images of objects hidden around corners using acoustic echoes. Non-line-of-sight (NLOS) imaging enables unprecedented capabilities for applications including robotics and machine vision, remote sensing, autonomous vehicle navigation and medical imaging. This acoustic method reconstructs hidden objects using inexpensive off-the-shelf hardware, at longer distances and with shorter exposure times than the leading alternative NLOS imaging technologies. In the demonstrated system, speakers emit sound waves while microphones capture the timing of the returning echoes; reconstruction algorithms inspired by seismic imaging then use those timings to build a digital picture of a physical object that is hidden from view. Watch the demonstration.
Abstract: Intel demonstrates a new approach to seeing around corners using acoustic echoes. The solution is orders of magnitude less expensive than alternative non-line-of-sight imaging technologies, which are based on optical imaging. The new technology is able to see farther and faster around corners than state-of-the-art optical methods.
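The time-of-flight idea behind this approach can be sketched in a few lines: each speaker/microphone measurement constrains where a hidden reflector can be, and backprojecting many such measurements onto a grid localizes it. The sketch below is a toy illustration, not the paper’s method: it assumes colocated, noiseless sensors and a single point reflector, and all names and parameter values are hypothetical; the actual system scans a hardware array and uses seismic-migration-style reconstruction.

```python
import numpy as np

C = 343.0  # speed of sound in air, m/s

def simulate_echoes(sensors, target):
    """Round-trip delay of a pulse from each colocated speaker/mic to a hidden point."""
    d = np.linalg.norm(sensors - target, axis=1)
    return 2.0 * d / C

def backproject(sensors, delays, grid_x, grid_y, sigma=1e-4):
    """Score every grid cell by how well it explains the measured echo delays."""
    X, Y = np.meshgrid(grid_x, grid_y, indexing="ij")
    score = np.zeros_like(X)
    for (sx, sy), t in zip(sensors, delays):
        # Expected round-trip delay if the reflector sat at this cell.
        t_exp = 2.0 * np.sqrt((X - sx) ** 2 + (Y - sy) ** 2) / C
        score += np.exp(-((t_exp - t) ** 2) / (2 * sigma ** 2))
    return score

# Sensors scanned along a wall; a reflector sits out of the line of sight.
sensors = np.stack([np.linspace(-0.5, 0.5, 16), np.zeros(16)], axis=1)
target = np.array([0.2, 1.3])
delays = simulate_echoes(sensors, target)

gx = np.linspace(-1.0, 1.0, 101)
gy = np.linspace(0.5, 2.0, 76)
score = backproject(sensors, delays, gx, gy)
i, j = np.unravel_index(np.argmax(score), score.shape)
print(round(gx[i], 2), round(gy[j], 2))  # → 0.2 1.3
```

With clean measurements the score peaks exactly at the hidden point; a real system must additionally contend with noise, multipath and extended surfaces.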
Using ‘Applied Knowledge’ to Train Deep Neural Networks
Title: Deeply-supervised Knowledge Synergy for Advancing the Training of Deep Convolutional Neural Networks by Dawei Sun (Intel Labs), Anbang Yao (Intel Labs), Aojun Zhou (Intel Labs), Hao Zhao (Intel Labs)
Why It Matters: AI applications including facial recognition, image classification, object detection and semantic image segmentation rely on deep convolutional neural networks (CNNs), technologies inspired by biological neural structures, to process information and efficiently find answers. However, leading CNNs are challenging to train: they stack many layers holding large numbers of parameters, and the more complex they become, the longer they take to train and the more energy they consume. In this paper, Intel researchers present a new training scheme, called Deeply-supervised Knowledge Synergy, that creates “knowledge synergies”: it enables the CNN to transfer what it has learned between layers of the network during training, improving accuracy and robustness to noisy data.
Abstract: Intel researchers present a novel training scheme, coined Deeply-supervised Knowledge Synergy (DKS), which trains prevalent CNNs to much better performance than the current mainstream scheme. In sharp contrast to existing knowledge transfer designs, which operate between different CNN models, DKS introduces a new concept of knowledge transfer across different layers of a single CNN. In extensive experiments on public benchmarks, models trained with DKS show much better performance than those trained with state-of-the-art training schemes.
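The flavor of combining per-layer supervision with pairwise knowledge matching can be sketched as a loss function. The sketch below is an illustrative assumption, not the paper’s exact formulation: `dks_loss`, the temperature `T` and the weight `alpha` are hypothetical names, and a real implementation would attach auxiliary classifier heads to intermediate CNN layers and backpropagate through them.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, labels):
    # Mean negative log-likelihood of the true class under predictions p.
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def kl(p, q):
    # Mean KL divergence between two batches of class distributions.
    return np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1))

def dks_loss(head_logits, labels, T=2.0, alpha=1.0):
    """Deep supervision plus synergy: every classifier head is trained on the
    labels, and every ordered pair of heads exchanges softened predictions."""
    probs = [softmax(z) for z in head_logits]
    soft = [softmax(z, T) for z in head_logits]
    loss = sum(cross_entropy(p, labels) for p in probs)
    for i, p in enumerate(soft):
        for j, q in enumerate(soft):
            if i != j:
                loss += alpha * kl(p, q)
    return loss

# Three hypothetical heads (e.g., two auxiliary, one final) on a batch of 4
# samples over 10 classes.
rng = np.random.default_rng(0)
heads = [rng.normal(size=(4, 10)) for _ in range(3)]
labels = np.array([1, 3, 5, 7])
print(dks_loss(heads, labels) > 0.0)  # → True
```

The pairwise terms are what distinguish this shape of loss from plain deep supervision: each head is pulled both toward the labels and toward what the other layers have learned.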
Delivering Formative Feedback in Behavioral Therapy for Children on the Autism Spectrum
Title: Interpretable Machine Learning for Generating Semantically Meaningful Formative Feedback by Nese Alyuz (Intel Labs) and Tevfik Metin Sezgin (Koc University)
Why It Matters: We express our emotional state through a range of expressive modalities, such as facial expressions, vocal cues or body gestures. However, children on the autism spectrum experience difficulties in expressing and recognizing emotions with the accuracy of their neurotypical peers. Research shows that children on the autism spectrum can be trained to recognize and express emotions if they are given supportive and constructive feedback. In particular, providing formative feedback, i.e., feedback from an expert describing how the learner needs to modify their behavior to improve expressiveness, has been found valuable in rehabilitation. Unfortunately, generating such formative feedback requires the constant supervision of an expert who assesses each instance of emotional display. In this paper, an interpretable machine learning framework is demonstrated that provides the foundation for a system that monitors emotional inputs and generates formative recommendations for modifying behavior to achieve the appropriate expressive display.
Abstract: In this paper, a system is introduced for automatic formative assessment integrated into an automatic emotion recognition setup. The system is built on an interpretable machine learning framework that identifies a behavior that needs to be modified to achieve a desired expressive display. We report experiments conducted on a children’s voice data set with expression variations, showing that the proposed mechanism generates formative feedback aligned with the expectations reported from a clinical perspective.
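One way such interpretable feedback can work, sketched here under assumptions not taken from the paper, is to use a linear (and therefore inspectable) classifier and recommend changing the single feature that most strongly moves a sample toward the target expressive class. The feature names and the `formative_feedback` function below are hypothetical illustrations.

```python
import numpy as np

# Hypothetical interpretable features describing a vocal emotion display.
FEATURES = ["pitch_mean", "pitch_var", "loudness", "speech_rate"]

def formative_feedback(w, b, x, target=1):
    """For a linear classifier p(expressive) = sigmoid(w.x + b), recommend the
    one feature whose change most strongly moves x toward the target class."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    # Gradient of the target-class probability with respect to each feature.
    grad = (1.0 if target == 1 else -1.0) * p * (1.0 - p) * w
    k = int(np.argmax(np.abs(grad)))
    direction = "increase" if grad[k] > 0 else "decrease"
    return f"{direction} {FEATURES[k]}", p

# A flat, quiet utterance scored as under-expressive: the model's largest
# weight is on mean pitch, so that is the behavior to modify first.
w = np.array([2.0, 0.1, -1.0, 0.3])
x = np.array([-1.0, 0.0, 0.0, 0.0])
advice, p = formative_feedback(w, b=0.0, x=x)
print(advice)  # → increase pitch_mean
```

Because the model is linear, the recommendation can be read directly off its weights, which is the sense in which such a framework is interpretable rather than a black box.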
The First Large-Scale Benchmark for 3D Object Understanding
Title: PartNet: A Large-Scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding by Kaichun Mo (Stanford University), Shilin Zhu (University of California San Diego), Angel X. Chang (Simon Fraser University), Li Yi (Stanford University), Subarna Tripathi (Intel AI Lab), Leonidas J. Guibas (Stanford University), Hao Su (University of California San Diego)
Why It Matters: Identifying objects and their parts is critical to how humans understand and interact with the world. For example, using a stove requires not only identifying the stove itself, but also its subcomponents: its burners, control knobs and more. This same capability is essential to many AI vision, graphics and robotics applications, including predicting object functionality, human-object interaction, simulation, shape editing and shape generation. This wide range of applications has spurred great demand for large 3D datasets with part annotations. However, existing 3D shape datasets provide part annotations for only a relatively small number of object instances, or provide only coarse, non-hierarchical part annotations, making them unsuitable for applications that require detailed part-level understanding. In other words, if we want AI to be able to make us a cup of tea, large new datasets are needed to better support the training of visual AI applications to parse and understand objects with many small details or important subcomponents. More information on PartNet here.
Abstract: In this paper, Intel presents PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level and hierarchical 3D part information. Using our dataset, we establish three benchmarking tasks for evaluating 3D part recognition, and we benchmark four state-of-the-art 3D deep learning algorithms against them. We then introduce a novel method for part instance segmentation and demonstrate its superior performance over existing methods.
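A PartNet-style hierarchical annotation can be pictured as a tree of parts. The miniature below is a hypothetical structure with made-up field names, not the dataset’s actual file format; it shows how one object yields fine-grained leaf parts or coarser labels depending on where the hierarchy is cut.

```python
# Hypothetical miniature of a hierarchical part annotation:
# each node names a part and lists its sub-parts.
chair = {
    "name": "chair",
    "children": [
        {"name": "back", "children": [
            {"name": "back_frame", "children": []},
            {"name": "back_surface", "children": []},
        ]},
        {"name": "seat", "children": []},
        {"name": "base", "children": [
            {"name": "leg", "children": []},
            {"name": "leg", "children": []},
            {"name": "leg", "children": []},
            {"name": "leg", "children": []},
        ]},
    ],
}

def leaf_parts(node):
    """Fine-grained (leaf-level) part instances under a node."""
    if not node["children"]:
        return [node["name"]]
    parts = []
    for c in node["children"]:
        parts += leaf_parts(c)
    return parts

def parts_at_depth(node, depth):
    """Coarser labels obtained by cutting the hierarchy at a fixed depth."""
    if depth == 0 or not node["children"]:
        return [node["name"]]
    labels = []
    for c in node["children"]:
        labels += parts_at_depth(c, depth - 1)
    return labels

print(len(leaf_parts(chair)))    # → 7
print(parts_at_depth(chair, 1))  # → ['back', 'seat', 'base']
```

Instance-level annotation means the four legs are separate entries rather than one merged “legs” region, which is what makes tasks like part instance segmentation well defined.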