Date: Sunday, June 9th, 2019
Time: 2pm – 6.30pm
Advances in deep learning have recently fuelled unprecedented progress across a number of disciplines, in particular computer vision and natural language processing. An associated outcome has been a dramatic improvement in the challenging task of ‘machine comprehension’, which depends on both common-sense acquisition and reasoning models. The impact that deep learning has had on these two fundamental problems has in turn enabled progress in applications such as machine reading, visual question answering, and imagination-based control. In this half-day workshop, NAVER LABS and Clova AI will demonstrate how these research topics are actively being pursued at an industrial scale. Clova AI is the AI content and service platform of NAVER, South Korea’s internet giant, which runs the world’s 5th largest search engine (28M unique visitors per day) and offers over 130 content services. NAVER is ranked in the Top 10 most innovative global companies (Forbes 2018) and is 6th on Fortune’s ‘Future 50’ list (2018).
The introductory talk describes the various activities of NAVER and their impact across the web industry in Korea, Asia and beyond. From there, NAVER LABS Europe [Grenoble, France] and Clova AI [South Korea] will be introduced, along with how these two research groups are addressing machine comprehension and autonomous control in our daily lives. Current research opportunities will be mentioned before presenting the afternoon’s program.
Over the last 5 years, differentiable programming and deep learning have become the de facto standard for a vast set of decision problems in data science. Three factors have enabled this rapid evolution: the availability and systematic collection of large quantities of data bearing traces of intelligent behaviour; the appearance of standardized development frameworks, which has dramatically accelerated differentiable programming and its application to the major modalities of the numerical world (image, text and sound); and the availability of powerful and affordable computational infrastructure. Together, these factors have enabled this advance on the path toward machine intelligence.
Despite these factors, new limits have arisen which need to be addressed. Automatic common-sense acquisition and reasoning capabilities are two such frontiers that major machine learning research labs are working on. In this context, human language has once again become a channel of choice for such research. In this talk, machine reading is used as a medium to illustrate the problem and to describe recent progress in our machine reading research project.
We’ll first describe several limitations of current decision models. Within this context, we’ll discuss ReviewQA, a machine reading corpus over human-generated hotel reviews that aims to encourage research around these questions. We’ll then speak about adversarial learning and how this approach makes learning more robust. Finally, we will share our current work on HotpotQA and related models.
References:
Adversarial Networks for Machine Reading. Quentin Grail, Julien Perez and Tomi Silander. TAL 59-02, pp. 77–100.
ReviewQA: a relational aspect-based opinion reading dataset. arXiv, 2018.
Over the past few years, deep learning algorithms have achieved remarkable results in areas such as image recognition, speech recognition and machine translation. Despite these improvements, the growing complexity of neural networks makes manually tuning them to improve performance increasingly laborious. Because of this, research has been actively conducted on algorithms that can automate tuning, such as Hyper-parameter Optimization and Neural Architecture Search. Here we propose a new cloud-based AutoML framework that can efficiently utilise shared computing resources while supporting a variety of AutoML algorithms. By integrating a convenient web-based user interface, visualization, and analysis tools, users can easily control optimization procedures and build useful insights through iterative analysis. We demonstrate the application of our AutoML framework on tasks such as image recognition and question answering, and show that our framework is more convenient than previous work. We also show that it is capable of providing interesting observations through its analysis tools.
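To give a flavour of what such a framework automates, here is a minimal sketch of hyper-parameter optimization via random search. This is an illustration only, not the proposed framework: the search space, the `train_and_evaluate` stand-in and its synthetic score are all hypothetical.

```python
import random

def train_and_evaluate(lr, batch_size):
    # Stand-in for a real training job; returns a synthetic validation score.
    # A cloud AutoML framework would instead launch this on shared workers.
    return 1.0 - abs(lr - 0.01) * 10.0 - abs(batch_size - 64) / 1000.0

def random_search(n_trials, seed=0):
    # Sample configurations at random and keep the best-scoring one.
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(n_trials):
        config = {
            "lr": 10 ** rng.uniform(-4, -1),             # log-uniform learning rate
            "batch_size": rng.choice([16, 32, 64, 128]),
        }
        score = train_and_evaluate(**config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config

score, config = random_search(50)
print(config)
```

More sophisticated AutoML algorithms (Bayesian optimization, neural architecture search) replace the random sampling step with a model of the score surface, but keep the same trial-evaluate-select loop.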
N.B. This presentation is a bit different from the rest of the workshop. It does not address deep learning or machine reading, although it does address sequential decision problems.
We consider a version of the classic discrete-time linear quadratic Gaussian (LQG) control problem in which making observations comes at a cost. In the simple case where the system state is a scalar, we find that a simple threshold policy is optimal. This policy makes observations when the posterior variance exceeds a threshold. Although this problem has been studied since the 1960s, ours is the first known proof of this fact. Our presentation gives an intuitive picture of the tools used in the proof (mechanical words, the iteration of discontinuous mappings, Whittle indices), which are simple and powerful yet not widely known.
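The threshold rule above can be illustrated with a toy simulation of the scalar Kalman posterior-variance recursion; all parameter values here are illustrative and not taken from the paper.

```python
def simulate_threshold_policy(P0, threshold, steps, a=1.0, Q=1.0, R=1.0):
    # P0: initial posterior variance; a: state-transition coefficient;
    # Q: process-noise variance; R: observation-noise variance.
    P, schedule = P0, []
    for _ in range(steps):
        P_pred = a * a * P + Q            # variance grows under the dynamics
        observe = P_pred > threshold      # pay for a measurement only when
        if observe:                       # the predicted variance is too large
            P = P_pred * R / (P_pred + R) # standard Kalman variance update
        else:
            P = P_pred
        schedule.append(observe)
    return P, schedule

P, schedule = simulate_threshold_policy(P0=0.0, threshold=1.5, steps=10)
print(schedule)
```

Starting from zero variance, the policy skips the first observation and then observes at every step once the predicted variance settles above the threshold; with other parameter choices the schedule becomes periodic.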
Our result also gives insight into other Markov decision problems involving a trade-off between the cost of acquiring and processing data, and uncertainty due to a lack of data. In particular, we find a large family of uncertainty cost functions for which threshold policies are optimal. Also, we discuss near-optimal policies for observing multiple time series.
The paper associated with this presentation was recently published in JMLR at http://jmlr.org/papers/volume20/17-185/17-185.pdf
As interest in video continues to grow, many IT companies have begun to apply machine learning to their video content and applications. With the recent rapid development of machine learning models for images, the same methodologies are often applied directly to video, but this approach often underperforms due to the distinctive characteristics of video data.
To exploit these characteristics, we investigate spatio-temporal modeling methodologies for tasks such as pose tracking, motion similarity measurement and action recognition. During our research, we’ve developed several data augmentation and regularization methods for training machine learning models. These methods show performance improvements on each of the models, which in turn improves the quality of the applications. We demonstrate our spatio-temporal approaches to the above tasks with relevant industry applications.
Deep neural networks have achieved human-level performance on real-world problems. However, recent studies have shown that these deep models behave in ways fundamentally different from humans. They easily change their predictions when small corruptions such as blur and noise are applied to the input (lack of robustness), and they often produce highly confident predictions on out-of-distribution samples (improper uncertainty measures).
In this talk, we focus on optimization and regularization techniques for deep models while keeping the network architecture fixed to state-of-the-art models, e.g., ResNet and PyramidNet. For example, by training the model on adversarially generated samples, the network becomes robust against adversarial perturbations.
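As a minimal sketch of adversarial training (not the speakers' actual setup, which uses deep networks), here is the Fast Gradient Sign Method applied to a linear model with a logistic loss; the toy data and all hyper-parameter values are hypothetical.

```python
import numpy as np

def fgsm(x, y, w, eps):
    # Fast Gradient Sign Method for the loss log(1 + exp(-y * w.x)):
    # perturb x in the direction that increases the loss the most,
    # within an L-infinity budget of eps.
    margin = y * np.dot(w, x)
    grad_x = -y * w / (1.0 + np.exp(margin))   # d(loss)/dx
    return x + eps * np.sign(grad_x)

def adversarial_train(X, Y, eps=0.1, lr=0.1, epochs=50):
    # Train the linear model on adversarially perturbed inputs
    # instead of the clean ones.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            x_adv = fgsm(x, y, w, eps)
            margin = y * np.dot(w, x_adv)
            w += lr * y * x_adv / (1.0 + np.exp(margin))  # descend the loss
    return w

X = np.array([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0], [-2.0, 0.0]])
Y = np.array([1, 1, -1, -1])
w = adversarial_train(X, Y)
```

For deep networks the same idea applies, with the input gradient computed by backpropagation and the perturbed batch fed through the usual training step.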
We’ll first introduce our recently proposed CutMix augmentation technique. Despite its simplicity and low overhead, our experiments show that CutMix outperforms state-of-the-art regularization techniques on various benchmarks. Finally, we’ll introduce our recent study of state-of-the-art regularization techniques on robustness and uncertainty-estimation benchmarks, and show that a well-regularized model is a powerful baseline with better generalization abilities.
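The core CutMix operation, as described in the public paper, is simple enough to sketch: cut a random box from one image, paste it into another, and mix the labels in proportion to the pasted area. This is a simplified single-image sketch, not the authors' training code; in practice `lam` is drawn from a Beta(alpha, alpha) distribution per batch, while here it is passed in for determinism.

```python
import numpy as np

def cutmix(img_a, img_b, label_a, label_b, lam, rng):
    # lam is the target share of img_a in the mixed example.
    H, W = img_a.shape[:2]
    cut_h = int(H * np.sqrt(1.0 - lam))
    cut_w = int(W * np.sqrt(1.0 - lam))
    cy, cx = int(rng.integers(H)), int(rng.integers(W))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]  # paste the box from img_b
    # Recompute lambda from the exact clipped box before mixing labels.
    lam_exact = 1.0 - (y2 - y1) * (x2 - x1) / (H * W)
    return mixed, lam_exact * label_a + (1.0 - lam_exact) * label_b

rng = np.random.default_rng(0)
a, b = np.zeros((32, 32)), np.ones((32, 32))
mixed, label = cutmix(a, b, np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5, rng)
```

Unlike input dropout or Cutout, no pixels are wasted: the removed region is filled with informative content from another training image, which is part of why the technique regularizes so effectively.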