Date: Sunday June 9th 2019
Room: 201
Time: 2pm – 6.30pm
Advances in deep learning have recently fuelled unprecedented progress across a number of disciplines, in particular computer vision and natural language processing. An associated outcome has been dramatic improvement in the challenging task of ‘machine comprehension’, which depends on both common-sense acquisition and reasoning models. The impact that deep learning has had on these two fundamental problems has likewise enabled progress in applications such as machine reading, visual question answering, and imagination-based control. In this half-day workshop, NAVER LABS and Clova AI will demonstrate how these research topics are actively being pursued at an industrial scale. Clova AI is the AI content and service platform of NAVER, South Korea’s internet giant, which runs the world’s 5th largest search engine (28M unique visitors per day) and offers over 130 content services. NAVER is ranked among the Top 10 most innovative global companies (Forbes 2018) and is 6th on Fortune’s ‘Future 50’ list (2018).
The introductory talk describes the various activities of NAVER and their impact across the web industry in Korea, Asia and beyond. It then introduces NAVER LABS Europe (Grenoble, France) and Clova AI (South Korea) and explains how these two research groups are addressing machine comprehension and autonomous control in our daily lives. Current research opportunities will be mentioned before presenting the afternoon’s program.
Over the last 5 years, differentiable programming and deep learning have become the de facto standard for a vast set of decision problems in data science. Three factors have enabled this rapid evolution: the availability and systematic collection of large quantities of data containing traces of intelligent behaviour; the appearance of standardized development frameworks, which has dramatically accelerated differentiable programming and its application to the major modalities of the digital world, namely image, text and sound; and the availability of powerful and affordable computational infrastructure.
Despite these factors, new limits have arisen which need to be addressed. Automatic common-sense acquisition and reasoning capabilities are two such frontiers that major machine learning research labs are working on. In this context, human language has once again become a channel of choice for such research. In this talk, machine reading is used as a medium to illustrate the problem and to describe the progress made in our machine reading research project.
We’ll first describe several limitations of current decision models. In this context, we’ll discuss ReviewQA, a machine reading corpus over human-generated hotel reviews that aims to encourage research around these questions. We’ll then talk about adversarial learning and how this approach makes learning more robust. Finally, we’ll share our current work on HotpotQA and related models.
References:
Quentin Grail, Julien Perez and Tomi Silander. Adversarial Networks for Machine Reading. TAL 59-02, pp. 77–100.
ReviewQA: a relational aspect-based opinion reading dataset. arXiv, 2018.
Over the past few years, deep learning algorithms have achieved remarkable results in areas such as image recognition, speech recognition and machine translation. Despite these improvements, the growing complexity of neural networks makes manually tuning them to improve performance increasingly tedious. Because of this, research has been actively conducted on algorithms that can automate this tuning, such as hyper-parameter optimization and neural architecture search. Here we propose a new cloud-based AutoML framework that can efficiently utilise shared computing resources while supporting a variety of AutoML algorithms. By integrating a convenient web-based user interface, visualization and analysis tools, users can easily control optimization procedures and build useful insights through iterative analysis. We demonstrate the application of our AutoML framework on tasks such as image recognition and question answering, show that our framework is more convenient to use than previous work, and show that it can provide interesting observations through its analysis tools.
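To make the idea concrete, here is a minimal sketch of one strategy such a framework can schedule: plain random search over a hyper-parameter space. Everything in it is illustrative; the search space is hypothetical and the objective is a synthetic stand-in for a real training job, not part of the framework described in the talk.

    import math
    import random

    # Hypothetical search space for illustration; a real framework would
    # load this from a user-supplied configuration.
    SEARCH_SPACE = {
        "learning_rate": lambda: 10 ** random.uniform(-5, -1),   # log-uniform
        "batch_size":    lambda: random.choice([32, 64, 128, 256]),
        "dropout":       lambda: random.uniform(0.0, 0.5),
    }

    def train_and_evaluate(config):
        """Synthetic stand-in for a real training job: the score peaks near
        a learning rate of 1e-3 and a dropout of 0.2."""
        return (-abs(math.log10(config["learning_rate"]) + 3.0)
                - abs(config["dropout"] - 0.2))

    def random_search(n_trials=20, seed=0):
        """Sample configurations at random and keep the best-scoring one."""
        random.seed(seed)
        best_config, best_score = None, float("-inf")
        for _ in range(n_trials):
            config = {name: sample() for name, sample in SEARCH_SPACE.items()}
            score = train_and_evaluate(config)
            if score > best_score:
                best_config, best_score = config, score
        return best_config, best_score

    print(random_search())

In a cloud setting, the loop body is what gets distributed: each sampled configuration becomes an independent training job on the shared compute pool.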
N.B. This presentation is a bit different from the rest of the workshop. It does not address deep learning or machine reading, although it does address sequential decision problems.
We consider a version of the classic discrete-time linear quadratic Gaussian (LQG) control problem in which making observations comes at a cost. In the simple case where the system state is a scalar, we find that a simple threshold policy is optimal. This policy makes observations when the posterior variance exceeds a threshold. Although this problem has been studied since the 1960s, ours is the first known proof of this fact. Our presentation gives an intuitive picture of the tools used in the proof (mechanical words, the iteration of discontinuous mappings, Whittle indices), which are simple and powerful yet not widely known.
Our result also gives insight into other Markov decision problems involving a trade-off between the cost of acquiring and processing data, and uncertainty due to a lack of data. In particular, we find a large family of uncertainty cost functions for which threshold policies are optimal. Also, we discuss near-optimal policies for observing multiple time series.
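As an illustration of the threshold policy described above, the sketch below handles the scalar case: the posterior variance of a Kalman filter is propagated forward, and a (costly) measurement is taken only when that variance exceeds the threshold. The parameter values are illustrative, not taken from the paper.

    def simulate_threshold_policy(a=1.0, q=1.0, r=0.5, threshold=2.0, steps=50):
        """Scalar system x' = a*x + w with Var(w) = q; observation y = x + v
        with Var(v) = r. Returns the time steps at which the policy observes."""
        p = 0.0              # posterior variance of the state estimate
        observations = []
        for t in range(steps):
            p = a * a * p + q        # predict: uncertainty grows with the dynamics
            if p > threshold:        # threshold rule: pay for a measurement
                p = p * r / (p + r)  # Kalman update shrinks the variance
                observations.append(t)
        return observations

    print(simulate_threshold_policy())

Because the variance recursion is deterministic, the policy settles into a periodic observation pattern, which is where the mechanical words mentioned above enter the analysis.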
The paper associated with this presentation was recently published in JMLR at http://jmlr.org/papers/volume20/17-185/17-185.pdf
As interest in video continues to grow, many IT companies have begun to apply machine learning to their video content and applications. With the recent rapid development of machine learning models for images, the same methodologies are often applied directly to video, but this approach often falls short in performance due to the distinctive characteristics of video.
To exploit the characteristics of video data, we investigate spatio-temporal modeling methodologies for various tasks such as pose tracking, motion similarity measurement and action recognition. In the course of this research, we’ve developed several data augmentation and regularization methods for training these models. Our methods improve the performance of each model, and thereby the quality of the applications. We demonstrate our spatio-temporal approaches to the above tasks with relevant industry applications.
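The talk’s specific augmentation methods aren’t detailed here, but the sketch below illustrates the general flavour of spatio-temporal augmentation: the temporal sampling of a clip is jittered, while the spatial transform (a horizontal flip) is sampled once and applied to every frame, since per-frame randomness would destroy motion coherence. The function and its parameters are hypothetical.

    import numpy as np

    def augment_clip(clip, n_frames=16, rng=None):
        """clip: float array of shape (T, H, W, C); returns (n_frames, H, W, C)."""
        if rng is None:
            rng = np.random.default_rng()
        T = clip.shape[0]
        # Temporal jitter: pick a random window and sample frames within it,
        # keeping them in temporal order.
        start = int(rng.integers(0, max(T - n_frames, 0) + 1))
        window = np.arange(start, min(start + 2 * n_frames, T))
        idx = np.sort(rng.choice(window, size=n_frames, replace=True))
        frames = clip[idx]
        # Spatially consistent flip: one decision for the whole clip, so the
        # motion pattern stays coherent across frames.
        if rng.random() < 0.5:
            frames = frames[:, :, ::-1, :]
        return frames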
Deep neural networks have given us human-level performance on real-world problems. However, recent studies have shown that these deep models behave in ways that are fundamentally different from humans. They easily change their predictions when small corruptions such as blur and noise are applied to the input (lack of robustness), and they often produce highly confident predictions on out-of-distribution samples (improper uncertainty measure).
In this talk, we focus on optimization and regularization techniques for deep models while keeping the network architecture fixed to state-of-the-art models such as ResNet and PyramidNet. For example, by training the model on adversarially generated samples, the network becomes robust against adversarial perturbations.
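As a hedged sketch of what such adversarial training can look like (the talk does not specify the attack, so the single-step FGSM of Goodfellow et al. is used here purely for illustration), assuming a PyTorch classifier with inputs in [0, 1]:

    import torch
    import torch.nn.functional as F

    def fgsm_training_step(model, optimizer, x, y, epsilon=8 / 255):
        """One adversarial training step on a batch (x, y)."""
        # Craft adversarial examples: move x along the sign of the input gradient.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + epsilon * grad.sign()).clamp(0, 1).detach()

        # Standard supervised update, but on the adversarial batch.
        optimizer.zero_grad()
        adv_loss = F.cross_entropy(model(x_adv), y)
        adv_loss.backward()
        optimizer.step()
        return adv_loss.item()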
We’ll first introduce our recently proposed CutMix augmentation technique. Despite its efficiency and simplicity, our experiments show that CutMix outperforms state-of-the-art regularization techniques on various benchmarks. Finally, we’ll present our recent study of state-of-the-art regularization techniques on robustness and uncertainty-estimation benchmarks, and show that a well-regularized model is a powerful baseline with better generalization abilities.
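For reference, here is a compact sketch of the core CutMix operation: a rectangular patch from a shuffled copy of the batch is pasted into each image, and the labels are mixed in proportion to the patch area. This is a simplified rendering of the published algorithm, not the official implementation.

    import numpy as np
    import torch

    def cutmix(x, y, alpha=1.0):
        """x: (N, C, H, W) image batch, y: (N,) integer labels.
        Returns mixed images and (y_a, y_b, lam) for the mixed loss."""
        lam = np.random.beta(alpha, alpha)
        index = torch.randperm(x.size(0))
        H, W = x.size(2), x.size(3)

        # Sample a box whose area is roughly (1 - lam) of the image.
        cut_h, cut_w = int(H * np.sqrt(1 - lam)), int(W * np.sqrt(1 - lam))
        cy, cx = np.random.randint(H), np.random.randint(W)
        y1, y2 = np.clip(cy - cut_h // 2, 0, H), np.clip(cy + cut_h // 2, 0, H)
        x1, x2 = np.clip(cx - cut_w // 2, 0, W), np.clip(cx + cut_w // 2, 0, W)

        # Paste the patch from a shuffled copy of the batch.
        x[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
        # Recompute lam from the actual (clipped) patch area.
        lam = 1 - (y2 - y1) * (x2 - x1) / (H * W)
        return x, y, y[index], lam

    # Training loss for a CutMix batch:
    #   lam * cross_entropy(pred, y_a) + (1 - lam) * cross_entropy(pred, y_b)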
To make robots autonomous in real-world everyday spaces, they should be able to learn, from their interactions within these spaces, how best to execute tasks specified by non-expert users in a safe and reliable way. To do so requires sequential decision-making skills that combine machine learning, adaptive planning and control in uncertain environments, as well as solving hard combinatorial optimization problems. Our research combines expertise in reinforcement learning, computer vision, robotic control, sim2real transfer, large multimodal foundation models and neural combinatorial optimization to build AI-based architectures and algorithms that improve robot autonomy and robustness when completing everyday complex tasks in constantly changing environments. More details on our research can be found in the Explore section below.
For a robot to be useful it must be able to represent its knowledge of the world, share what it learns and interact with other agents, in particular humans. Our research combines expertise in human-robot interaction, natural language processing, speech, information retrieval, data management and low code/no code programming to build AI components that will help next-generation robots perform complex real-world tasks. These components will help robots interact safely with humans and their physical environment, other robots and systems, represent and update their world knowledge and share it with the rest of the fleet. More details on our research can be found in the Explore section below.
Visual perception is a necessary part of any intelligent system that is meant to interact with the world. Robots need to perceive the structure, the objects, and people in their environment to better understand the world and perform the tasks they are assigned. Our research combines expertise in visual representation learning, self-supervised learning and human behaviour understanding to build AI components that help robots understand and navigate in their 3D environment, detect and interact with surrounding objects and people and continuously adapt themselves when deployed in new environments. More details on our research can be found in the Explore section below.
Details of the 2024 gender equality index score (based on 2023 data) for NAVER France: 87/100.
Indicator details:
1. Gender pay gap between women and men: 34/40 points
2. Gap in individual salary increases between women and men: 35/35 points
3. Employees receiving a raise on return from maternity leave: not calculable
4. Number of employees of the under-represented gender among the 10 highest salaries: 5/10 points
The NAVER France progression targets set in 2022 (Indicator n°1: +2 points in 2024 and Indicator n°4: +5 points in 2025) have been achieved.
The research we conduct on expressive visual representations is applicable to visual search, object detection, image classification and the automatic extraction of 3D human poses and shapes that can be used for human behavior understanding and prediction, human-robot interaction or even avatar animation. We also extract 3D information from images that can be used for intelligent robot navigation, augmented reality and the 3D reconstruction of objects, buildings or even entire cities.
Our work covers the spectrum from unsupervised to supervised approaches, and from very deep architectures to very compact ones. We’re excited about the promise of big data to bring big performance gains to our algorithms but also passionate about the challenge of working in data-scarce and low-power scenarios.
Furthermore, we believe that a modern computer vision system needs to continuously adapt itself to its environment and improve itself via lifelong learning. Our driving goal is to use our research to deliver embodied intelligence to our users through robotics, autonomous driving, phone cameras and any other visual means, reaching people wherever they may be.