Frederique Segond
Fostering Language Resources Network (FLaReNet), Venezia, Italy, May 26-27, 2011.
Since the beginning of humanity, data, in its different forms, has been recognized as essential to knowledge and as the principal ingredient of innovation. In this short position paper, which follows the “Data, Information, Knowledge and Wisdom (DIKW)” paradigm, we present what is specific to the era of Information Technology. Using the example of rare diseases, we conclude that it is not only the amount of data but also the capacity to make sense of it, learn from it and turn it into knowledge that will speed up the innovation process.
History is full of examples that show how collecting data and making sense of it have been central to radical changes in culture and science. Greek philosophers such as Aristotle were able to build scientific theories with little data but, little by little, the qualitative approach has been complemented by the quantitative, as large amounts of data are required to sustain scientific results and theories.
The Ancient Library of Alexandria is one example of data collection in Antiquity that aimed at capturing knowledge from the world for scholars to study and hopefully to innovate.
Monks and, later on, copyists were part of the tradition of collecting data and knowledge of the world in order to learn from them and then educate others.
At the beginning of the 17th century Galileo collected observations with his telescope, and the theory that he developed from these observations has served as the basis of modern astronomy, which today continues to interpret large amounts of data to obtain scientific results.
In the 18th century more and more scientists and philosophers supported observation and experience rather than purely intellectually based theories.
The French naturalist Comte de Buffon influenced peers like Lamarck and Cuvier with the publication of the thirty-six volumes of his “Histoire naturelle, générale et particulière” and was considered by Darwin to be the first author who treated evolution in a scientific manner.
In the same period, the “Encyclopédie, ou dictionnaire raisonné des sciences, des arts et des métiers”, led by Diderot, collected data on the sciences and mechanical arts with the goal of “changing the way people think”. It is recognized as an important intellectual vector of the French Revolution, which eventually led to new political models.
In the 19th century Durkheim proposed a scientific approach to society using quantitative methods and gave birth to modern sociology.
In the 20th century, and closer to the domain of FLaReNet, linguists and ethnographers such as Sapir and Lévi-Strauss spent their lives collecting data on different languages and cultures, influencing the work of several generations of linguists, anthropologists and ethnographers.
What has dramatically changed with the advent of the Internet and information technologies is that data which was previously so difficult to collect became, in the course of only a few years, extremely easy to access and available in much greater quantity. All of a sudden we went from the dream of having more data to the nightmare of data overload, or data obesity. Nowadays data is no longer only encyclopedic as before: it can also be emails, Facebook walls and exchanges on Twitter. Today, data is gathered not only from the Internet but also from supermarket receipts, mobile phones, cars and planes, and soon even refrigerators, ovens and any other type of electronic device we use will provide data. Much of the data that previously simply disappeared after having been used for a specific purpose is now stored, distributed and even resold for analysis, interpretation or other purposes, of which the best, if not the most frequent, is innovation.
The definition of what data is has evolved over the course of history. We adopt the general definition of data as symbols such as words, numbers, codes or tables. These symbols (data) can then be linked into sentences, paragraphs, equations, concepts and ideas to give birth to information. Information can then be further structured and interpreted to become knowledge. With recent advances in the semantic web, natural language processing and knowledge management, to cite only the fields most relevant to our purpose, the analysis of data has made huge progress. So what is the link to innovation?
Among the many existing definitions of innovation, a distinction is often made between invention and innovation. Today innovation is generally associated with two ingredients, a technology and people willing to use or buy that technology, while an invention may have no commercial value. Innovation is usually associated with the idea of benefit. Almost any company dealing with data that claims to be innovative advertises its capacity to turn data into wine and give you a competitive advantage because it performs semantic analysis, knowledge discovery, business intelligence or analytics in general.
What these companies offer their customers is support in understanding their data so as to make better use of it in marketing, technical development or strategic decisions. There are many examples. One is opinion mining for companies selling products of any type, including politicians selling a political discourse. Another is risk: being able to make sense of huge amounts of data is important for the risk societies that we now live in, be it for homeland security, environmental risk or the risks associated with drugs, to name but a few. The opportunity to make sense of data, to link information generated from different sources and to reason on that basis has completely changed the way investigations are pursued in law, crime and… medicine.
Medicine has always been a big consumer of data for innovative purposes. The more data a medical domain has, the more medical progress is made. National health institutions invest large amounts of time and money to obtain real user data. Examples include blood tests for pregnant women for the early detection of Down syndrome, and the collection of data on the human genome to enable great progress in treating and curing genetic diseases. To better understand diseases and how to properly prevent and cure them, medical doctors need to relate many types of knowledge such as symptoms, treatments, genes and phenotypes. To do so they use data from collections, communications, publications, patient records and medical archives. Many hospitals hold archives of abundant and very precious data that could be used for epidemiological studies. However, access to this data and the links within and across it are as important as its actual quantity. In the same medical domain, the study of rare diseases is, by definition, characterized by the fact that very little data exists. But it is precisely because such data is rare that it is important to capture it and link it with other data, such as, in the case of rare diseases, data on genes.
We have given examples of how data is the basic building block of innovation, prior to becoming information and knowledge. We conclude that the quantity of data alone is not sufficient for innovation. What is of equal importance is the ability to link the information carried by this data in order to discover and develop new paradigms.