Cohort analytics: efficiency and applicability

Published by Bernard Omidvar-Tehrani on 6 August 2020

Bernard Omidvar-Tehrani, Sihem Amer-Yahia, Laks V.S. Lakshmanan

The International Journal on Very Large Data Bases (VLDBJ), Springer, August 2020

Abstract

The abundant availability of health-care data calls for effective analysis methods to help medical experts gain a better understanding of their patients and their health. The focus of existing work has been largely on prediction. In this paper, we introduce Core, a framework for cohort “representation” and “exploration”. Our contributions are twofold. First, we formalize cohort representation as the problem of aggregating the trajectories of its patients. This problem is challenging because cohorts often consist of hundreds of patients who underwent medical actions of various types at different points in time. We prove that producing a representative cohort trajectory is NP-complete via a reduction from the Multiple Sequence Alignment problem. We propose a heuristic that extends the Needleman-Wunsch algorithm for sequence matching to handle temporal sequences. To further improve cohort representation efficiency, we introduce “trajectory families” and “stratified sampling”. Our second contribution is formalizing the problem of cohort exploration as finding a set of cohorts that are similar to a cohort of interest and that maximize entropy. This problem is challenging because the potential number of similar cohorts is huge. We prove NP-completeness via a reduction from the Maximum Edge Subgraph problem. To address complexity, we develop a multi-staged approach based on limiting the search space to “contrast cohorts”.
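
For readers unfamiliar with the sequence-matching step mentioned in the abstract, the following is a minimal sketch of the classic Needleman-Wunsch global alignment for two sequences of medical actions. It is not the temporal extension proposed in the paper: timestamps are ignored, and the function name, the action labels and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Classic global alignment of sequences a and b (illustrative scoring).

    Returns (score, aligned_a, aligned_b); None marks a gap position.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append(None)
            i -= 1
        else:
            aligned_a.append(None)
            aligned_b.append(b[j - 1])
            j -= 1

    return score[n][m], aligned_a[::-1], aligned_b[::-1]


# Hypothetical patient trajectories, given as ordered action labels.
p1 = ["admission", "scan", "surgery"]
p2 = ["admission", "surgery"]
total, row1, row2 = needleman_wunsch(p1, p2)
print(total, row1, row2)
```

In this toy input the second trajectory lacks the "scan" action, so the alignment places a gap opposite it. A temporal variant would additionally have to account for when each action occurred, which this sketch does not attempt.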
