On Context-Diverse Repeats and their Incremental Computation - Naver Labs Europe
loader image

The context in which a substring appears is an important notion to identify – for example – its semantic meaning. However, existing classes of repeats fail to take this into account directly. We present here xkcd-repeats, a new family of repeats characterized by the number of different symbols at the left and right of their occurrences. These repeats include as special extreme cases maximal and super-maximal repeats. We give sufficient and necessary condition to bound their number linearly in the size of the sequence, and show an optimal algorithm that computes them in linear time – given a suffix array – independent on the size of the alphabet, as well as two other algorithms that are faster in practice. Additionally, we provide an independent and general framework that allows to compute these (and other) repeats incrementally; extending the application space of repeats in a streaming framework.

NAVER LABS Europe
NAVER LABS Europe
Ceci correspond à une petite biographie d'environ 200 caractéres