Abstract
Maziar Moradi Fard, Thibaut Thonet, Eric Gaussier
European Conference on Information Retrieval (ECIR), Lisbon, Portugal (virtual event), 14-17 April, 2020
Abstract
Different users may be interested in different clustering views underlying a given collection (e.g., topic and writing style in documents). Enabling them to provide constraints reflecting their needs can then help obtain tailored clustering results. For document clustering, constraints can be provided in the form of seed words, each cluster being characterized by a small set of words. This seed-guided constrained document clustering problem was recently addressed through topic modeling approaches. In this paper, we jointly learn deep representations and bias the clustering results through the seed words, leading to a Seed-guided Deep Constrained Document Clustering approach. Its effectiveness is demonstrated on five public datasets.