Stefania Castellani, Aaron Kaplan, Frédéric Roulland, Jutta Willamowski, Antonietta Grasso
ICEIS 2009, 11th International Conference on Enterprise Information Systems, Milan, Italy, 6-10 May 2009.
Please note that the attached PDF is a pre-publication version that may be used for research, non-commercial purposes only.
This paper received the best paper award in the area of human-computer interaction at ICEIS 2009.
In an information retrieval system, a thesaurus can be used for query expansion, i.e., adding words to queries in order to improve recall. We propose a semi-automatic, interactive approach to creating and maintaining domain-specific thesauri for query expansion. Domain-specific thesauri are especially needed in highly technical domains, where query expansion with a general thesaurus introduces more noise than useful results. Our semi-automatic approach is a compromise between fully manual approaches, which produce high-quality thesauri but at a prohibitively high cost, and fully automatic approaches, which are cheap but produce thesauri of limited quality. This article describes our approach and the architecture of the system that implements it, named Cannelle. Cannelle exploits user query logs and natural language processing to identify valuable synonymy candidates, and lets editors interactively explore and validate these candidates in the context of a domain-specific searchable knowledge base. We evaluated the system in the domain of online troubleshooting, where the proposed method improved the quality of the search results obtained.
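To make the notion of query expansion concrete, the following is a minimal sketch of expanding a query with a domain-specific thesaurus. It is not Cannelle's implementation: the thesaurus entries, the `THESAURUS` name, and the `expand_query` function are hypothetical, and the sketch assumes whitespace tokenization and exact term lookup rather than the natural language processing the paper describes.

```python
from typing import Dict, List, Set

# Hypothetical thesaurus: each term maps to editor-validated synonyms.
# The entries are invented for illustration, not taken from Cannelle.
THESAURUS: Dict[str, Set[str]] = {
    "jam": {"misfeed", "stuck"},
    "toner": {"cartridge"},
}


def expand_query(query: str, thesaurus: Dict[str, Set[str]]) -> List[str]:
    """Return the query terms followed by their synonyms, without duplicates."""
    terms = query.lower().split()  # assumption: simple whitespace tokenization
    expanded: List[str] = []
    seen: Set[str] = set()

    # Keep the original terms first, in their original order.
    for term in terms:
        if term not in seen:
            seen.add(term)
            expanded.append(term)

    # Append synonyms for each original term; sorting keeps the order stable.
    for term in terms:
        for synonym in sorted(thesaurus.get(term, set())):
            if synonym not in seen:
                seen.add(synonym)
                expanded.append(synonym)

    return expanded
```

Under these assumptions, `expand_query("paper jam", THESAURUS)` adds "misfeed" and "stuck" to the original terms, so documents that describe the same problem in different words can be matched. A general-purpose thesaurus would also propose unrelated senses of "jam", which is the kind of noise a domain-specific thesaurus is meant to avoid.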