Motivation: Looking for relevant publications for manual database annotation is a tedious task. In this paper, we show that combining natural language processing (NLP) and classification tools can help re-rank the documents returned by PubMed according to their relevance to SWISS-PROT annotation.

Results: With a probabilistic latent categoriser (PLC), we obtained 69% recall and 59% precision for relevant documents in a representative query. As the PLC technique provides the relative contribution of each term to the final document score, we used the symmetric Kullback-Leibler divergence to determine the most discriminating words for SWISS-PROT medical annotation. This information should help curators better understand the classification results, and it is also of great value for fine-tuning the linguistic pre-processing of documents, which in turn can improve overall classifier performance.
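As an illustration of how a symmetric Kullback-Leibler divergence can rank discriminating words, the following minimal sketch computes the per-term contribution (p_w - q_w) log(p_w / q_w) between two class-conditional term distributions. It is not the PLC implementation: the function names, the add-one smoothing, and the toy token lists are all assumptions made for this example, and none of the data comes from the experiments reported here.

```python
import math
from collections import Counter


def term_distribution(docs, vocab, alpha=1.0):
    """Smoothed P(term | class) over a fixed vocabulary (add-alpha smoothing, an assumption)."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def symmetric_kl_contributions(p, q):
    """Per-term contribution to KL(p||q) + KL(q||p); each term is non-negative."""
    return {w: (p[w] - q[w]) * math.log(p[w] / q[w]) for w in p}


# Hypothetical toy corpus: tokenised documents for each class.
relevant = [["mutation", "disease", "patient"], ["mutation", "variant", "disease"]]
irrelevant = [["structure", "crystal", "binding"], ["binding", "kinetics", "patient"]]

vocab = sorted({tok for doc in relevant + irrelevant for tok in doc})
p = term_distribution(relevant, vocab)
q = term_distribution(irrelevant, vocab)

contrib = symmetric_kl_contributions(p, q)
ranked = sorted(contrib, key=contrib.get, reverse=True)  # most discriminating first
divergence = sum(contrib.values())  # total symmetric divergence

for word in ranked:
    print(f"{word:10s} {contrib[word]:.4f}")
```

In this toy setting, a word that is equally frequent in both classes contributes nothing to the divergence, while words concentrated in one class rank highest regardless of which class they favour.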

