GOTA: GO term annotation of biomedical literature

Last modified by Giacomo Domeniconi on 2020/10/12 16:44

Pietro Di Lena, Giacomo Domeniconi, Luciano Margara, Gianluca Moro

Background: Functional annotation of genes and gene products is a major challenge in the post-genomic era. Gene function curation currently relies largely on the manual assignment of Gene Ontology (GO) annotations to genes based on published literature. This annotation task is extremely time-consuming, so there is increasing interest in automated tools that can assist human experts.

Results: Here we introduce GOTA, a GO term annotator for biomedical literature. The proposed approach uses only information that is readily available from public repositories, and it is easily expandable to handle novel sources of information. We assess the classification capabilities of GOTA on a large benchmark set of publications. The overall performance is encouraging in comparison to the state of the art in multi-label classification over large taxonomies. Furthermore, the experimental tests provide some interesting insights into the potential improvement of automated annotation tools.

Conclusions: GOTA implements a flexible and expandable model for GO annotation of biomedical literature.

Keywords: Automated annotation; Text mining; Gene Ontology
BMC Bioinformatics 16:346, October 2015
	year = 2015,
	keywords = { Automated annotation; Text mining; Gene Ontology },
	status = {Published},
	venue_list = {--},
	scopusId = {2-s2.0-84945575113},
	url = {},
	month = {october},
	journal = {BMC Bioinformatics},
	author = {Di Lena, Pietro and Domeniconi, Giacomo and Margara, Luciano and Moro, Gianluca},
	title = {GOTA: GO term annotation of biomedical literature},
	pages = 346,
	volume = 16,
	doi = {10.1186/s12859-015-0777-8}}
