Random Perturbations of Term Weighted Gene Ontology Annotations for Discovering Gene Unknown Functionalities


Giacomo Domeniconi, Marco Masseroli, Gianluca Moro, Pietro Pinoli

Computational analyses for biomedical knowledge discovery greatly benefit from the availability of descriptions of gene and protein functional features expressed through controlled terminologies and ontologies, i.e., their controlled annotations. In recent years, several databases of such annotations have become available; yet these annotations are incomplete, and only some of them represent highly reliable, human-curated information. To predict and discover unknown or missing annotations, existing approaches use unsupervised learning algorithms. We propose a new learning method that allows supervised algorithms to be applied to unsupervised problems, achieving much better annotation predictions. This method, which we also extend from our preceding work with data weighting techniques, is based on generating artificial labeled training sets through random perturbations of the original data. We tested it on nine Gene Ontology annotation datasets; the results show that our approach is effective in predicting novel annotations, outperforming state-of-the-art unsupervised methods.
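As a rough illustration of the idea of building a labeled training set by randomly perturbing the original data, the following minimal sketch assumes a binary gene-by-term annotation matrix and a perturbation that removes existing annotations with a fixed probability. The function names (`perturb_annotations`, `build_training_set`), the `removal_prob` parameter, and the toy data are all hypothetical; the abstract does not specify the exact perturbation scheme, the term weighting, or the supervised algorithm, so none of those are reproduced here.

```python
import random


def perturb_annotations(matrix, removal_prob, seed=None):
    """Return a copy of a 0/1 gene-by-term matrix with some 1s flipped to 0.

    Assumption: the perturbation only removes annotations, each with
    probability `removal_prob`; it never adds new ones.
    """
    rng = random.Random(seed)
    perturbed = []
    for row in matrix:
        new_row = []
        for value in row:
            # Only existing annotations (1s) can be removed; 0s stay 0.
            if value == 1 and rng.random() < removal_prob:
                new_row.append(0)
            else:
                new_row.append(value)
        perturbed.append(new_row)
    return perturbed


def build_training_set(matrix, term_index, removal_prob, seed=None):
    """Build (features, labels) for one ontology term.

    Features are the perturbed annotation profiles of each gene; labels are
    the values of the chosen term in the original, unperturbed matrix.
    """
    features = perturb_annotations(matrix, removal_prob, seed)
    labels = [row[term_index] for row in matrix]
    return features, labels


# Toy data (hypothetical): 4 genes x 3 ontology terms.
annotations = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]

X, y = build_training_set(annotations, term_index=2, removal_prob=0.3, seed=0)
```

Under these assumptions, `X` and `y` could be passed to any supervised classifier; a model trained this way learns to recover annotations that the perturbation removed, which is the intuition for then applying it to the real matrix to suggest missing annotations.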

(keywords) Gene ontology; Biomolecular annotation prediction; Bioinformatics; Knowledge discovery; Supervised learning; Term weighting

Knowledge Discovery, Knowledge Engineering and Knowledge Management, Communications in Computer and Information Science 553, pages 181-197, 2015.
Ana Fred, Jan L. G. Dietz, David Aveiro, Kecheng Liu, Joaquim Filipe (eds.), Springer International Publishing.

@article{,
booktitle = {Knowledge Discovery, Knowledge Engineering and Knowledge Management},
year = 2015,
keywords = {Gene ontology; Biomolecular annotation prediction; Bioinformatics; Knowledge discovery; Supervised learning; Term weighting},
status = {Published},
url = {http://dx.doi.org/10.1007/978-3-319-25840-9_12},
editor = {Fred, Ana and Dietz, Jan L. G. and Aveiro, David and Liu, Kecheng and Filipe, Joaquim},
series = {Communications in Computer and Information Science},
publisher = {Springer International Publishing},
author = {Domeniconi, Giacomo and Masseroli, Marco and Moro, Gianluca and Pinoli, Pietro},
title = {Random Perturbations of Term Weighted Gene Ontology Annotations for Discovering Gene Unknown Functionalities},
isbn = {978-3-319-25839-3},
abstract = {Computational analyses for biomedical knowledge discovery greatly benefit from the availability of descriptions of gene and protein functional features expressed through controlled terminologies and ontologies, i.e., their controlled annotations. In recent years, several databases of such annotations have become available; yet these annotations are incomplete, and only some of them represent highly reliable, human-curated information. To predict and discover unknown or missing annotations, existing approaches use unsupervised learning algorithms. We propose a new learning method that allows supervised algorithms to be applied to unsupervised problems, achieving much better annotation predictions. This method, which we also extend from our preceding work with data weighting techniques, is based on generating artificial labeled training sets through random perturbations of the original data. We tested it on nine Gene Ontology annotation datasets; the results show that our approach is effective in predicting novel annotations, outperforming state-of-the-art unsupervised methods.},
pages = {181-197},
volume = 553,
doi = {10.1007/978-3-319-25840-9_12}}

Publication

— authors

Giacomo Domeniconi, Marco Masseroli, Gianluca Moro, Pietro Pinoli

— editors

Ana Fred, Jan L. G. Dietz, David Aveiro, Kecheng Liu, Joaquim Filipe

— status

published

— sort

paper in proceedings

Venue

— volume

Knowledge Discovery, Knowledge Engineering and Knowledge Management

— series

Communications in Computer and Information Science 553

— publication date

2015

— pages

181-197

URLs & IDs

— DOI

10.1007/978-3-319-25840-9_12

— print ISBN

978-3-319-25839-3

BibTeX

— BibTeX category
article
