Leveraging on publicly available databases for novel peptidase target discovery

Seminar – 2016/06/10 – Leveraging on publicly available databases for novel peptidase target discovery

Discovering new targets for peptidase enzymes by leveraging public databases

Title:

Leveraging on publicly available databases for novel peptidase target discovery

Seminar language:

English

Speaker:

Dr. Simone Marini

Akutsu Laboratory, Kyoto University

Abstract:

Proteolytic cleavage is a pivotal aspect of cell metabolism, both intracellular and extracellular. It is involved in cell differentiation and cycle control, stress and immune response, removal of abnormally folded proteins, and cell death. Dysregulated proteolytic activity may result in disease. Data Fusion consists of the integration of several information sources, including molecular biology data (e.g. microarrays, NGS data) and ontologies (e.g. Disease Ontology, Gene Ontology). The Data Fusion approach unveils novel protease-target pairs and protease-disease associations by leveraging the growing amount of available biological data, such as Disease Ontology, Gene Ontology, STRING and KEGG. A huge amount of indirectly related information is in fact available in public data sets. Peptidases and their targets are both proteins, and they share similarities and non-cleavage interactions listed in knowledge bases; both are encoded by genes, and gene interactions are likewise listed in databases. With Data Fusion we exploit these secondary information sources to infer novel peptidase targets.

Bio:

Simone Marini received both his BSc and MSc in Biomedical Engineering from the University of Pavia. He obtained a PhD in Bioengineering from the Hong Kong University of Science and Technology in 2012. From 2013 to 2015 he was back in Pavia, working at the Biomedical Informatics Laboratory “Mario Stefanelli”, and since 2016 he has been working at the Akutsu Laboratory, Kyoto University. His research interest is Machine Learning applied to Bioinformatics and Health Informatics, integrating heterogeneous data ranging from Electronic Health Records to Proteomics and Genomics.