Abstract
The computational annotation of proteins has become a crucial step in the functional characterisation of genomes. Many computational methods predict protein function by exploiting experimental data such as protein-protein interactions and gene expression. For newly sequenced organisms such experimental data are not available, limiting the feasible tools to sequence-based techniques.
In this thesis, I approach the problem of predicting protein function for newly sequenced organisms in three different ways: first, by exploiting the "guilt by association" principle in the context of protein-protein networks; second, by elucidating the domain architecture of proteins and associating them with functions; and finally, by identifying protein complexes and the function enriched in every complex. Each approach considers different aspects of the problem, and a wide variety of techniques is applied to address them. These techniques share the fundamental property of transferring information from well-studied organisms to those that are barely characterised, if at all.
Original language | English |
---|---|
Qualification | Ph.D. |
Awarding Institution | |
Supervisors/Advisors | |
Thesis sponsors | |
Award date | 1 May 2020 |
Publication status | Unpublished - 2019 |
Keywords
- Bioinformatics
- Protein function prediction
- Homology
- Computational Biology