Learning sparse representations for predicting drug side effects, disease genes and customer preferences

Diego Galeano Galeano

Department of Computer Science

Research output: Thesis › Doctoral Thesis

Abstract

Computational prediction methods that operate on pairs of objects are fundamental tools for understanding and modelling complex systems in biology, chemistry, and customer preference in recommender systems. I present four sparse matrix completion models that learn a sparse representation of objects from data consisting of associations between pairs of objects. The main goal of my models is to generalise, that is, to predict new relationships between pairs of objects. This thesis addresses the following problems: (1) drug-side effect frequency prediction; (2) drug-side effect prediction; (3) disease-gene prediction; and (4) user preference prediction in top-N recommender systems. I show that my sparse matrix completion models predict missing relationships in the data better than other state-of-the-art methods. My models are designed to favour interpretability. On the task of predicting the frequencies of drug side effects, I present a new algorithm for non-negative matrix factorisation that learns parts of the human anatomical system. On the task of predicting the presence or absence of drug side effects, I present a new algorithm that learns a sparse self-representation of objects, such that a given object, e.g. a side effect, is represented by a linear combination of a few other objects. In addition, my models naturally integrate structure knowledge in the form of graph networks, adding strong relational inductive biases without requiring well-defined heuristics or hand-crafted features.
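
As a rough illustration of what a sparse self-representation means, the sketch below learns a weight matrix W so that each column of a drug-by-side-effect matrix X is approximated by a linear combination of a few other columns. It is not the thesis's algorithm: the L1-penalised least-squares objective, the proximal-gradient (ISTA) solver, the function name `sparse_self_representation`, the parameter `lam`, and the toy matrix are all assumptions chosen for illustration.

```python
import numpy as np


def sparse_self_representation(X, lam=0.1, n_iter=500):
    """Illustrative sketch (assumed objective, not the thesis's method).

    Minimises 0.5 * ||X - X W||_F^2 + lam * ||W||_1 subject to diag(W) = 0,
    so that no column is allowed to represent itself.
    """
    n_items = X.shape[1]
    G = X.T @ X                                     # Gram matrix of the columns
    step = 1.0 / max(np.linalg.norm(G, 2), 1e-12)   # 1 / Lipschitz constant
    W = np.zeros((n_items, n_items))
    for _ in range(n_iter):
        grad = G @ W - G                            # gradient of the squared loss
        W = W - step * grad                         # gradient step
        # Soft-thresholding: proximal operator of the L1 penalty
        W = np.sign(W) * np.maximum(np.abs(W) - step * lam, 0.0)
        np.fill_diagonal(W, 0.0)                    # enforce diag(W) = 0
    return W


# Hypothetical toy data: 4 drugs (rows) x 3 side effects (columns).
# Columns 0 and 2 always co-occur; column 1 never occurs with them.
X = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
])

W = sparse_self_representation(X, lam=0.1)
scores = X @ W   # reconstructed association scores for every drug/side-effect pair
```

In this toy case, side effect 0 is represented almost entirely by side effect 2 and vice versa, while side effect 1 receives no weight from the others. Ranking the entries of `scores` that are zero in X is one simple way such a model could propose new associations.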

Original language	English
Qualification	Ph.D.
Awarding Institution	Royal Holloway, University of London
Supervisors/Advisors	Paccanaro, Alberto, Supervisor
Thesis sponsors	Becas Don Carlos Antonio Lopez (BECAL) - Paraguayan Government
Award date	1 Mar 2020
Publication status	Unpublished - 2020

Keywords

drug side effects
prediction
disease genes
recommender systems

Access to Document

Diego Galeano PhD Thesis (Other version, 9.35 MB). Licence: CC BY-NC

Cite this

@phdthesis{82d0b913cb3442cebba5dfc8d4801afe,

title = "Learning sparse representations for predicting drug side effects, disease genes and customer preferences",

abstract = "Computational prediction methods that operate on pairs of objects are fundamental tools for understanding and modelling complex systems in biology, chemistry, and customer preference in recommender systems. I present four sparse matrix completion models that learn a sparse representation of objects from data consisting of associations between pairs of objects. The main goal of my models is to generalise, that is, to predict new relationships between pairs of objects. This thesis addresses the following problems: (1) drug-side effect frequency prediction; (2) drug-side effect prediction; (3) disease-gene prediction; and (4) user preference prediction in top-N recommender systems. I show that my sparse matrix completion models predict missing relationships in the data better than other state-of-the-art methods. My models are designed to favour interpretability. On the task of predicting the frequencies of drug side effects, I present a new algorithm for non-negative matrix factorisation that learns parts of the human anatomical system. On the task of predicting the presence or absence of drug side effects, I present a new algorithm that learns a sparse self-representation of objects, such that a given object, e.g. a side effect, is represented by a linear combination of a few other objects. In addition, my models naturally integrate structure knowledge in the form of graph networks, adding strong relational inductive biases without requiring well-defined heuristics or hand-crafted features.",

keywords = "drug side effects, prediction, disease genes, recommender systems",

author = "{Galeano Galeano}, Diego",

year = "2020",

language = "English",

school = "Royal Holloway, University of London",

}

TY - BOOK

T1 - Learning sparse representations for predicting drug side effects, disease genes and customer preferences

AU - Galeano Galeano, Diego

PY - 2020

Y1 - 2020

AB - Computational prediction methods that operate on pairs of objects are fundamental tools for understanding and modelling complex systems in biology, chemistry, and customer preference in recommender systems. I present four sparse matrix completion models that learn a sparse representation of objects from data consisting of associations between pairs of objects. The main goal of my models is to generalise, that is, to predict new relationships between pairs of objects. This thesis addresses the following problems: (1) drug-side effect frequency prediction; (2) drug-side effect prediction; (3) disease-gene prediction; and (4) user preference prediction in top-N recommender systems. I show that my sparse matrix completion models predict missing relationships in the data better than other state-of-the-art methods. My models are designed to favour interpretability. On the task of predicting the frequencies of drug side effects, I present a new algorithm for non-negative matrix factorisation that learns parts of the human anatomical system. On the task of predicting the presence or absence of drug side effects, I present a new algorithm that learns a sparse self-representation of objects, such that a given object, e.g. a side effect, is represented by a linear combination of a few other objects. In addition, my models naturally integrate structure knowledge in the form of graph networks, adding strong relational inductive biases without requiring well-defined heuristics or hand-crafted features.

KW - drug side effects

KW - prediction

KW - disease genes

KW - recommender systems

M3 - Doctoral Thesis

ER -