Learning Distributed Representations of Drugs and Side Effects for Predicting Adverse Effects of Drug Combinations

Research output: Thesis (Doctoral Thesis)


Drug combinations are commonly used to treat complex diseases or co-existing conditions. However, they pose a risk of side effects due to adverse Drug-Drug Interactions (DDIs). During drug development, the detection of DDIs remains an intractable problem mostly due to the relatively small sample size in clinical trials and the sheer number of possible drug combinations. Recently, due to the development of Adverse Event Reporting Systems (AERs), datasets have become available that collect data on adverse side effects caused by drugs already on the market. These resources encouraged the development of machine learning methods to predict side effects caused by pairs of drugs.

In this thesis, I present DCSE-twin (pron. Dixie-twin), the main approach I developed for predicting adverse effects of drug pairs. I also show the early models that led to the development of DCSE-twin.
The main idea behind my approach is to learn distributed representations separately for drugs and side effects.
These are learned in a neural network that also non-linearly combines the representations of single drugs to obtain the representations of drug pairs.
A key finding from my early models is that non-linearity is crucial to model the combination of the individual drugs.
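As a rough illustration of the idea described above, the sketch below keeps separate embedding tables for drugs and side effects, passes each drug through a shared non-linear layer, and combines the two outputs into a pair representation that is scored against a side effect embedding. This is not the DCSE-twin architecture itself, which the abstract does not specify: the embedding size, the `tanh` layer, the element-wise product, the drug and side effect names, and all function names are assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random

random.seed(0)

DIM = 4  # hypothetical embedding size, chosen only for this sketch


def random_vector(n):
    """Return a small random vector (stands in for a learned parameter)."""
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


# Separate distributed representations for drugs and for side effects.
# The names are placeholders, not entries from any real dataset.
drug_emb = {d: random_vector(DIM) for d in ["drug_a", "drug_b", "drug_c"]}
side_effect_emb = {s: random_vector(DIM) for s in ["nausea", "headache"]}

# One layer shared by both drugs of a pair (assumed "twin" arrangement).
W = [random_vector(DIM) for _ in range(DIM)]
b = random_vector(DIM)


def shared_layer(x):
    """Apply the shared non-linear layer tanh(Wx + b) to one drug vector."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


def pair_representation(d1, d2):
    """Combine two single-drug representations non-linearly.

    The element-wise product makes the result independent of drug order.
    """
    h1 = shared_layer(drug_emb[d1])
    h2 = shared_layer(drug_emb[d2])
    return [u * v for u, v in zip(h1, h2)]


def predict(d1, d2, side_effect):
    """Score a (drug pair, side effect) triple as a value in (0, 1)."""
    p = pair_representation(d1, d2)
    logit = sum(pi * si for pi, si in zip(p, side_effect_emb[side_effect]))
    return 1.0 / (1.0 + math.exp(-logit))


score = predict("drug_a", "drug_b", "nausea")
```

Because the side effect embeddings are separate from the drug embeddings, the same tables could in principle also be trained on single-drug side effect data, which is the kind of shared learning the abstract describes.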

The existence of separate explicit representations for drugs, side effects and drug combinations allows my model to synergistically combine learning from side effect data of both single drugs and drug combinations.
In this way, my model can answer new pharmacologically relevant questions: it can predict side effects for drug pairs for which no side effects are known, and even for drug pairs in which neither drug is known to interact with any other drug (these are known as “cold start” problems).
DCSE-twin is tested in a variety of experimental procedures, including prospective evaluations, and its performance is compared against state-of-the-art methods. I also introduce several novel experimental settings that attempt to approximate real-world scenarios in drug development, pharmacovigilance and drug repurposing. The results presented here show that DCSE-twin outperforms existing state-of-the-art methods in every experimental setting. Moreover, these experiments shed some light on biological aspects of the problem, including the need to combine the representations of single drugs non-linearly to obtain the representation of a drug pair. DCSE-twin represents an opportunity to use patient population data to flag and prioritise adverse effects caused by drug combinations for further validation.
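The strictest “cold start” setting mentioned above, where neither drug of a test pair appears in any training pair, can be made concrete with a small data split. The sketch below is one plausible way to build such a split and is not taken from the thesis: the function name `cold_start_split`, the triple layout `(drug_1, drug_2, side_effect)`, and the toy data are all assumptions.

```python
def cold_start_split(triples, held_out_drugs):
    """Split (drug_1, drug_2, side_effect) triples for cold-start evaluation.

    Training keeps only pairs with no held-out drug; testing keeps only
    pairs where both drugs are held out. Mixed pairs are dropped so that
    held-out drugs never appear in any training pair.
    """
    held = set(held_out_drugs)
    train, test = [], []
    for d1, d2, side_effect in triples:
        n_held = (d1 in held) + (d2 in held)
        if n_held == 0:
            train.append((d1, d2, side_effect))
        elif n_held == 2:
            test.append((d1, d2, side_effect))
        # n_held == 1: discarded to avoid leaking held-out drugs into training
    return train, test


# Toy data with placeholder drug names (hypothetical, for illustration only).
triples = [
    ("A", "B", "nausea"),
    ("A", "C", "headache"),
    ("C", "D", "rash"),
    ("B", "D", "nausea"),
]
train, test = cold_start_split(triples, {"C", "D"})
```

In this toy example only the pair (A, B) remains for training and only (C, D) is tested, so a model evaluated this way has seen no interaction data for either test drug and would have to rely on other information, such as single-drug side effects.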
Original language: English
Awarding Institution
  • Royal Holloway, University of London
Supervisors/Advisors
  • Paccanaro, Alberto, Supervisor
Award date: 1 Dec 2022
Publication status: Unpublished - 7 Jul 2022


Keywords
  • drug-drug interactions
  • distributed representations
  • side effects
  • Machine Learning
  • Deep Learning
