Abstract
The paper presents an application of Conformal Predictors to a chemoinformatics problem: predicting the biological activities of chemical compounds. It addresses several challenges specific to this domain: a large number of compounds (training examples), high dimensionality of the feature space, sparseness, and strong class imbalance. A variant of conformal predictors called the Inductive Mondrian Conformal Predictor is applied to deal with these challenges. Results are presented for several non-conformity measures extracted from underlying algorithms and for different kernels. A number of performance measures are used to demonstrate the flexibility of Inductive Mondrian Conformal Predictors in dealing with such a complex data set. This approach allowed us to identify the most likely active compounds for a given biological target and to present them in ranked order.
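To make the idea of an Inductive Mondrian Conformal Predictor more concrete, the following is a minimal sketch of the class-conditional p-value computation, not the authors' implementation. It assumes that non-conformity scores have already been produced by some underlying algorithm on a held-out calibration set; the function names `mondrian_p_values` and `prediction_set` and the toy numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def mondrian_p_values(cal_scores, cal_labels, test_scores):
    """Class-conditional (Mondrian) inductive conformal p-values.

    cal_scores  : non-conformity scores of the calibration examples,
                  each computed for that example's true label
    cal_labels  : true labels of the calibration examples
    test_scores : dict mapping each candidate label to the test
                  example's non-conformity score under that label
    """
    cal_scores = np.asarray(cal_scores)
    cal_labels = np.asarray(cal_labels)
    p_values = {}
    for label, score in test_scores.items():
        # Compare only against calibration examples of the same class,
        # which is what keeps validity per class under class imbalance.
        same_class = cal_scores[cal_labels == label]
        # The +1 terms account for the test example itself.
        p_values[label] = (np.sum(same_class >= score) + 1) / (len(same_class) + 1)
    return p_values


def prediction_set(p_values, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}


# Toy data (assumed): label 0 = inactive (majority), label 1 = active (minority).
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.9]
cal_labels = [0, 0, 0, 0, 1, 1]
p = mondrian_p_values(cal_scores, cal_labels, {0: 0.35, 1: 0.6})
labels_at_50 = prediction_set(p, significance=0.5)
labels_at_20 = prediction_set(p, significance=0.2)
```

In a setting like the one described, the p-value for the active class could also serve as a score for ordering compounds, though the exact ranking criterion used in the paper is not stated in the abstract.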
| Original language | English |
|---|---|
| Pages (from-to) | 105–123 |
| Number of pages | 19 |
| Journal | Annals of Mathematics and Artificial Intelligence |
| Volume | 81 |
| Early online date | 16 Jun 2017 |
| Publication status | Published - Oct 2017 |
Keywords
- Conformal Prediction
- Confidence Estimation
- Chemoinformatics
- Non-Conformity Measure