Abstract
The paper presents an application of Conformal Predictors to the chemoinformatics problem of predicting the biological activities of chemical compounds. The paper addresses some specific challenges in this domain: a large number of compounds (training examples), high dimensionality of the feature space, sparseness, and a strong class imbalance. A variant of conformal predictors called the Inductive Mondrian Conformal Predictor is applied to deal with these challenges. Results are presented for several non-conformity measures extracted from underlying algorithms and different kernels. A number of performance measures are used to demonstrate the flexibility of Inductive Mondrian Conformal Predictors in dealing with such a complex set of data. This approach allowed us to identify the most likely active compounds for a given biological target and present them in ranked order.
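As a rough illustration of the general technique named in the abstract, the sketch below computes class-conditional (Mondrian) p-values for an inductive conformal predictor. It is not the paper's implementation: the function name `mondrian_icp_p_values`, the array layout, and the toy scores are assumptions made for this example, and the non-conformity scores are taken as given rather than derived from any particular underlying algorithm or kernel.

```python
import numpy as np


def mondrian_icp_p_values(cal_scores, cal_labels, test_scores, labels):
    """Class-conditional p-values for an inductive conformal predictor.

    cal_scores[i]     : non-conformity score of calibration example i
                        under its true label cal_labels[i].
    test_scores[j, k] : non-conformity score of test example j under the
                        candidate label labels[k].
    Returns an array p with p[j, k] = p-value of test example j for labels[k].
    (Hypothetical helper; names and layout are assumptions for this sketch.)
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_scores = np.asarray(test_scores, dtype=float)
    p = np.empty_like(test_scores)
    for k, y in enumerate(labels):
        # Mondrian step: compare only against calibration examples of class y.
        same_class = cal_scores[cal_labels == y]
        # Count calibration scores at least as non-conforming as the test score.
        ge = (same_class[None, :] >= test_scores[:, k][:, None]).sum(axis=1)
        p[:, k] = (ge + 1) / (len(same_class) + 1)
    return p


# Toy data (made up): label 1 plays the role of "active", label 0 "inactive".
cal_scores = [0.1, 0.2, 0.9, 0.3, 0.8]
cal_labels = [0, 0, 0, 1, 1]
test_scores = [[0.15, 0.85],
               [0.95, 0.10]]
p_values = mondrian_icp_p_values(cal_scores, cal_labels, test_scores, [0, 1])

# Rank test compounds by their p-value for the "active" label, highest first.
ranking = np.argsort(-p_values[:, 1])
```

Because each p-value is computed only against calibration examples of the same class, the validity guarantee holds per class, which is why this variant is commonly used when classes are strongly imbalanced. Sorting by the p-value of the active label is one simple way to obtain a ranked list; the paper may use a different ranking criterion.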
Original language | English |
---|---|
Pages (from-to) | 105–123 |
Number of pages | 19 |
Journal | Annals of Mathematics and Artificial Intelligence |
Volume | 81 |
Early online date | 16 Jun 2017 |
Publication status | Published - Oct 2017 |
Keywords
- Conformal Prediction
- Confidence Estimation
- Chemoinformatics
- Non-Conformity Measure