## Abstract

Venn predictors are a distribution-free probabilistic prediction framework that transforms the output of a scoring classifier into a (multi-)probabilistic prediction with calibration guarantees, requiring only that the calibration and test data be i.i.d.

In this paper, we extend the framework from classification (where probabilities are predicted for a discrete set of labels) to regression (where labels form a continuum). We show how Venn predictors can be applied on top of any regression method to obtain calibrated predictive distributions, without requiring any assumption beyond the calibration and test sets being i.i.d. This contrasts with methods such as Bayesian linear regression, whose calibration guarantee relies instead on specific probabilistic assumptions about the distribution of the data.

Adapting the Venn machine to regression required a theoretical analysis of the transductive and inductive forms of the predictor. We identify potential consistency problems and provide solutions for them.

Finally, to illustrate their advantages, we apply regression Venn predictors to the medical problem of predicting survival time after Percutaneous Coronary Intervention, a potentially risky procedure that improves blood flow to a patient’s heart. The predictive distributions obtained with this method allow a variety of interpretations, including the probability that survival time exceeds a chosen threshold or the shortest survival time guaranteed with a given probability.
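The two readings mentioned above can be sketched on a toy example. This is only an illustration, not the paper's method: it assumes a single predictive distribution represented as an empirical distribution over a list of sample survival times, whereas a Venn predictor may output several distributions. The function names `prob_exceeds` and `guaranteed_survival` and the sample values are hypothetical.

```python
from bisect import bisect_right
import math


def prob_exceeds(samples, threshold):
    """P(Y > threshold) under the empirical distribution of `samples`."""
    ordered = sorted(samples)
    return 1.0 - bisect_right(ordered, threshold) / len(ordered)


def guaranteed_survival(samples, prob):
    """A sample value t such that P(Y >= t) >= prob under the empirical distribution.

    Assumption: the predictive distribution is the empirical one, so
    P(Y >= ordered[k]) >= (n - k) / n, and we pick the largest admissible k.
    """
    ordered = sorted(samples)
    n = len(ordered)
    # Small tolerance guards against floating-point error in n * (1 - prob).
    k = math.floor(n * (1.0 - prob) + 1e-9)
    k = min(max(k, 0), n - 1)
    return ordered[k]


# Hypothetical predictive sample of survival times (arbitrary units).
predictive_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

p_over_5 = prob_exceeds(predictive_sample, 5)
t_90 = guaranteed_survival(predictive_sample, 0.9)
```

Here `p_over_5` is the estimated probability of surviving beyond 5 time units, and `t_90` is a survival time reached with probability at least 0.9 under the same toy distribution.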

Original language | English |
---|---|
Title of host publication | 7th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2018) |
Pages | 15-36 |
Number of pages | 22 |
Volume | 91 |
Publication status | Published - Jun 2018 |
Event | The 7th Symposium on Conformal and Probabilistic Prediction with Applications: COPA 2018, Maastricht, Netherlands. Duration: 11 Jun 2018 → 13 Jun 2018. http://www.clrc.rhul.ac.uk/copa2018/index.html |

### Publication series

Name | Proceedings of Machine Learning Research |
---|---|
ISSN (Electronic) | 1938-7228 |

### Conference

Conference | The 7th Symposium on Conformal and Probabilistic Prediction with Applications |
---|---|
Country/Territory | Netherlands |
City | Maastricht |
Period | 11/06/18 → 13/06/18 |
Internet address | http://www.clrc.rhul.ac.uk/copa2018/index.html |

## Keywords

- reliable prediction
- Venn machine
- regression