Abstract
As the volume of data increases rapidly, most traditional machine learning algorithms become computationally prohibitive. Furthermore, the available data can be so large that it easily exceeds a single machine's memory. We propose Coreset-Based Conformal Prediction, a strategy for dealing with big data by applying conformal predictors to a weighted summary of the data, namely the coreset. We compare our approach against stand-alone inductive conformal predictors on three large competition-grade datasets to demonstrate that our coreset-based strategy may not only significantly improve the learning speed, but also retain the validity of the predictions and the efficiency of the predictors.
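As a rough illustration of the idea described in the abstract, the following minimal sketch fits a weighted logistic regression on an importance-sampled subset and wraps it in an inductive conformal predictor. It is not the authors' implementation: the synthetic data, the norm-based sampling probabilities (a stand-in for a proper coreset sensitivity bound), and all function names are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_coreset(X, y, m, rng):
    # Importance sampling. ASSUMPTION: p_i mixes a uniform term with the
    # point's norm as a crude sensitivity proxy; the paper's construction may
    # differ. Weights 1 / (m * p_i) keep the weighted loss unbiased.
    n = len(X)
    norms = [math.sqrt(sum(v * v for v in x)) for x in X]
    total = sum(norms)  # assumed > 0 for this synthetic data
    probs = [0.5 / n + 0.5 * s / total for s in norms]
    idx = rng.choices(range(n), weights=probs, k=m)
    return ([X[i] for i in idx], [y[i] for i in idx],
            [1.0 / (m * probs[i]) for i in idx])


def fit_weighted_logreg(X, y, w, epochs=300, lr=0.5):
    # Plain gradient descent on the weighted logistic loss.
    d = len(X[0])
    theta = [0.0] * (d + 1)  # last entry is the bias
    wsum = sum(w)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi, wi in zip(X, y, w):
            err = sigmoid(sum(t * v for t, v in zip(theta, xi)) + theta[-1]) - yi
            for j in range(d):
                grad[j] += wi * err * xi[j]
            grad[-1] += wi * err
        theta = [t - lr * g / wsum for t, g in zip(theta, grad)]
    return theta


def prob_of_label(theta, x, label):
    p1 = sigmoid(sum(t * v for t, v in zip(theta, x)) + theta[-1])
    return p1 if label == 1 else 1.0 - p1


def p_value(theta, cal_scores, x, label):
    # Nonconformity score: 1 - estimated probability of the candidate label.
    score = 1.0 - prob_of_label(theta, x, label)
    return (sum(s >= score for s in cal_scores) + 1) / (len(cal_scores) + 1)


def predict_set(theta, cal_scores, x, eps):
    # Keep every label whose p-value exceeds the significance level eps.
    return {lab for lab in (0, 1) if p_value(theta, cal_scores, x, lab) > eps}


rng = random.Random(0)


def make_point():
    # Two Gaussian blobs, one per class (synthetic stand-in data).
    label = rng.randint(0, 1)
    centre = 1.5 if label == 1 else -1.5
    return [rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label


data = [make_point() for _ in range(3000)]
train, cal, test = data[:2000], data[2000:2500], data[2500:]

# Train on a 200-point weighted summary instead of all 2000 training points.
Xc, yc, wc = sample_coreset([x for x, _ in train], [l for _, l in train], 200, rng)
theta = fit_weighted_logreg(Xc, yc, wc)

# Calibrate on held-out data, then measure the empirical error rate.
cal_scores = [1.0 - prob_of_label(theta, x, l) for x, l in cal]
eps = 0.1
errors = sum(l not in predict_set(theta, cal_scores, x, eps) for x, l in test)
error_rate = errors / len(test)
print(f"empirical error rate at eps={eps}: {error_rate:.3f}")
```

Because the calibration set is held out and exchangeable with the test data, the error rate of the prediction sets should stay close to or below `eps` regardless of how the underlying model was trained; the coreset only affects training cost and how informative (small) the prediction sets are.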
Original language | English |
---|---|
Pages | 142--162 |
Publication status | Published - 2019 |
Keywords
- logistic regression
- conformal predictors
- importance sampling