Fast probabilistic prediction for kernel SVM via enclosing balls. / Riquelme-Granada, Nery; Nguyen, Dr. Khuong An; Luo, Zhiyuan.

Proceedings of Machine Learning Research: Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications. Vol. 128 2020. p. 189-208.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Published

Abstract

Support Vector Machine (SVM) is a powerful paradigm that has proven to be extremely useful for the task of classifying high-dimensional objects. It not only performs well in learning linear classifiers, but also shows outstanding performance in capturing non-linearity through the use of kernels. In principle, SVM allows us to train “scoring” classifiers, i.e. classifiers that output a prediction score. However, it can also be adapted to produce probability-type outputs through the use of the Venn-Abers framework. This allows us to obtain valuable information on the label distribution for each test object. This procedure, however, is restricted to very small datasets given its inherent computational complexity. We circumvent this limitation by borrowing results from the field of computational geometry. Specifically, we make use of the concept of a coreset: a small summary of the data that is constructed by discretising the input space into enclosing balls, so that each ball is represented by only one object. Our results indicate that training Venn-Abers predictors using enclosing balls provides an average acceleration of 8 times compared to the regular Venn-Abers approach while largely retaining probability calibration. These promising results imply that we can still enjoy well-calibrated probabilistic outputs for kernel SVM even in the realm of large-scale datasets.
Original language: English
Title of host publication: Proceedings of Machine Learning Research
Subtitle of host publication: Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications
Pages: 189-208
Number of pages: 20
Volume: 128
Publication status: Published - 2020
This open access research output is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

ID: 39032188