Adaptive Hedge. / van Erven, Tim; Grünwald, Peter; M. Koolen, Wouter; de Rooij, Steven.

2011.

Research output: Working paper

Published

Standard

Adaptive Hedge. / van Erven, Tim; Grünwald, Peter; M. Koolen, Wouter; de Rooij, Steven.

2011.

Research output: Working paper

Harvard

van Erven, T, Grünwald, P, M. Koolen, W & de Rooij, S 2011 'Adaptive Hedge'.

APA

van Erven, T., Grünwald, P., M. Koolen, W., & de Rooij, S. (2011). Adaptive Hedge.

Vancouver

van Erven T, Grünwald P, M. Koolen W, de Rooij S. Adaptive Hedge. 2011 Oct 28.

Author

van Erven, Tim ; Grünwald, Peter ; M. Koolen, Wouter ; de Rooij, Steven. / Adaptive Hedge. 2011.

BibTeX

@techreport{0fedef2275064f72820b48a76b93b600,
title = "Adaptive Hedge",
abstract = "Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.",
keywords = "stat.ML",
author = "{van Erven}, Tim and Peter Gr{\"u}nwald and {M. Koolen}, Wouter and {de Rooij}, Steven",
note = "This is the full version of the paper with the same name that will appear in Advances in Neural Information Processing Systems 24 (NIPS 2011), 2012. The two papers are identical, except that this version contains an extra section of Additional Material",
year = "2011",
month = oct,
day = "28",
language = "Undefined/Unknown",
type = "WorkingPaper",

}

RIS

TY - UNPB

T1 - Adaptive Hedge

AU - van Erven, Tim

AU - Grünwald, Peter

AU - M. Koolen, Wouter

AU - de Rooij, Steven

N1 - This is the full version of the paper with the same name that will appear in Advances in Neural Information Processing Systems 24 (NIPS 2011), 2012. The two papers are identical, except that this version contains an extra section of Additional Material

PY - 2011/10/28

Y1 - 2011/10/28

N2 - Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.

AB - Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.

KW - stat.ML

M3 - Working paper

BT - Adaptive Hedge

ER -