This paper proposes general methods for the problem of multiple testing of a single hypothesis, where the standard goal is to combine a number of p-values without making any assumptions about their dependence structure. An old result by Rüschendorf and, independently, Meng implies that the p-values can be combined by scaling up their arithmetic mean by a factor of 2, and that no smaller factor is sufficient in general. A similar result by Mattner about the geometric mean replaces 2 by $e$. Based on more recent developments in mathematical finance, specifically robust risk aggregation techniques, we extend these results to generalized means; in particular, we show that $K$ p-values can be combined by scaling up their harmonic mean by a factor of $\log K$ asymptotically as $K\to\infty$. This leads to a generalized version of the Bonferroni-Holm procedure. We also explore methods using weighted averages of p-values. Finally, we discuss the efficiency of various methods of combining p-values and how to choose a suitable method in light of data and prior information.
- Hypothesis testing
- Multiple hypothesis testing
- Multiple testing of a single hypothesis
- Robust risk aggregation
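
The three merging rules named in the abstract can be illustrated with a minimal Python sketch. The function names and the cap at 1 are assumptions made for illustration and do not come from the paper. The factor $\log K$ for the harmonic mean is stated only asymptotically as $K\to\infty$, so the third function should not be read as a valid rule for small $K$.

```python
import math


def merge_arithmetic(pvalues):
    """Twice the arithmetic mean of the p-values, capped at 1."""
    return min(1.0, 2.0 * sum(pvalues) / len(pvalues))


def merge_geometric(pvalues):
    """e times the geometric mean of the p-values, capped at 1."""
    if any(p <= 0.0 for p in pvalues):
        return 0.0  # a zero p-value forces the geometric mean to zero
    log_mean = sum(math.log(p) for p in pvalues) / len(pvalues)
    return min(1.0, math.e * math.exp(log_mean))


def merge_harmonic_asymptotic(pvalues):
    """log K times the harmonic mean of the p-values, capped at 1.

    Assumption: log K is only the asymptotic factor from the abstract,
    so this is illustrative and not guaranteed valid for finite K.
    """
    k = len(pvalues)
    if k == 1:
        return pvalues[0]  # nothing to combine (and log 1 = 0)
    if any(p <= 0.0 for p in pvalues):
        return 0.0  # a zero p-value forces the harmonic mean to zero
    harmonic_mean = k / sum(1.0 / p for p in pvalues)
    return min(1.0, math.log(k) * harmonic_mean)


if __name__ == "__main__":
    # Hypothetical p-values, chosen only to exercise the functions.
    example = [0.01, 0.02, 0.03, 0.04]
    print(merge_arithmetic(example))
    print(merge_geometric(example))
    print(merge_harmonic_asymptotic(example))
```

None of these functions uses any information about how the p-values depend on one another, which is the setting the abstract describes.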