Abstract
The ultrametric properties of hierarchic clustering are well known. In recent years, there has been interest in ultrametric properties found in statistical mechanics, optimization theory, and physics. It has been shown that sparse, high-dimensional spaces tend to be ultrametric. Given the pervasiveness of ultrametricity, it is important to be able to quantify how close given metric data are to being ultrametric. In this article we assess previously used coefficients of ultrametricity. We then present a new coefficient of ultrametricity and illustrate its properties experimentally. Our immediate objective in this work is to show that sparse, high-dimensional spaces, which are typical of many new data analysis problems in areas such as genomics, proteomics, and speech, tend to be inherently ultrametric.
Original language | English |
---|---|
Title of host publication | Compstat 2004: Proceedings in Computational Statistics |
Place of Publication | Berlin |
Publisher | Springer-Verlag |
Pages | 1561-1568 |
ISBN (Print) | 3790815543 |
Publication status | Published - 2004 |
Keywords
- Ultrametricity
- ultrametric
- coefficients
- genomics
- proteomics
- metric data