Clustering in massive data sets

Fionn Murtagh, J Abello (Editor), P M Pardalos (Editor), M G C Resende (Editor)

Department of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

We review the time and storage costs of search and clustering algorithms, illustrating them with case studies in astronomy, information retrieval, visual user interfaces, chemical databases, and other areas. Theoretical results developed as far back as the 1960s often remain topical. More recent work is also covered, including a solution to the statistical question of how many clusters there are in a dataset. We also look at one line of inquiry in the use of clustering for human-computer user interfaces. Finally, the visualization of data leads to the consideration of data arrays as images, and we speculate on future results to be expected here.

Original language	English
Title of host publication	Handbook of Massive Data Sets
Place of Publication	Norwell, MA, USA
Publisher	Kluwer
Pages	401-545
ISBN (Print)	1 4020 0489 3
Publication status	Published - 2002

Access to Document

Full Text

http://portal.acm.org/citation.cfm?id=779247&dl=ACM&coll=GUIDE#references
