EPTAS for k-means Clustering of Affine Subspaces

Eduard Eiben, Fedor V. Fomin, Petr A. Golovach, William Lochet, Fahad Panolan, Kirill Simonov

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We consider a generalization of the fundamental k-means clustering for data with incomplete or corrupted entries. When data objects are represented by points in ℝ^d, a data point is said to be incomplete when some of its entries are missing or unspecified. An incomplete data point with at most Δ unspecified entries corresponds to an axis-parallel affine subspace of dimension at most Δ, called a Δ-point. Thus we seek a partition of n input Δ-points into k clusters minimizing the k-means objective. For Δ = 0, when all coordinates of each point are specified, this is the usual k-means clustering. We give an algorithm that finds an (1 + ∊)-approximate solution in time f(k, ∊, Δ) · n2 · d for some function f of k, ∊, and Δ only.
Original languageEnglish
Title of host publicationProceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA)
PublisherSIAM
Pages2649-2659
Number of pages11
ISBN (Electronic)978-1-61197-646-5
DOIs
Publication statusPublished - 7 Jan 2021

Cite this