Faithful Visualisation of Similarities in High Dimensional Data

Jiaxin Kou

Department of Computer Science

Research output: Thesis › Doctoral Thesis

Abstract

In the last fifteen years, new methods of dimension reduction have been invented that enable much improved visualisation of high-dimensional data-sets. Conventionally, the data-sets are visualised as two-dimensional scatterplots, and similarity relationships between data-cases are revealed by grouping and proximity of points in the plane. But the arrangement of points in a 2D scatterplot cannot faithfully represent complex high-dimensional structure: more expressive 2D visualisations are needed.

This thesis develops new types of diagram that can represent data-similarities more expressively than a mere scatterplot. The approach is to automatically select a graph to overlay on the scatterplot, in order to enable a richer visualisation of similarities than is possible by the arrangement of points alone, and to correct distortions inherent in scatterplot visualisation.

Methods and software are developed for selecting and graphically representing the overlay graph as a diagram that humans can read. These diagrams enable correct and informative human interpretation of scatterplots that would otherwise be hard to interpret or misleading.
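
The abstract does not say how the overlay graph is selected, so the following is only an illustrative sketch of the general idea, not the method developed in the thesis. It assumes a simple k-nearest-neighbour criterion: an edge is drawn between two data-cases that are near neighbours in the original high-dimensional space but that the 2D scatterplot fails to place near each other. The function names `knn_indices` and `select_overlay_edges`, the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np


def knn_indices(points: np.ndarray, k: int) -> np.ndarray:
    """Return an (n, k) array holding the k nearest neighbours of each row."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def select_overlay_edges(high_dim: np.ndarray, embedding_2d: np.ndarray, k: int = 3):
    """Assumed criterion (not from the thesis): keep pairs that are
    k-nearest neighbours in the original space but not in the 2D layout."""
    hi = knn_indices(high_dim, k)
    lo = knn_indices(embedding_2d, k)
    edges = set()
    for i in range(len(high_dim)):
        # Neighbours that the scatterplot fails to show as close to point i.
        missing = set(hi[i].tolist()) - set(lo[i].tolist())
        for j in missing:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return sorted(edges)


# Toy data: two tight pairs in 3D, deliberately split apart in the 2D layout.
X = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [10.0, 0.0, 0.0], [10.0, 0.0, 1.0]])
Y = np.array([[0.0, 0.0], [10.0, 1.0], [10.0, 0.0], [0.0, 1.0]])
print(select_overlay_edges(X, Y, k=1))  # → [(0, 1), (2, 3)]
```

In this toy case the two returned edges are exactly the similar pairs that the layout separates. Drawing them over the scatterplot would restore the similarity information that point positions alone misrepresent. A faithful layout, such as `X[:, [0, 2]]`, yields no edges under this criterion.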

Original language	English
Qualification	Ph.D.
Awarding Institution	Royal Holloway, University of London
Supervisors/Advisors	Watkins, Chris (Supervisor); Luo, Zhiyuan (Supervisor)
Award date	1 Dec 2016
Publication status	Unpublished - 2016

Keywords

High Dimensional Data
Visualisation
Graph Theory
Machine Learning
Manifold Learning
Dimensionality Reduction
Overlay Graph

Access to Document

Jiaxin Kou PhD thesis (Other version, 28 MB)

Cite this

@phdthesis{675d46d9bc6d4c1cab69bc6c56153497,

title = "Faithful Visualisation of Similarities in High Dimensional Data",

abstract = "In the last fifteen years, new methods of dimension reduction have been invented that enable much improved visualisation of high-dimensional data-sets. Conventionally, the data-sets are visualised as two-dimensional scatterplots, and similarity relationships between data-cases are revealed by grouping and proximity of points in the plane. But the arrangement of points in a 2D scatterplot cannot faithfully represent complex high-dimensional structure: more expressive 2D visualisations are needed. This thesis develops new types of diagram that can represent data-similarities more expressively than a mere scatterplot. The approach is to automatically select a graph to overlay on the scatterplot, in order to enable a richer visualisation of similarities than is possible by the arrangement of points alone, and to correct distortions inherent in scatterplot visualisation. Methods and software are developed for selecting and graphically representing the overlay graph as a diagram that humans can read. These diagrams enable correct and informative human interpretation of scatterplots that would otherwise be hard to interpret or misleading.",

keywords = "High Dimensional Data, Visualisation, Graph Theory, Machine Learning, Manifold Learning, Dimensionality Reduction, Overlay Graph",

author = "Jiaxin Kou",

year = "2016",

language = "English",

school = "Royal Holloway, University of London",

}

TY - BOOK

T1 - Faithful Visualisation of Similarities in High Dimensional Data

AU - Kou, Jiaxin

PY - 2016

Y1 - 2016

N2 - In the last fifteen years, new methods of dimension reduction have been invented that enable much improved visualisation of high-dimensional data-sets. Conventionally, the data-sets are visualised as two-dimensional scatterplots, and similarity relationships between data-cases are revealed by grouping and proximity of points in the plane. But the arrangement of points in a 2D scatterplot cannot faithfully represent complex high-dimensional structure: more expressive 2D visualisations are needed. This thesis develops new types of diagram that can represent data-similarities more expressively than a mere scatterplot. The approach is to automatically select a graph to overlay on the scatterplot, in order to enable a richer visualisation of similarities than is possible by the arrangement of points alone, and to correct distortions inherent in scatterplot visualisation. Methods and software are developed for selecting and graphically representing the overlay graph as a diagram that humans can read. These diagrams enable correct and informative human interpretation of scatterplots that would otherwise be hard to interpret or misleading.

KW - High Dimensional Data

KW - Visualisation

KW - Graph Theory

KW - Machine Learning

KW - Manifold Learning

KW - Dimensionality Reduction

KW - Overlay Graph

M3 - Doctoral Thesis

ER -