The Corrected Gene Proximity map for analyzing the 3D genome organization using Hi-C data. / Ye, Cheng; Paccanaro, Alberto; Gerstein, Mark; Yan, Koon-Kiu.

In: BMC Bioinformatics, Vol. 21, 222, 29.05.2020, p. 1-18.

Research output: Contribution to journal › Article › peer-review

Published

Author

Ye, Cheng ; Paccanaro, Alberto ; Gerstein, Mark ; Yan, Koon-Kiu. / The Corrected Gene Proximity map for analyzing the 3D genome organization using Hi-C data. In: BMC Bioinformatics. 2020 ; Vol. 21. pp. 1-18.

BibTeX

@article{e29e9ddf3c834480b6ef61ddd3de502e,
title = "The Corrected Gene Proximity map for analyzing the 3D genome organization using Hi-C data",
abstract = "Background: Genome-wide ligation-based assays such as Hi-C provide us with an unprecedented opportunity to investigate the spatial organization of the genome. Results of a typical Hi-C experiment are often summarized in a chromosomal contact map, a matrix whose elements reflect the co-location frequencies of genomic loci. To elucidate the complex structural and functional interactions between those genomic loci, networks offer a natural and powerful framework. Results: We propose a novel graph-theoretical framework, the Corrected Gene Proximity (CGP) map, to study the effect of the 3D spatial organization of genes in transcriptional regulation. The starting point of the CGP map is a weighted network, the gene proximity map, whose weights are based on the contact frequencies between genes extracted from genome-wide Hi-C data. We derive a null model for the network based on the signal contributed by the 1D genomic distance and use it to “correct” the gene proximity for cell type 3D specific arrangements. The CGP map, therefore, provides a network framework for the 3D structure of the genome on a global scale. On human cell lines, we show that the CGP map can detect and quantify gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies. Analysing the expression pattern of metabolic pathways of two hematopoietic cell lines, we find that the relative positioning of the genes, as captured and quantified by the CGP, is highly correlated with their expression change. We further show that the CGP map can be used to form an inter-chromosomal proximity map that allows large-scale abnormalities, such as chromosomal translocations, to be identified. Conclusions: The Corrected Gene Proximity map is a map of the 3D structure of the genome on a global scale. It allows the simultaneous analysis of intra- and inter-chromosomal interactions and of gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies, thus revealing hidden associations between global spatial positioning and gene expression. The flexible graph-based formalism of the CGP map can be easily generalized to study any existing Hi-C datasets.",
keywords = "3D genome, Hi-C data analysis, network modularity, network theory",
author = "Cheng Ye and Alberto Paccanaro and Mark Gerstein and Koon-Kiu Yan",
year = "2020",
month = may,
day = "29",
doi = "10.1186/s12859-020-03545-y",
language = "English",
volume = "21",
pages = "1--18",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
}

RIS

TY - JOUR

T1 - The Corrected Gene Proximity map for analyzing the 3D genome organization using Hi-C data

AU - Ye, Cheng

AU - Paccanaro, Alberto

AU - Gerstein, Mark

AU - Yan, Koon-Kiu

PY - 2020/5/29

Y1 - 2020/5/29

N2 - Background: Genome-wide ligation-based assays such as Hi-C provide us with an unprecedented opportunity to investigate the spatial organization of the genome. Results of a typical Hi-C experiment are often summarized in a chromosomal contact map, a matrix whose elements reflect the co-location frequencies of genomic loci. To elucidate the complex structural and functional interactions between those genomic loci, networks offer a natural and powerful framework. Results: We propose a novel graph-theoretical framework, the Corrected Gene Proximity (CGP) map, to study the effect of the 3D spatial organization of genes in transcriptional regulation. The starting point of the CGP map is a weighted network, the gene proximity map, whose weights are based on the contact frequencies between genes extracted from genome-wide Hi-C data. We derive a null model for the network based on the signal contributed by the 1D genomic distance and use it to “correct” the gene proximity for cell type 3D specific arrangements. The CGP map, therefore, provides a network framework for the 3D structure of the genome on a global scale. On human cell lines, we show that the CGP map can detect and quantify gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies. Analysing the expression pattern of metabolic pathways of two hematopoietic cell lines, we find that the relative positioning of the genes, as captured and quantified by the CGP, is highly correlated with their expression change. We further show that the CGP map can be used to form an inter-chromosomal proximity map that allows large-scale abnormalities, such as chromosomal translocations, to be identified. Conclusions: The Corrected Gene Proximity map is a map of the 3D structure of the genome on a global scale. It allows the simultaneous analysis of intra- and inter-chromosomal interactions and of gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies, thus revealing hidden associations between global spatial positioning and gene expression. The flexible graph-based formalism of the CGP map can be easily generalized to study any existing Hi-C datasets.

AB - Background: Genome-wide ligation-based assays such as Hi-C provide us with an unprecedented opportunity to investigate the spatial organization of the genome. Results of a typical Hi-C experiment are often summarized in a chromosomal contact map, a matrix whose elements reflect the co-location frequencies of genomic loci. To elucidate the complex structural and functional interactions between those genomic loci, networks offer a natural and powerful framework. Results: We propose a novel graph-theoretical framework, the Corrected Gene Proximity (CGP) map, to study the effect of the 3D spatial organization of genes in transcriptional regulation. The starting point of the CGP map is a weighted network, the gene proximity map, whose weights are based on the contact frequencies between genes extracted from genome-wide Hi-C data. We derive a null model for the network based on the signal contributed by the 1D genomic distance and use it to “correct” the gene proximity for cell type 3D specific arrangements. The CGP map, therefore, provides a network framework for the 3D structure of the genome on a global scale. On human cell lines, we show that the CGP map can detect and quantify gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies. Analysing the expression pattern of metabolic pathways of two hematopoietic cell lines, we find that the relative positioning of the genes, as captured and quantified by the CGP, is highly correlated with their expression change. We further show that the CGP map can be used to form an inter-chromosomal proximity map that allows large-scale abnormalities, such as chromosomal translocations, to be identified. Conclusions: The Corrected Gene Proximity map is a map of the 3D structure of the genome on a global scale. It allows the simultaneous analysis of intra- and inter-chromosomal interactions and of gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies, thus revealing hidden associations between global spatial positioning and gene expression. The flexible graph-based formalism of the CGP map can be easily generalized to study any existing Hi-C datasets.

KW - 3D genome

KW - Hi-C data analysis

KW - network modularity

KW - network theory

U2 - 10.1186/s12859-020-03545-y

DO - 10.1186/s12859-020-03545-y

M3 - Article

VL - 21

SP - 1

EP - 18

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 222

ER -