The use of EST expression matrices for the quality control of gene expression data and the development of improved algorithms for gene expression profiling in cancer

Andrew Milnthorpe

Research output: Thesis (Doctoral Thesis)



Several bioinformatics tools, including dbEST, DDD and GEPIS, have been widely used to retrieve and analyse EST data for gene expression levels. The Cancer Genome Anatomy Project (CGAP, run by NCBI) cDNA xProfiler and cDNA DGED tools can be used to examine EST data and compare gene expression levels between cancer and normal tissue. However, neither CGAP nor other similar tools provide an easy way to compare expression in normal and cancerous tissue with, for example, expression levels in related or proximal tissues at the same time, while also presenting those data separately for study. Furthermore, the expression data are often assumed to be correct, and no quality control tools are made available at CGAP, dbEST or GEPIS. In this study the CGAP tools were recreated with the aim of enabling a wider range of tissues to be searched and compared in a single search. The CGAP tools were found to contain many errors in their library and gene parsing algorithms, for which solutions were implemented in the recreated algorithms. A method was also devised to verify the tissue origin of EST libraries and to annotate uncharacterised libraries with a likely tissue of origin using EST data alone. An initial list of tissue-specific genes was optimised to create gene expression matrices which could be used to determine the tissue origin of a library. In addition to tissue typing, the matrices showed potential for cancer staging and for indicating the degree of normalisation of a library, making tissue-specific expression a suitable quality control method for expression data. Together, the improved expression profiling algorithm and the expression matrices provide new tools to assess the quality of EST data and their suitability for expression profiling.
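To make the idea of tissue typing from EST data more concrete, the following is a minimal sketch in Python, not the method used in the thesis. It assumes a library is summarised as EST counts per gene and that each tissue has a small set of marker genes. The marker sets, the function names `tissue_scores` and `likely_tissue`, and the scoring rule (share of marker-gene ESTs per tissue) are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical marker sets (illustrative only, not taken from the thesis):
# tissue -> genes assumed to be expressed specifically in that tissue.
MARKERS = {
    "liver": {"ALB", "APOA1", "FGB"},
    "brain": {"GFAP", "SNAP25", "MBP"},
    "muscle": {"ACTA1", "MYH7", "CKM"},
}


def tissue_scores(est_counts, markers=MARKERS):
    """Return, per tissue, the share of all marker-gene ESTs in the
    library that fall on that tissue's marker genes."""
    hits = {
        tissue: sum(est_counts.get(gene, 0) for gene in genes)
        for tissue, genes in markers.items()
    }
    total = sum(hits.values())
    if total == 0:
        # No marker gene was observed, so nothing can be inferred.
        return {tissue: 0.0 for tissue in markers}
    return {tissue: n / total for tissue, n in hits.items()}


def likely_tissue(est_counts, markers=MARKERS):
    """Return the best-scoring tissue, or None if no marker ESTs were seen."""
    scores = tissue_scores(est_counts, markers)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy library: EST counts per gene (made-up numbers).
library = Counter({"ALB": 40, "APOA1": 12, "GFAP": 1, "ACTB": 90})
print(likely_tissue(library))
```

In this toy scheme, a library whose score is spread across several tissues rather than concentrated in one might hint at a mislabelled or mixed library, which is the kind of quality-control signal the abstract describes; real thresholds would need to be calibrated on annotated libraries.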
Original language: English
Awarding Institution
  • Royal Holloway, University of London
  • Soloviev, Mikhail, Supervisor
  • Rider, Chris, Advisor
Award date: 1 Apr 2013
Publication status: Unpublished - 2013


Keywords
  • Computational Biology
  • Genomics
  • Comparative genomics
  • Genome analysis tools
  • Transcriptomes
  • Genome expression analysis
  • Molecular genetics
  • Gene expression
  • cancer
  • Biochemical Research Methods
  • Biotechnology & Applied Microbiology
  • Mathematical & Computational Biology
  • Normalised cDNA library
  • Normalized cDNA library
  • Tissue expression
  • Subtraction
  • Discovery
  • Profiles
  • Hybridisation
  • Hybridization
  • Construction
