Usability of reference-free transcriptome assemblies for detection of differential expression: a case study on Aethionema arabicum dimorphic seeds. / Wilhelmsson, Per; Chandler, Jake; Fernandez-Pozo, Noe; Graeber, Kai; Ullrich, Kristian; Arshad, Waheed; Khan, Safina; Hofberger, Johannes; Buchta, Karol; Edger, Patrick; Pires, Chris; Schranz, Michael Eric; Leubner-Metzger, Gerhard; Rensing, Stefan.

In: BMC Genomics, Vol. 20, 95, 30.01.2019, p. 1-19.

Research output: Contribution to journal › Article

Published

Author

Wilhelmsson, Per ; Chandler, Jake ; Fernandez-Pozo, Noe ; Graeber, Kai ; Ullrich, Kristian ; Arshad, Waheed ; Khan, Safina ; Hofberger, Johannes ; Buchta, Karol ; Edger, Patrick ; Pires, Chris ; Schranz, Michael Eric ; Leubner-Metzger, Gerhard ; Rensing, Stefan. / Usability of reference-free transcriptome assemblies for detection of differential expression: a case study on Aethionema arabicum dimorphic seeds. In: BMC Genomics. 2019 ; Vol. 20. pp. 1-19.

BibTeX

@article{55262dd55e6e46e283db0ff4e74786d8,
title = "Usability of reference-free transcriptome assemblies for detection of differential expression: a case study on Aethionema arabicum dimorphic seeds",
abstract = "Background: RNA-sequencing analysis is increasingly utilized to study gene expression in non-model organisms without sequenced genomes. Aethionema arabicum (Brassicaceae) exhibits seed dimorphism as a bet-hedging strategy – producing both a less dormant mucilaginous (M+) seed morph and a more dormant non-mucilaginous (NM) seed morph. Here, we compared de novo and reference-genome based transcriptome assemblies to investigate Ae. arabicum seed dimorphism and to evaluate the reference-free versus -dependent approach for identifying differentially expressed genes (DEGs). Results: A de novo transcriptome assembly was generated using sequences from M+ and NM Ae. arabicum dry seed morphs. The transcripts of the de novo assembly contained 63.1{\%} complete Benchmarking Universal Single-Copy Orthologs (BUSCO) compared to 90.9{\%} for the transcripts of the reference genome. DEG detection used the strict consensus of three methods (DESeq2, edgeR and NOISeq). Only 37{\%} of 1,533 differentially expressed de novo assembled transcripts paired with 1,876 genome-derived DEGs. Gene Ontology (GO) terms distinguished the seed morphs: the terms translation and nucleosome assembly were overrepresented in DEGs higher in abundance in M+ dry seeds, whereas terms related to mRNA processing and transcription were overrepresented in DEGs higher in abundance in NM dry seeds. DEGs amongst these GO terms included ribosomal proteins and histones (higher in M+), RNA polymerase II subunits and related transcription and elongation factors (higher in NM). Expression of the inferred DEGs and other genes associated with seed maturation (e.g. those encoding late embryogenesis abundant proteins and transcription factors regulating seed development and maturation such as ABI3, FUS3, LEC1 and WRI1 homologs) were put in context with Arabidopsis thaliana seed maturation and indicated that M+ seeds may desiccate and mature faster than NM.
The 1,901 transcriptomic DEG set GO-terms had almost 90{\%} overlap with the 2,191 genome-derived DEG GO-terms. Conclusions: Whilst there was only modest overlap of DEGs identified in reference-free versus -dependent approaches, the resulting GO analysis was concordant in both approaches. The identified differences in dry seed transcriptomes suggest mechanisms underpinning previously identified contrasts between morphology and germination behaviour of M+ and NM seeds.",
author = "Per Wilhelmsson and Jake Chandler and Noe Fernandez-Pozo and Kai Graeber and Kristian Ullrich and Waheed Arshad and Safina Khan and Johannes Hofberger and Karol Buchta and Patrick Edger and Chris Pires and Schranz, {Michael Eric} and Gerhard Leubner-Metzger and Stefan Rensing",
note = "This is CC-BY",
year = "2019",
month = "1",
day = "30",
doi = "10.1186/s12864-019-5452-4",
language = "English",
volume = "20",
pages = "1--19",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

RIS

TY - JOUR

T1 - Usability of reference-free transcriptome assemblies for detection of differential expression

T2 - a case study on Aethionema arabicum dimorphic seeds

AU - Wilhelmsson, Per

AU - Chandler, Jake

AU - Fernandez-Pozo, Noe

AU - Graeber, Kai

AU - Ullrich, Kristian

AU - Arshad, Waheed

AU - Khan, Safina

AU - Hofberger, Johannes

AU - Buchta, Karol

AU - Edger, Patrick

AU - Pires, Chris

AU - Schranz, Michael Eric

AU - Leubner-Metzger, Gerhard

AU - Rensing, Stefan

N1 - This is CC-BY

PY - 2019/1/30

Y1 - 2019/1/30

N2 - Background: RNA-sequencing analysis is increasingly utilized to study gene expression in non-model organisms without sequenced genomes. Aethionema arabicum (Brassicaceae) exhibits seed dimorphism as a bet-hedging strategy – producing both a less dormant mucilaginous (M+) seed morph and a more dormant non-mucilaginous (NM) seed morph. Here, we compared de novo and reference-genome based transcriptome assemblies to investigate Ae. arabicum seed dimorphism and to evaluate the reference-free versus -dependent approach for identifying differentially expressed genes (DEGs). Results: A de novo transcriptome assembly was generated using sequences from M+ and NM Ae. arabicum dry seed morphs. The transcripts of the de novo assembly contained 63.1% complete Benchmarking Universal Single-Copy Orthologs (BUSCO) compared to 90.9% for the transcripts of the reference genome. DEG detection used the strict consensus of three methods (DESeq2, edgeR and NOISeq). Only 37% of 1,533 differentially expressed de novo assembled transcripts paired with 1,876 genome-derived DEGs. Gene Ontology (GO) terms distinguished the seed morphs: the terms translation and nucleosome assembly were overrepresented in DEGs higher in abundance in M+ dry seeds, whereas terms related to mRNA processing and transcription were overrepresented in DEGs higher in abundance in NM dry seeds. DEGs amongst these GO terms included ribosomal proteins and histones (higher in M+), RNA polymerase II subunits and related transcription and elongation factors (higher in NM). Expression of the inferred DEGs and other genes associated with seed maturation (e.g. those encoding late embryogenesis abundant proteins and transcription factors regulating seed development and maturation such as ABI3, FUS3, LEC1 and WRI1 homologs) were put in context with Arabidopsis thaliana seed maturation and indicated that M+ seeds may desiccate and mature faster than NM. The 1,901 transcriptomic DEG set GO-terms had almost 90% overlap with the 2,191 genome-derived DEG GO-terms. Conclusions: Whilst there was only modest overlap of DEGs identified in reference-free versus -dependent approaches, the resulting GO analysis was concordant in both approaches. The identified differences in dry seed transcriptomes suggest mechanisms underpinning previously identified contrasts between morphology and germination behaviour of M+ and NM seeds.

U2 - 10.1186/s12864-019-5452-4

DO - 10.1186/s12864-019-5452-4

M3 - Article

VL - 20

SP - 1

EP - 19

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 95

ER -