Xiao Ma

Xiao Ma — Postdoc
Joined the group in 2019

Seagrasses (Alismatales) colonized the sea on at least three independent occasions to form the basis of one of the most productive and widespread coastal ecosystems. They are a key functional group along the coasts of all continents except Antarctica and are the only fully marine angiosperms, playing a crucial role in the functioning of coastal marine ecosystems and in global carbon sequestration. To my knowledge, only the Zostera marina genome is available. Zostera marina has lost all of its genes involved in stomatal differentiation, as well as entire pathways for the synthesis and sensing of volatiles, in connection with its marine environment. Moreover, Zostera has also regained functions enabling it to adjust to full salinity. We have sequenced more seagrasses from different lineages to reveal new insights into the genomic losses and gains involved in achieving the structural and physiological adaptations required for a marine lifestyle, and into their convergent evolution.

Ghent University
January 2024 - present: Postdoctoral researcher, Bioinformatics & Evolutionary Genomics, Department of Plant Systems Biology, VIB, Gent, Belgium.
Ghent University
September 2019 - November 2023: PhD student, Bioinformatics & Evolutionary Genomics, Department of Plant Systems Biology, VIB, Gent, Belgium.
Annoroad Gene Technology
July 2018 - May 2019: Bioinformatician; key responsibilities were genome assembly and annotation.
Institute of Botany, Chinese Academy of Sciences
September 2014 - June 2018: Master of Science in Biology, State Key Laboratory of Systematic and Evolutionary Botany.
Northwest A&F University
September 2010 - June 2014: Bachelor of Science in Biology, College of Life Science.


  1. Ma, X., Vanneste, S., Chang, J., Ambrosino, L., Barry, K., Bayer, T., … Van de Peer, Y. (2024). Seagrass genomes reveal ancient polyploidy and adaptations to the marine environment. NATURE PLANTS, 10, 240–255. https://doi.org/10.1038/s41477-023-01608-5
    We present chromosome-level genome assemblies from representative species of three independently evolved seagrass lineages: Posidonia oceanica, Cymodocea nodosa, Thalassia testudinum and Zostera marina. We also include a draft genome of Potamogeton acutifolius, belonging to a freshwater sister lineage to Zosteraceae. All seagrass species share an ancient whole-genome triplication, while additional whole-genome duplications were uncovered for C. nodosa, Z. marina and P. acutifolius. Comparative analysis of selected gene families suggests that the transition from submerged-freshwater to submerged-marine environments mainly involved fine-tuning of multiple processes (such as osmoregulation, salinity, light capture, carbon acquisition and temperature) that all had to happen in parallel, probably explaining why adaptation to a marine lifestyle has been exceedingly rare. Major gene losses related to stomata, volatiles, defence and lignification are probably a consequence of the return to the sea rather than the cause of it. These new genomes will accelerate functional studies and solutions, as continuing losses of the 'savannahs of the sea' are of major concern in times of climate change and loss of biodiversity.
  2. Chang, J., Duong, T. A., Schoeman, C., Ma, X., Roodt, D., Barker, N., … Mizrachi, E. (2023). The genome of the king protea, Protea cynaroides. PLANT JOURNAL, 113(2), 262–276. https://doi.org/10.1111/tpj.16044
    The king protea (Protea cynaroides), an early-diverging eudicot, is the most iconic species from the Megadiverse Cape Floristic Region, and the national flower of South Africa. Perhaps best known for its iconic flower head, Protea is a key genus for the South African horticulture industry and cut-flower market. Ecologically, the genus and the family Proteaceae are important models for radiation and adaptation, particularly to soils with limited phosphorus bio-availability. Here, we present a high-quality chromosome-scale assembly of the P. cynaroides genome as the first representative of the Fynbos biome. We reveal an ancestral Whole-Genome Duplication (WGD) event that occurred in the Proteaceae around the late Cretaceous that preceded the divergence of all crown groups within the family and its extant diversity in all Southern continents. The relatively stable genome structure of P. cynaroides is invaluable for comparative studies and for unveiling paleopolyploidy in other groups, such as the distantly related sister group Ranunculales. Comparative genomics in sequenced genomes of the Proteales shows loss of key arbuscular mycorrhizal symbiosis genes likely ancestral to the Family, and possibly the Order. The P. cynaroides genome empowers new research in plant diversification, horticulture, and adaptation, particularly to nutrient-poor soils.
  3. Ma, X. (2023). Annotation and comparative analysis of seagrass and African crop genomes. Ghent University. Faculty of Sciences, Ghent, Belgium.
    Genome annotation is a cornerstone of any genome project and comprises two steps: first, the identification of gene exon-intron structures, and second, the assignment of gene functions. The structural annotation pipeline typically combines ab initio prediction, protein homology and RNA-sequencing (RNA-seq) evidence, as well as manual curation, although this last step (manual curation) is often overlooked by many researchers. Subsequently, our focus turned to investigating the adaptations that helped shape their evolution, drawing on the insights from the genome annotation. In this thesis, we describe the detailed genome analysis of eight organisms, including whole-genome duplication, chromosome rearrangements, gene losses and gains, expansions and contractions of specific gene families, and their unique evolution.
  4. Xu, Z., Li, Z., Ren, F., Gao, R., Wang, Z., Zhang, J., … Song, J. (2022). The genome of Corydalis reveals the evolution of benzylisoquinoline alkaloid biosynthesis in Ranunculales. PLANT JOURNAL, 111(1), 217–230. https://doi.org/10.1111/tpj.15788
    Species belonging to the order Ranunculales have attracted much attention because of their phylogenetic position as a sister group to all other eudicot lineages and their ability to produce unique yet diverse benzylisoquinoline alkaloids (BIAs). The Papaveraceae family in Ranunculales is often used as a model system for studying BIA biosynthesis. Here, we report the chromosome-level genome assembly of Corydalis tomentella, a species of Fumarioideae-one of the two subfamilies of Papaveraceae. Based on the comparisons of sequenced Ranunculalean species, we present clear evidence of a shared whole-genome duplication (WGD) event that has occurred before the divergence of Ranunculales but after its divergence from other eudicot lineages. The C. tomentella genome enabled us to integrate isotopic labelling and comparative genomics to reconstruct the BIA biosynthetic pathway for both sanguinarine biosynthesis shared by papaveraceous species and the cavidine biosynthesis specific to Corydalis. Also, our comparative analysis revealed that gene duplications, especially tandem gene duplications, underlie the diversification of BIA biosynthetic pathways in Ranunculales. In particular, tandemly duplicated berberine bridge enzyme-like genes appear to be involved in cavidine biosynthesis. In conclusion, our study of the C. tomentella genome provides important insights into the occurrence of WGDs during the early evolution of eudicots as well as into the evolution of BIA biosynthesis in Ranunculales.
  5. Chang, J., Marczuk‐Rojas, J. P., Waterman, C., Garcia‐Llanos, A., Chen, S., Ma, X., … Carretero‐Paulet, L. (2022). Chromosome‐scale assembly of the Moringa oleifera Lam. genome uncovers polyploid history and evolution of secondary metabolism pathways through tandem duplication. PLANT GENOME, 15(3). https://doi.org/10.1002/tpg2.20238
    The African Orphan Crops Consortium (AOCC) selected the highly nutritious, fast growing and drought tolerant tree crop moringa (Moringa oleifera Lam.) as one of the first of 101 plant species to have its genome sequenced and a first draft assembly was published in 2019. Given the extensive uses and culture of moringa, often referred to as the multipurpose tree, we generated a significantly improved new version of the genome based on long-read sequencing into 14 pseudochromosomes equivalent to n = 14 haploid chromosomes. We leveraged this nearly complete version of the moringa genome to investigate main drivers of gene family and genome evolution that may be at the origin of relevant biological innovations including agronomical favorable traits. Our results reveal that moringa has not undergone any additional whole-genome duplication (WGD) or polyploidy event beyond the gamma WGD shared by all core eudicots. Moringa duplicates retained following that ancient gamma events are also enriched for functions commonly considered as dosage balance sensitive. Furthermore, tandem duplications seem to have played a prominent role in the evolution of specific secondary metabolism pathways including those involved in the biosynthesis of bioactive glucosinolate, flavonoid, and alkaloid compounds as well as of defense response pathways and might, at least partially, explain the outstanding phenotypic plasticity attributed to this species. This study provides a genetic roadmap to guide future breeding programs in moringa, especially those aimed at improving secondary metabolism related traits.
  6. Tien, N. Q. D., Ma, X., Man, L. Q., Chi, D. T. K., Huy, N. X., Nhut, D.-T., … Loc, N. H. (2021). De novo whole-genome assembly and discovery of genes involved in triterpenoid saponin biosynthesis of Vietnamese ginseng (Panax vietnamensis Ha et Grushv.). PHYSIOLOGY AND MOLECULAR BIOLOGY OF PLANTS, 27(10), 2215–2229. https://doi.org/10.1007/s12298-021-01076-1
    Vietnamese ginseng (Panax vietnamensis Ha et Grushv.), also known as Ngoc Linh ginseng, is a high-value herb in Vietnam. Vietnamese ginseng has been proven to enhance the immune system and human memory and to have anti-stress, anti-inflammatory, anti-cancer and anti-aging effects. The present study reports the first draft whole genome of Vietnamese ginseng and the identification of potential genes involved in the triterpenoid metabolic pathway. De novo whole-genome assembly was performed successfully from a dataset of approximately 139 Gbps of 394,802,120 high-quality reads to generate 9815 scaffolds with an N50 value of 572,722 bp from the leaf of Vietnamese ginseng. The assembled genome of Vietnamese ginseng is 3,001,967,204 bp long and contains 79,374 gene models. Among them, 55,012 genes (69.30%) were annotated using various public molecular biology databases. The potential genes involved in triterpenoid saponin biosynthesis in Vietnamese ginseng and their metabolic pathway were also predicted. Three genes encoding squalene monooxygenase isozymes in Vietnamese ginseng were cloned, sequenced and characterized. Moreover, expression levels of several key genes involved in terpenoid biosynthesis in different parts of Vietnamese ginseng were also analyzed. SSR markers were detected by various programs from both the full assembly dataset of the Vietnamese ginseng genome and the predicted genes. The present work provides important data on the draft whole genome of Vietnamese ginseng for further studies to understand the role of genes involved in ginsenoside biosynthesis and their metabolic pathway at the molecular level in this rare medicinal species.
  7. Ma, X., Olsen, J. L., Reusch, T. B. H., Procaccini, G., Kudrna, D., Williams, M., … Van de Peer, Y. (2021). Improved chromosome-level genome assembly and annotation of the seagrass, Zostera marina (eelgrass). F1000RESEARCH, 10. https://doi.org/10.12688/f1000research.38156.1
    Background: Seagrasses (Alismatales) are the only fully marine angiosperms. Zostera marina (eelgrass) plays a crucial role in the functioning of coastal marine ecosystems and global carbon sequestration. It is the most widely studied seagrass and has become a marine model system for exploring adaptation under rapid climate change. The original draft genome (v.1.0) of the seagrass Z. marina (L.) was based on a combination of Illumina mate-pair libraries and fosmid-ends. A total of 25.55 Gb of Illumina and 0.14 Gb of Sanger sequence was obtained representing 47.7× genomic coverage. The assembly resulted in ~2000 unordered scaffolds (L50 of 486 kb), a final genome assembly size of 203 Mb, 20,450 protein coding genes and 63% TE content. Here, we present an upgraded chromosome-scale genome assembly and compare v.1.0 and the new v.3.1, reconfirming previous results from Olsen et al. (2016), as well as pointing out new findings. Methods: The same high molecular weight DNA used in the original sequencing of the Finnish clone was used. A high-quality reference genome was assembled with the MECAT assembly pipeline combining PacBio long-read sequencing and Hi-C scaffolding. Results: In total, 75.97 Gb PacBio data was produced. The final assembly comprises six pseudo-chromosomes and 304 unanchored scaffolds with a total length of 260.5 Mb and an N50 of 34.6 Mb, showing high contiguity and few gaps (~0.5%). 21,483 protein-encoding genes are annotated in this assembly, of which 20,665 (96.2%) obtained at least one functional assignment based on similarity to known proteins. Conclusions: As an important marine angiosperm, the improved Z. marina genome assembly will further assist evolutionary, ecological, and comparative genomics at the chromosome level. The new genome assembly will further our understanding into the structural and physiological adaptations from land to marine life.
  8. Hale, I., Ma, X., Melo, A. T. O., Padi, F. K., Hendre, P. S., Kingan, S. B., … Van Deynze, A. (2021). Genomic resources to guide improvement of the shea tree. FRONTIERS IN PLANT SCIENCE, 12. https://doi.org/10.3389/fpls.2021.720670
    A defining component of agroforestry parklands across Sahelo-Sudanian Africa (SSA), the shea tree (Vitellaria paradoxa) is central to sustaining local livelihoods and the farming environments of rural communities. Despite its economic and cultural value, however, not to mention the ecological roles it plays as a dominant parkland species, shea remains semi-domesticated with virtually no history of systematic genetic improvement. In truth, shea's extended juvenile period makes traditional breeding approaches untenable; but the opportunity for genome-assisted breeding is immense, provided the foundational resources are available. Here we report the development and public release of such resources. Using the FALCON-Phase workflow, 162.6 Gb of long-read PacBio sequence data were assembled into a 658.7 Mbp, chromosome-scale reference genome annotated with 38,505 coding genes. Whole genome duplication (WGD) analysis based on this gene space revealed clear signatures of two ancient WGD events in shea's evolutionary past, one prior to the Asterid-Rosid divergence (116-126 Mya) and the other at the root of the order Ericales (65-90 Mya). In a first genome-wide look at the suite of fatty acid (FA) biosynthesis genes that likely govern stearin content, the primary determinant of shea butter quality, relatively high copy numbers of six key enzymes were found (KASI, KASIII, FATB, FAD2, FAD3, and FAX2), some likely originating in shea's more recent WGD event. To help translate these findings into practical tools for characterization, selection, and genome-wide association studies (GWAS), resequencing data from a shea diversity panel was used to develop a database of more than 3.5 million functionally annotated, physically anchored SNPs. Two smaller, more curated sets of suggested SNPs, one for GWAS (104,211 SNPs) and the other targeting FA biosynthesis genes (90 SNPs), are also presented. With these resources, the hope is to support national programs across the shea belt in the strategic, genome-enabled conservation and long-term improvement of the shea tree for SSA.
  9. Wang, X., Chen, S., Ma, X., Yssel, A. E. J., Chaluvadi, S. R., Johnson, M. S., … Van Deynze, A. (2021). Genome sequence and genetic diversity analysis of an under-domesticated orphan crop, white fonio (Digitaria exilis). GIGASCIENCE, 10(3). https://doi.org/10.1093/gigascience/giab013
    Background: Digitaria exilis, white fonio, is a minor but vital crop of West Africa that is valued for its resilience in hot, dry, and low-fertility environments and for the exceptional quality of its grain for human nutrition. Its success is hindered, however, by a low degree of plant breeding and improvement. Findings: We sequenced the fonio genome with long-read SMRT-cell technology, yielding a ~761 Mb assembly in 3,329 contigs (N50, 1.73 Mb; L50, 126). The assembly approaches a high level of completion, with a BUSCO score of >99%. The fonio genome was found to be a tetraploid, with most of the genome retained as homoeologous duplications that differ overall by ~4.3%, neglecting indels. The 2 genomes within fonio were found to have begun their independent divergence ~3.1 million years ago. The repeat content (>49%) is fairly standard for a grass genome of this size, but the ratio of Gypsy to Copia long terminal repeat retrotransposons (~6.7) was found to be exceptionally high. Several genes related to future improvement of the crop were identified including shattering, plant height, and grain size. Analysis of fonio population genetics, primarily in Mali, indicated that the crop has extensive genetic diversity that is largely partitioned across a north-south gradient coinciding with the Sahel and Sudan grassland domains. Conclusions: We provide a high-quality assembly, annotation, and diversity analysis for a vital African crop. The availability of this information should empower future research into further domestication and improvement of fonio.
  10. Ma, X., Vaistij, F. E., Li, Y., Jansen van Rensburg, W. S., Harvey, S., Bairu, M. W., … Denby, K. J. (2021). A chromosome‐level Amaranthus cruentus genome assembly highlights gene family evolution and biosynthetic gene clusters that may underpin the nutritional value of this traditional crop. PLANT JOURNAL, 107(2), 613–628. https://doi.org/10.1111/tpj.15298
    Traditional crops historically provided accessible and affordable nutrition to millions of rural dwellers but have been neglected, with most modern agricultural systems over-reliant on a small number of internationally-traded crops. Traditional crops are typically well-adapted to local agro-ecological conditions and many are nutrient-dense. They can play a vital role in local food systems through enhanced nutrition (especially where diets are dominated by starch crops), food security and livelihoods for smallholder farmers, and a climate-resilient and biodiverse agriculture. Using short-read, long-read and phased sequencing technologies we generated a high-quality chromosome-level genome assembly for Amaranthus cruentus, an under-researched crop with micronutrient- and protein-rich leaves and gluten-free seed, but lacking improved varieties, with respect to productivity and quality traits. The 370.9 Mb genome demonstrates a shared whole genome duplication with a related species, Amaranthus hypochondriacus. Comparative genome analysis indicates chromosomal loss and fusion events following genome duplication that are common to both species, as well as fission of chromosome 2 in A. cruentus alone, giving rise to a haploid chromosome number of 17 (versus 16 in A. hypochondriacus). Genomic features potentially underlying the nutritional value of this crop include two A. cruentus-specific genes with a likely role in phytic acid synthesis (an anti-nutrient), expansion of ion transporter gene families, and identification of biosynthetic gene clusters conserved within the amaranth lineage. The A. cruentus genome assembly will underpin much-needed research and global breeding efforts to develop improved varieties for economically viable cultivation and realisation of the benefits to global nutrition security and agrobiodiversity.