From asymmetrical to balanced genomic diversification during rediploidization: Subgenomic evolution in allotetraploid fish


Science Advances  27 May 2020:
Vol. 6, no. 22, eaaz7677
DOI: 10.1126/sciadv.aaz7677
  • Fig. 1 Workflow of genome assembly and subgenome identification.

    (A) Genome size and karyotype of goldfish. (a) Image of a gynogenetic goldfish (C. auratus var.). Photo credit: Shaojun Liu, Hunan Normal University, China. (b) Diagram of C value. The X axis presents the fluorescence index, and the Y axis presents the frequency of cells. The sample/calibration ratio equals the peak X value of the target sample divided by the peak X value of the calibration sample. The first sharp peak (green dashed line) marks the fluorescence index and cell frequency of chicken blood, and the second (red dashed line) marks those of goldfish. The C value of the sample is the sample/calibration ratio × the calibration sample’s C value. (c) Goldfish have 100 chromosomes and show 100 signals after the chromosomes are stained with a DNA probe (probe A) [9468-bp fragment of 36 copies of a repetitive 263-bp fragment; adapted from Liu et al. (11)]. (B) Sequencing technologies for primary assembly. (C) Genome assembly, Hi-C clustering, and genetic map construction. Genome size was assessed by k-mer analysis of 40× Illumina paired-end reads after the primary assembly. Next, scaffolds were clustered into 50 pseudochromosomes using Hi-C (chromosome conformation capture) data; the genetic map was constructed using the data of Kuang et al. (27). (D) Annotation and chromosome-scale organization. Scaffolds were annotated by combining ab initio prediction, transcript evidence gathered from RNA-seq of embryos and eight kinds of adult tissues (gonads, brain, liver, spleen, kidney, eye, epithelium, and fin), and homologous gene information from five fish genomes, integrated with EVidenceModeler (EVM). The final set of 50 pseudochromosomes was generated after pairwise validation among Hi-C clustering results, the genetic map, and collinearity analyses. (E) Subgenome identification. After extracting the homologous genes of goldfish and other species, the species tree was constructed using single-copy genes from 10 genomes.
Gene trees were constructed by defining homologous gene clusters using whole-genome sequences/transcripts from 10 cyprinid species of Cyprininae (C. auratus, Cyprinus carpio), Labeoninae (Labeo rohita), Poropuntiinae (Poropuntius huangchuchieni), Schizothoracinae (Schizothorax oconnori, Schizothorax waltoni, Schizothorax macropogon, and Schizothorax kozlovi), Danio rerio, and Ctenopharyngodon idellus. After comparing the species tree with the nuclear gene trees, the matrilineal (clustered with Schizothorax) and patrilineal markers from the gene trees were mapped back to the 25 pairs of pseudochromosomes. The origin of each pseudochromosome was assigned according to the majority of its supporting markers.
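
The C-value arithmetic in panel (A, b) and the majority-marker rule in panel (E) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the peak positions, the 1.25 reference value, and the marker labels are all hypothetical, and the ratio is taken as sample peak over calibration peak so that it agrees with the final C-value formula in the caption.

```python
from collections import Counter


def c_value(sample_peak, calibration_peak, calibration_c_value):
    """C value = (sample peak / calibration peak) x calibration C value."""
    ratio = sample_peak / calibration_peak
    return ratio * calibration_c_value


def assign_origin(marker_labels):
    """Return the majority label ('M' or 'P') and its supporting rate."""
    counts = Counter(marker_labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(marker_labels)


# Illustrative numbers only (not measurements from the paper).
goldfish_c = c_value(sample_peak=150.0, calibration_peak=100.0,
                     calibration_c_value=1.25)

# Hypothetical markers on one pseudochromosome: 8 matrilineal, 2 patrilineal.
origin, support = assign_origin(["M"] * 8 + ["P"] * 2)
print(goldfish_c, origin, support)
```

The supporting rate returned here corresponds in spirit to the numbers shown beside the M and P chromosomes in Fig. 2B; how ties or ambiguous markers were handled is not stated in the caption, so this sketch simply takes the most common label.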

  • Fig. 2 Syntenic blocks, phylogeny, subgenome identification, and allotetraploidy.

    (A) Diagrams displaying a rearrangement identified in the syntenic blocks between our assembly and the goldfish genome published by Chen et al. (26). An inversion event was identified in a syntenic block between the two assemblies (a); the rearranged boundary in our genome was continuously covered by the long reads of the optical map (b) and supported by the strong clustering signal in the heat map (c) constructed from high-depth chromosome conformation capture sequencing data. (B) Allotetraploid goldfish and common carp genome synteny. Blocks represent the genomic overview of assembled chromosomes of subgenome M and subgenome P from goldfish and subgenome B and subgenome A from common carp (Hebao red carp) (29). Colored lines indicate the orthologous sites of gene blocks and their collinear relationships between subgenomes M and B and subgenomes P and A, respectively. Numbers M01 to M25, P01 to P25, B01 to B25, and A01 to A25 indicate the homologous chromosomes of the M and P subgenomes from goldfish and the B and A subgenomes from common carp, respectively, numbered in order according to their collinearity relationships to the zebrafish genome (fig. S3). The numbers beside the M and P chromosomes indicate the supporting rate from homoeologous gene markers whose parental origin was clearly determined by the gene trees. (C) Phylogenetic relationships and timing of WGD/polyploidization events in Cyprininae, with nodes based on protein-coding genes of goldfish, common carp, golden-line barbel, grass carp, and zebrafish. The divergence of grass carp and the ancestor of Cyprininae was dated at 20.9 Ma ago, and the divergence of the putative matrilineal and patrilineal progenitors was dated at 15.1 Ma ago (T1), followed by the WGD event (T2). Divergence of the polyploid Cyprininae radiation was dated at 13.8 Ma ago (T3), and the divergence between goldfish and common carp was dated at 10.0 Ma ago (T4). Orange and blue branches indicate the putative M and P progenitors, respectively.
Photo credit: Goldfish and common carp by Shaojun Liu, Hunan Normal University, China; golden-line barbel and grass carp from FishBase; zebrafish from Wikimedia. (D) Phylogenetic tree based on protein-coding genes from single-copy orthologs, rooted with human and chicken. Alignments were performed by MUSCLE, and the maximum likelihood tree was reconstructed by PhyML.

  • Fig. 3 Syntenic analysis and symmetrical evolution of the allopolyploid goldfish genome.

    (A) Syntenic blocks between goldfish and common carp and syntenic blocks between goldfish and zebrafish with color-labeled rearrangements in two examples. The other 23 groups of syntenic analysis are shown in fig. S3. All 50 pseudochromosomes of goldfish show clear two-to-one orthologous relationships to the 25 chromosomes of zebrafish. Orange bars and numbers mark the chromosomes from matrilineal (M and B) subgenomes, while blue denotes the patrilineal (P and A) ones. Gray bars and numbers mark the chromosomes of zebrafish. Light gray lines indicate the syntenic blocks, light orange lines indicate the rearrangements between goldfish and common carp, and the pink lines indicate the rearrangements between goldfish and zebrafish. The red and black bars on each chromosome indicate the biased repeat densities, in which the red ones indicate the syntenies with significantly higher repeat densities relative to the black ones (P < 0.05, Wilcoxon rank sum test). (B) Boxplots of distributions of (a) GC content and (b) repeat density in consistent syntenies, rearranged regions, and the boundaries of rearranged regions. Orange and blue boxes mark the values from matrilineal (M and B) and patrilineal (P and A) subgenomes, respectively. Boxes with black lines indicate the distributions of goldfish, and the ones with dashed lines indicate the common carp. (C) Distributions of consecutive gene retentions in subgenomes M and P, which do not differ from each other significantly (Fisher’s exact test, P = 0.35). (D) Dating the time of pseudogene formation in subgenomes M and P of goldfish, subgenomes B and A of common carp, as described in Supplementary Methods and Analysis 6. X axis represents an estimated time of pseudogene formation; Y axis represents the frequencies of pseudogenes in every 0.5-Ma unit. Orange lines display the frequencies of pseudogenes on subgenomes M or B along time (X axis), and blue lines display the frequencies of pseudogenes on subgenomes P or A. 
Gray bars indicate the timing of allopolyploidization (13.8 to 15.1 Ma ago). (a) and (b) display the distributions of all pseudogene frequencies from each subgenome over time in goldfish and common carp; (c) and (d) display pseudogenization events specific to each subgenome in goldfish and common carp. (E) Boxplots display the distributions of Ka/Ks ratios for (a) 7568 homoeologous gene pairs between subgenomes M and P and (b) 7568 homoeologous gene pairs between subgenomes B and A, each calculated against zebrafish and grass carp. The central line in each boxplot indicates the median value. The bottom and top edges of the box indicate the 25th and 75th percentiles, and the dashed lines extend 1.5 times the interquartile range beyond the edges of the box.
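
The binning behind panel (D), frequencies of pseudogenes in every 0.5-Ma unit, can be sketched as follows. This is a minimal illustration rather than the method of Supplementary Methods and Analysis 6: the function name and the ages are hypothetical, and "frequency" is assumed here to mean the fraction of pseudogenes falling in each bin.

```python
import math
from collections import Counter


def bin_frequencies(ages_ma, bin_width=0.5):
    """Group ages into fixed-width bins; return {bin_start: fraction of ages}."""
    counts = Counter(math.floor(age / bin_width) * bin_width for age in ages_ma)
    total = len(ages_ma)
    return {start: counts[start] / total for start in sorted(counts)}


# Hypothetical pseudogene formation times (Ma) for one subgenome;
# these are not data from the paper.
ages_m = [0.2, 0.4, 1.1, 1.3, 1.4, 2.6, 13.9, 14.2]
freq_m = bin_frequencies(ages_m)
for start, freq in freq_m.items():
    print(f"{start:4.1f}-{start + 0.5:4.1f} Ma: {freq:.3f}")
```

Computing one such dictionary per subgenome and plotting the two against the same time axis gives curves of the kind drawn in orange and blue in the figure.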

  • Fig. 4 Spatiotemporal expression patterns of homoeologous gene pairs and DNA methylation levels.

    (A) Boxplot of log10(TPM_M/TPM_P) for homoeologous gene pairs showing medians in six adult tissues of goldfish. Red dashed line shows the equal ratio of log10(1). No adult tissue shows an expression bias between genes stemming from subgenome M or P. (B) Boxplot of log10[(TPM_M + 0.1)/(TPM_P + 0.1)] for homoeologous gene pairs showing medians in 15 developmental stages of goldfish [16 cell, 32 cell, 64 cell, 128 cell, 1000 cell; 8, 12, 16, 18, 24, 30, 46, 64, 71, and 84 hours post-fertilization (hpf)]. Red dashed line shows the equal ratio of log10(1). Time points from the 64-cell to 22-somite stages show biased expression of M homoeologs, which average 4.8% higher expression than P homoeologs within homoeologous gene pairs; early embryos (16- and 32-cell stages), pharyngula, and hatching period embryos show no expression bias. (C) Boxplot of log10[(TPM_B + 0.1)/(TPM_A + 0.1)] for homoeologous gene pairs from common carp showing medians at zygotically controlled developmental time points. Red dashed line shows the equal ratio of log10(1). Time points from 32-cell to germ ring, 25% otic vesicle closure (OVC), long pec, and pec fin stages show biased B-homoeolog expression; other stages show no expression bias. (D) Expression patterns of 9090 homoeologous gene pairs from goldfish in which expression is biased toward either M or P homoeologs (EBM or EBP; 6223 genes in total, 68.46%) when gene pairs are coexpressed in at least one developmental stage. (E) Expression trend of 9090 homoeologous gene pairs from goldfish displaying an expression shift between two homoeologs (ES; 1644 genes in total, 18.09%), where one copy is expressed higher than the other at earlier time points and the other copy surpasses it in later developmental stages. (F) Expression patterns of 4241 homoeologous gene pairs from common carp in which expression is biased toward either B or A homoeologs (2811 genes in total, 66.28%) when homoeologous gene pairs are coexpressed in at least one developmental stage.
(G) Expression patterns of 4241 homoeologous gene pairs from common carp that show an expression shift between the two homoeologs (414 genes in total, 9.76%); one homoeologous gene copy is expressed higher than the other at an earlier time point, and the other copy surpasses it later in development. Patterns include two groups: genes (185, 4.36%) with ES before the germ ring stage and genes (229, 5.40%) with ES after the germ ring stage, the latter accounting for most ES genes. Among the ES genes, 32 (0.75%) have more than two time shifts. (H) Comparison of DNA methylation levels between the two subgenomes in brain and liver tissues. (I) Comparison of DNA methylation levels between the two subgenomes in embryos of 12 developmental stages.
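
The log10 ratio with a 0.1 pseudocount used in panels (B) and (C), and the distinction between biased expression (EBM/EBP) and expression shift (ES), can be illustrated with a short Python sketch. The classification rule below is a hypothetical simplification for illustration; the caption does not give the thresholds or statistical criteria the authors applied, and the function names and TPM values are invented.

```python
import math


def log_ratio(tpm_m, tpm_p, pseudocount=0.1):
    """log10((TPM_M + pseudocount) / (TPM_P + pseudocount)) for one stage."""
    return math.log10((tpm_m + pseudocount) / (tpm_p + pseudocount))


def classify_pair(tpm_m_series, tpm_p_series):
    """Label one homoeologous pair from its per-stage log ratios.

    Hypothetical rule (not the paper's criteria): 'EBM' if M is higher at
    every stage with a nonzero ratio, 'EBP' if P is higher at every such
    stage, 'ES' if the higher copy changes between stages, and 'unbiased'
    if the two copies are equal at all stages.
    """
    ratios = [log_ratio(m, p) for m, p in zip(tpm_m_series, tpm_p_series)]
    signs = [1 if r > 0 else -1 for r in ratios if r != 0]
    if not signs:
        return "unbiased"
    if all(s == 1 for s in signs):
        return "EBM"
    if all(s == -1 for s in signs):
        return "EBP"
    return "ES"


# Invented TPM values across three developmental stages.
print(classify_pair([10, 12, 8], [5, 6, 4]))   # M copy higher throughout
print(classify_pair([10, 5, 1], [2, 5, 9]))    # higher copy switches from M to P
```

The pseudocount keeps the ratio defined when one copy has zero TPM at a given stage, which is common in early embryos.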

  • Table 1 Assembly statistics for the three goldfish assembly versions in this paper, the goldfish assembly of Chen et al. (26), and the common carp assembly of Xu et al. (29).

    NGS, next-generation sequencing; LG, linkage group.

    Metric | V. 1 (NGS) | V. 2 (PacBio + optical map) | V. 3 (PacBio + optical map + Hi-C) | Goldfish, Chen et al. 2019 | Hebao red carp, Xu et al. 2019
    Total contig size (Gb) | 1.22 | 1.54 | 1.49 | 1.85 | 1.41
    Contig no. | 636,670 | 6,144 | 4,433 | 8,463 | 355,804
    Longest contig (bp) | 74,023 | 7,661,673 | 7,650,526 | - | 207,110
    Contig N50 (Mb) | 0.004 | 1.11 | 1.16 | 0.82 | 0.02
    Contig L50 | - | 375 | 353 | 513 | 19,142
    Total scaffold size (Gb) | 1.46 | 1.64 | 1.59 | 1.82 | 1.46
    Scaffold/chr no. | 358,782 | 5,477 | 50 | 6,216 | 262,449
    Longest scaffold (kb) | 5,762 | 11,070 | 65,905 | - | 6,571
    Scaffold N50 (Mb) | 0.46 | 2.94 | 34.79 | - | 0.92
    Scaffold L50 | - | 160 | 18 | - | 393
    Scaffold N90 (kb) | 1.24 | 96.06 | 24,821.24 | - | 22.24
    Scaffold L90 | - | 1,452 | 40 | - | 3,290
    Anchor ratio on chr/LG | - | - | 97%/65% | 66% | 82%
    Gap length on chr/LG (Mb) | - | - | 90 | 0.22 | 43
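
For readers unfamiliar with the N50/L50 and N90/L90 metrics in Table 1, the following Python sketch shows the standard definition: sort sequences from longest to shortest and find the point where the running total reaches 50% (or 90%) of the assembly. The function name and the toy lengths are hypothetical; the exact tool used to compute the table values is not stated here.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig or scaffold lengths.

    Nx is the length of the sequence at which the cumulative sum of
    lengths, sorted longest first, reaches `fraction` of the total;
    Lx is the number of sequences needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy assembly of seven scaffolds (arbitrary units), total length 300.
toy = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy)            # cumulative 80, 150 -> reached at the 2nd scaffold
print(n50, l50)
```

A higher N50 together with a lower L50, as in the step from V. 2 to V. 3, indicates that fewer and longer sequences cover half of the assembly.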
  • Table 2 The numbers and ratios of homoeologous gene pairs with each expression pattern in goldfish and common carp.

    EBM, expression bias toward matrilineal (M and B) homoeologs; EBP, expression bias toward patrilineal (P and A) homoeologs; ES, expression shift between two homoeologs.

    Species: Goldfish | Common carp
    Trend 1 (columns: N, % | N, % | N, % | N, %)
    Trend 2 (columns: N, % | N, %)
      ES total no.: 1644, 18.09 | 414, 9.76
      Group 1: ES before germ ring stage
      Group 2: ES after germ ring stage

Supplementary Materials

  • Supplementary Materials

    From asymmetrical to balanced genomic diversification during rediploidization: Subgenomic evolution in allotetraploid fish

    Jing Luo, Jing Chai, Yanling Wen, Min Tao, Guoliang Lin, Xiaochuan Liu, Li Ren, Zeyu Chen, Shigang Wu, Shengnan Li, Yude Wang, Qinbo Qin, Shi Wang, Yun Gao, Feng Huang, Lu Wang, Cheng Ai, Xiaobo Wang, Lianwei Li, Chengxi Ye, Huimin Yang, Mi Luo, Jie Chen, Hong Hu, Liujiao Yuan, Li Zhong, Jing Wang, Jian Xu, Zhenglin Du, Zhanshan (Sam) Ma, Robert W. Murphy, Axel Meyer, Jianfang Gui, Peng Xu, Jue Ruan, Z. Jeffrey Chen, Shaojun Liu, Xuemei Lu, Yaping Zhang


    The PDF file includes:

    • Supplementary Methods and Analysis
    • Figs. S1 to S8
    • Table S1
    • Legends for data S1 to S4
    • References

