Research Article | GENETIC DIVERSITY

Population genomics of picophytoplankton unveils novel chromosome hypervariability


Science Advances  05 Jul 2017:
Vol. 3, no. 7, e1700239
DOI: 10.1126/sciadv.1700239
  • Fig. 1 Polymorphism features on 18 standard chromosomes.

    (A) Site frequency spectrum of total SNPs. (B) Number of observed and expected mutation events involved in MNPs. (C) Distribution of fitness effects on nonsynonymous sites. (D) Linkage disequilibrium (LD) between pairs of SNPs, measured as r², as a function of distance on chromosome 10. Gray line, expected LD for unlinked sites estimated from the site frequency spectrum on intergenic sites.

  • Fig. 2 Variation of the population recombination rate (ρ), nucleotide diversity (π), and GC content (GC) along chromosomes of O. tauri.

    (A) Chromosome 1, a standard chromosome. (B) Chromosome 2. ρ, population-scaled recombination rate (per kilobase) inferred from SNPs using the interval program of the LDhat package; π, nucleotide diversity per site averaged across 2.5-kb windows; GC, GC percent averaged across 10-kb windows; NCR, noncallable regions of at least 1 kb, where polymorphisms could not be called because coverage was either insufficient (<10×) or too high, the latter suggesting multicopy regions that are not represented in the reference sequence. ρ and π are computed using SNPs segregating among the 13 strains for chromosome 1 and among 12 strains for chromosome 2. For chromosome 2, the reference strain RCC4221 was excluded because of its high divergence from the other strains in the candidate MAT locus region; the sequence of chromosome 2 of strain RCC1115 was used instead.

  • Fig. 3 Genetic structure of the two candidate MAT loci on chromosome 2 in O. tauri.

    (A) Genes are indicated by vertical lines. Blue lines, orthologous genes between plus- and minus-type strains; green lines, plus-specific genes; orange lines, minus-specific genes; gray lines, orthologous relationships between genes in different strains. “*” indicates the position of the five orthologous genes used to build the phylogeny from the different strains. (B) Phylogeny of five orthologous housekeeping genes trapped inside the MAT locus region in seven species of Mamiellophyceae (O. tauri, Ostreococcus mediterraneus, Ostreococcus RCC809, Ostreococcus lucimarinus, Bathycoccus prasinos RCC1105, Micromonas pusilla RCC299, and M. pusilla CCMP1545). Bootstrap maximum likelihood (ML) percentages are indicated on nodes.

  • Fig. 4 Chromosome 19 sequence conservation in six O. tauri strains.

    (A) Large synteny blocks along chromosome 19 are represented as rectangles (fig. S6), and red lines indicate the location of strain-specific DNA. Pairwise local alignments between strains are represented in gray (sense) or blue (antisense) (blastn with scores between 50 and 500 and alignment identities >95%). (B) Proportion of the chromosome's DNA in syntenic regions, in rearranged regions, and in strain-specific regions. (C) Taxonomic affiliation of strain-specific sequences using sequence homology against GenBank (tblastx) (69).

  • Fig. 5 Length of chromosome 19 versus susceptibility to prasinoviruses.

    Susceptibility of each strain is estimated as the percentage of the 32 prasinoviruses that lyse it and ranges from 0% (strain is resistant to all 32 viruses) to 100% (strain is lysed by all 32 viruses). Pearson correlation coefficient ρ = −0.74, P < 0.005.
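
The two quantities in the Fig. 5 caption can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The strain names, chromosome lengths, and lysis results below are invented toy values (four hypothetical viruses instead of the 32 in the study), and the function names `susceptibility_percent` and `pearson` are assumptions introduced here for illustration. The sketch computes only the correlation coefficient, not the P value.

```python
from math import sqrt


def susceptibility_percent(lysed_by):
    """Percentage of tested viruses that lyse a strain.

    `lysed_by` is a list of booleans, one per virus (True = strain is lysed).
    0.0 means resistant to all viruses, 100.0 means lysed by all of them.
    """
    if not lysed_by:
        raise ValueError("at least one virus result is required")
    return 100.0 * sum(lysed_by) / len(lysed_by)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy data (NOT from the paper): hypothetical strain -> (chromosome length
# in arbitrary units, lysis outcome for each of four hypothetical viruses).
toy_strains = {
    "strain_a": (280, [True, True, True, True]),
    "strain_b": (300, [True, True, True, False]),
    "strain_c": (320, [True, False, False, False]),
    "strain_d": (340, [False, False, False, False]),
}

lengths = [length for length, _ in toy_strains.values()]
susceptibilities = [susceptibility_percent(lysis) for _, lysis in toy_strains.values()]
r = pearson(lengths, susceptibilities)
print(f"toy Pearson coefficient: {r:.3f}")
```

In this toy data set, longer chromosomes go with lower susceptibility, so the coefficient is negative, which is the same direction as the relationship reported in the caption.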

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/3/7/e1700239/DC1

    fig. S1. Read coverage map of reference strain RCC4221 per chromosome for each strain.

    fig. S2. Size distribution of chromosomes 2 and 19 in 13 O. tauri strains.

    fig. S3. Phylogeny of the 2.5-kb repeat of chromosome 19’s syntenic block A.

    fig. S4. Growth rates of 13 strains with 95% confidence interval at 14° and 20°C.

    fig. S5. Sampling area and sampling sites of water samples used to isolate the 13 strains of O. tauri.

    fig. S6. Dotplot of chromosome 19 between RCC1115 and RCC1559, RCC4221, RCC1108, and RCC1123.

    fig. S7. Venn diagram depicting the number of shared or specific polymorphic sites called by each calling method.

    fig. S8. Large (>100 bp) indel read mapping features.

    fig. S9. Phylogenetic distance tree of 13 O. tauri strains based on 117,600 biallelic SNPs.

    table S1. Read mapping and polymorphism statistics for each strain.

    table S2. Nucleotide diversity and number of segregating and nonsegregating sites for five types of polymorphisms (synonymous, fourfold degenerate, nonsynonymous, intergenic, and intronic).

    table S3. Fitting the models to the site frequency spectrum on fourfold degenerate sites.

    table S4. Analysis of gene content of predicted and validated insertions using sequence homology (blastx).

    table S5. Functional description of 10 predicted and validated deletions.

    table S6. Viral susceptibility and resistance spectrum of O. tauri strains and a collection of 32 prasinoviruses.

    table S7. Geographic origin and sampling date of the 13 O. tauri strains.

    table S8. Variants, SNPs, and indel polymorphisms for each calling method.

    table S9. Features of the four resequenced regions.

    table S10. SNP calling validation for GATK UG, GENO, and HC by Sanger resequencing.

    table S11. Insertion detection steps.

    table S12. Insertion assembly steps.

    table S13. Deletion detection steps.

    table S14. Position and validation status of 18 predicted insertions.

    table S15. Position and validation status of 12 predicted deletions (>500 bp).

    table S16. Pairwise growth rates by Tukey test between strains at 20° and 14°C.

    References (70–74)
