Report

Teaser, summary, work performed and final results

Periodic Reporting for period 2 - GUPPYSEX (Evolutionary genetics of guppy sex chromosomes)

Teaser

The question addressed: the hypothesis that sexually antagonistic polymorphisms on the guppy sex chromosomes have selected for reduced genetic recombination between the two members of this chromosome pair, and whether such selection is causing ongoing changes in...

Summary

The question addressed: the hypothesis that sexually antagonistic polymorphisms on the guppy sex chromosomes have selected for reduced genetic recombination between the two members of this chromosome pair, and whether such selection is causing ongoing changes in recombination.

The hypothesis to be tested is widely believed to be true, but empirical data have largely been lacking, because it is difficult to test. The project aims to contribute such data in a species that is often cited as supporting the hypothesis. In doing so, it is contributing to the development of approaches for detecting regions of sex linkage in genomes, generating the first detailed genetic map of the study species, and testing high-throughput genotyping approaches in a non-model organism. These approaches could aid studies of other such organisms in which questions of biological interest have been beyond the reach of existing methods.

Specific objectives, as in the original project.
1) Ascertain and sequence X- and Y-linked genes in guppies.
2) Establish a dense genetic map of the sex chromosome pair.
3) Estimate the age of the guppy sex chromosome system (in Poecilia reticulata), using X-Y gene pairs.
4) Use population genetic data to distinguish between fully sex-linked genes and genes in the pseudo-autosomal region (PAR) that are closely linked to the fully sex-linked region, and to test whether there is ongoing evolution of the recombination rate between PAR genes and the male-specific region of the Y (MSY) within P. reticulata.
5) Study the genetic control of male coloration phenotypes: map the genes to the sex chromosomes, including the pseudo-autosomal region (PAR), and to the autosomes, distinguish fully Y-linked genes from PAR genes, and estimate allele frequencies in natural populations with low and high predation rates.
6) Ascertain X-linked genes and estimate the proportion that have retained Y-linked copies, to estimate the extent of genetic degeneration.

Work performed

Scientific Report
The report below outlines the experiments and analyses conducted so far, and their results, roughly in the order of the objectives of the proposal. As explained in detail below, we have made substantial progress on objectives 1, 2, 4 and 5.

Objective 1: identifying the sex-linked region of the Poecilia reticulata (guppy) genome, and ascertaining and sequencing X- and Y-linked genes.
Based on previously published work on the species, the fully sex-linked region was expected to be on chromosome 12 (the guppy has 23 acrocentric chromosome pairs). The Edinburgh group therefore first searched for male-specific sequence variants in a candidate region where male-specific heterochromatin has been identified cytologically, near the tip of chromosome 12 in the P. reticulata assembly. We used primers based on the complete genome sequence of a female individual from the Guanapo river, Caroni drainage, Trinidad (1). PCR amplification of sequences located in this region used a sample of 10 males and 6 females from a captive population derived from a high-predation population collected from the Aripo river (also in the Caroni drainage) and maintained as a large population by project co-investigators (Wilson, Croft) at the University of Exeter. High-predation populations are expected to have high genetic diversity (based on previously published studies on Trinidadian guppies), and should provide abundant markers for genetic mapping (objective 2 of the project). The sex chromosomes of such populations are thought to recombine less than those found in up-river sites with lower predation, maximizing the chances of detecting sex linkage. The sexes of all the fish were ascertained at maturity.
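The genotype pattern expected for a fully sex-linked variant in a male-heterogametic (XY) system can be written as a simple check. This is only a minimal sketch, not the project's actual pipeline: the function name and the tuple representation of genotypes are hypothetical, and it assumes a biallelic site where the X carries one allele and the Y the other. In a small sample, a site can fit this pattern by chance, so it is a screen rather than proof of sex linkage.

```python
def fits_full_sex_linkage(male_genotypes, female_genotypes):
    """Return True if a site shows the simplest XY pattern.

    Genotypes are (allele, allele) tuples (hypothetical representation).
    Expected pattern: every male is heterozygous, every female is
    homozygous for the same (X-linked) allele, and every male carries
    that allele alongside a second, putatively Y-linked, allele.
    """
    # All males must be heterozygous.
    males_het = all(a != b for a, b in male_genotypes)

    # All females must carry a single shared allele.
    female_alleles = {allele for gt in female_genotypes for allele in gt}
    if not males_het or len(female_alleles) != 1:
        return False

    # Each male must carry the females' allele on his X chromosome.
    x_allele = next(iter(female_alleles))
    return all(x_allele in gt for gt in male_genotypes)


# Sample sizes as in the text: 10 males and 6 females.
males = [("A", "G")] * 10
females = [("A", "A")] * 6
print(fits_full_sex_linkage(males, females))
```

A site at which even one female is heterozygous, or one male homozygous, fails the check, which is the sense in which no variant "suggesting complete sex linkage" would be detected.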

Unexpectedly, these PCR experiments detected no variants with genotypes suggesting complete sex linkage in any chromosome 12 genes. This suggests that fully sex-linked variants may be restricted to a small genome region. Our second aim was to estimate a genetic map of the sex chromosome pair using densely spaced genetic markers. To obtain markers for this goal, and to test the hypothesis that the guppy fully sex-linked region might be very small, we obtained paired-end, high-coverage genome sequences from the same set of males and females as described above. The sequencing and initial bioinformatic analysis, including mapping the reads to the published female guppy reference genome mentioned above and generating VCF files with variants that passed stringent controls on quality and depth of coverage, were done by Edinburgh Genomics. The mean insert size of the sequences was 468 bp, and at least 97% of them mapped to the reference genome in each of the 16 individuals. Coverage is high for most sites (not just on the basis of the genome-wide mean, which can obscure the presence of low-coverage sites); the lower 99th percentile values averaged 13, and exceeded 9 for all individuals. Overall, therefore, the aim of obtaining high-coverage results was achieved, yielding sequence data for a sample of 22 X and 10 Y chromosomes (two X chromosomes from each of the 6 females, and one X and one Y from each of the 10 males).
(i) The guppy Y chromosome is not genetically degenerated (objective 6), and may have evolved recently: Almost all sites in chromosome 12 sequences had similar coverage in both sexes, similar to the genome-wide coverage value. Two alleles are therefore present in both sexes for most sequences on the XY pair, implying that sequences have not been lost in the fully Y-linked region in a process of genetic degeneration like that in old-established sex chromosomes.
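The coverage argument can be illustrated with a short calculation. The sketch below is an assumption-laden illustration, not the analysis actually run: the function names, the tolerance, and the input values are hypothetical, and depths are assumed to be already normalised by each individual's genome-wide mean. A male:female ratio near 1 indicates two copies in both sexes, while a ratio near 0.5 is what would be expected if the Y copy had been lost, so that males carry only the X copy.

```python
from statistics import mean


def male_female_depth_ratio(male_depths, female_depths):
    """Ratio of mean normalised depth in males to that in females for one region."""
    return mean(male_depths) / mean(female_depths)


def classify_region(ratio, tolerance=0.2):
    """Label a region from its male:female depth ratio.

    The tolerance is an arbitrary illustrative threshold, not a value
    taken from the report.
    """
    if abs(ratio - 1.0) <= tolerance:
        return "two copies in both sexes"
    if abs(ratio - 0.5) <= tolerance:
        return "consistent with loss of the Y-linked copy"
    return "unclear"


# Hypothetical normalised depths for one chromosome 12 window.
ratio = male_female_depth_ratio([1.02, 0.98, 1.00], [1.00, 1.00])
print(classify_region(ratio))
```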
Objective 3, estimating the age of the guppy fully sex-linked region using divergence between X- and Y-linked sequences, has not yet been achieved because, as described below, we have not been able to identify any extensive fully sex-linked region.

Our sequencing yielded large numbers of single nucleotide polymorphisms (SNPs), providing excellent genetic markers, as well as other sequence variants (insertions and deletions). The SNPs were used for genetic mapping a

Final results

Objective 2: Genetic mapping
Currently, we have genetic mapping results from only one low-predation population, because these fish mature slowly, but further families have been generated and the fish have either reached maturity, or will soon do so. Families from several more populations, including both high and low predation sites, are in hand and will be used in further genetic mapping. These will fulfil the objective of comparing genetic maps in male meiosis between populations with different predation levels (to test for the expected greater recombination rate in low- than high-predation populations).

With the 165 progeny genotyped so far, the recombination rate estimate, based on observing no recombinants in regions proximal to 24.5 Mb in the pooled progeny from the male parents of all families, is 0.4 cM. In the coming months, we will increase the progeny numbers to estimate this rate better. The current upper 99% confidence limit is 2.8 cM, not inconsistent with the cytological estimate of 5% recombinants. It is therefore possible that some recombination occurs in parts of chromosome 12 some distance from its tip; in other words, the guppy XY pair may have two PARs, one at the tip and another, more proximal, that rarely undergoes crossover events. A crossover rate of 5% in such a PAR2 could potentially explain the published results for male coloration factors, which suggest recombination rates of at most 10% with the male-determining gene. If so, this suggests that some of these factors are in genes in proximal locations.
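The report does not state how these figures were calculated. One standard way to obtain values of this size is an exact binomial bound for zero observed recombinants, sketched below under the assumptions that the 165 meioses are independent and that, for such small values, the recombination fraction (as a percentage) can be read directly as cM. The function name is hypothetical.

```python
def upper_bound_zero_recombinants(n_progeny, confidence):
    """Largest recombination fraction r still compatible, at the given
    one-sided confidence level, with seeing no recombinants in n progeny.

    Solves (1 - r) ** n = 1 - confidence for r.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_progeny)


n = 165
# 50% bound (a median-unbiased point estimate) and 99% upper limit, in cM.
print(round(100 * upper_bound_zero_recombinants(n, 0.50), 1))
print(round(100 * upper_bound_zero_recombinants(n, 0.99), 1))
```

With 165 progeny, the 50% and 99% bounds from this calculation round to 0.4 and 2.8 cM, the same values as quoted in the text. Because the bound shrinks roughly in proportion to 1/n, genotyping more progeny tightens it directly.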

We will also genetically map several chromosomes other than 12 in further families, to obtain detailed assessments of where recombination events occur on these chromosomes, and to add markers physically close to both ends.
We will also define the PAR in more detail. The marker at 25.3 Mb will be mapped in all families in which the male parents have informative genotypes (if not, other markers in this region will be developed and mapped). Further markers between this and 26 Mb will also be mapped. Our current results are based on comparing the overall total map lengths in the LAH family, which had the largest number of mapped markers, including markers located near each end of the chromosome 12 physical assembly. For this chromosome, the value in male meiosis was 65% of the value in female meiosis. Two autosomes (9 and 18) yielded data for female meiosis, but we have not yet succeeded in mapping terminal markers in the female parents of two of them, while no terminal marker has yet been mapped in male meiosis for chromosome 1. If we assume that the female map lengths are 50 cM, the male values for chromosomes 9 and 18 (Table 2) would be 80 and 84%, respectively, of the female lengths. We will map more markers to test further whether the reduction in map length in male relative to female meiosis is indeed significantly larger for chromosome 12 than for the autosomes.
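The comparison rests on simple ratios, which the sketch below makes explicit. The 50 cM female length is the assumption stated in the text (it corresponds to one crossover per chromosome pair per meiosis); the male lengths printed here are back-calculated from the quoted percentages for illustration only and are not values taken from Table 2. The function name is hypothetical.

```python
ASSUMED_FEMALE_MAP_CM = 50.0  # assumption stated in the text, not a measured value


def implied_male_length_cm(male_fraction, female_cm=ASSUMED_FEMALE_MAP_CM):
    """Male map length implied by a male:female ratio and a female map length."""
    return male_fraction * female_cm


# Male:female ratios quoted in the text for the two autosomes.
for chromosome, fraction in [(9, 0.80), (18, 0.84)]:
    print(chromosome, round(implied_male_length_cm(fraction), 1))
```

On the same scale, the 65% ratio reported for chromosome 12 is lower than either autosomal value, which is the difference that further markers are intended to test.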

Objective 4: Population genomic analyses
In addition to the further genetic mapping outlined above, we plan to supplement these analyses by phasing the variants in sequences from males, to assign them to the X or Y haplotypes. We will test several of the software packages currently available. Specifically, we plan to test approaches that take account of SNPs within individual reads (called “read-aware” approaches), rather than methods, such as Beagle, designed for human genome sequences. The human genome has very low diversity, so that reads rarely include more than a single SNP. Our sequences, however, indicate high diversity (even in the captive population so far studied), with SNP densities on all chromosomes of around 20 per kilobase, and somewhat higher on chromosome 12 (as expected if this chromosome includes some male-specific or male-associated variants). Initially, we plan to test two “read-aware” approaches, implemented in the Hap-Cut and SHAPEIT2 programs, which can run such analyses in relatively short times, and have low ph