BIOL0034 Model Organisms PDF

Title BIOL0034 Model Organisms
Author Hannah Macpherson
Course Applications in Human Genetics
Institution University College London
Pages 7



Description

Model organisms

eQTL: a locus that explains a fraction of the genetic variance of a gene expression phenotype.

Survey of models
- Mouse, rat and zebrafish: 25-26k genes. All are used for human disease models and knockouts, plus either genetic mapping (mouse, rat) or development (zebrafish).
- The rat is physically larger and more intelligent than the mouse.
- The human genome is around 3,000 cM, but the zebrafish genome is about half that size because humans have centromeres in the middle.
- C. elegans: 23k genes. Development, apoptosis, neural maps, RNAi; eutelic (fixed number of somatic cells). wormbase.org. Sydney Brenner.
- Drosophila melanogaster: 15.6k genes. Genetic maps, crossing over, development, gene function, behaviour. flybase.org
- Dictyostelium, slime mould (amoeba): 12.5k genes. Cell differentiation, chemotaxis, apoptosis; haploid. dictybase.org
- Saccharomyces cerevisiae, budding yeast: 5.8k genes. Diploid/haploid. yeastgenome.org
- Schizosaccharomyces pombe, fission yeast: 5k genes. Cell cycle, RNAi; haploid/diploid. pombase.org
- Arabidopsis, thale cress: 25k genes. Standard plant model. arabidopsis.org
- Zea mays, maize: 40k genes. Transposons. maizegdb.org
- Also bacteria (e.g. E. coli), viruses, Neurospora crassa (filamentous fungus), farm animals.

Ageing models
- Saccharomyces cerevisiae (budding yeast): cheap, best genetics. Not an animal, so does it age like animals do?
- Caenorhabditis elegans (nematode): cheap, shows ageing, 2-3 week lifespan, no inbreeding effects. Not a mammal.
- Drosophila melanogaster: longer lived than C. elegans (6-8 weeks), shows inbreeding effects, more expensive than C. elegans. Organ complexity and genes closer to mammals.
- Mus musculus (house mouse): mammal, good genetics, 2-3 year lifespan, shows inbreeding effects. Expensive to do lifespan tests.

2) Inbred strains, crosses and recombinant inbred lines

Inbred strains
- a useful characteristic of model organisms
- almost 100% homozygous and therefore breed true
- unlimited genetically identical individuals can be tested
- inbred strains are available from stock centres
- haploid organisms (yeast, slime moulds) are effectively inbred
- typically one inbred strain is designated as the reference: C57BL/6J in mice, N2 in worms, BN in rats, Col-0 in Arabidopsis
- the reference strain usually corresponds to the genome sequence of the model organism
- by contrast, the reference human genome differs from that of any individual
- in most models, non-reference inbred strains are also available
- many strains have been re-sequenced, so a near-complete catalogue of segregating genetic variation is available
- the ancestry of inbred strains varies from species to species
- inbred mouse strains: their genetic compositions are historical artefacts of breeding and selection (for docility etc.), and they are unlike wild mice, which are outbred
- inbred Arabidopsis strains (aka accessions) are close to those collected in the wild (ecotypes). Experiments on different accessions reveal effects of population genetics and natural selection

Intercrosses
- the main ones you’d be interested in are either:
  - generating recombinant inbred lines: cross two inbred strains, cross the F1 progeny together, then cross the F2 progeny
  - generating partly heterozygous individuals (I don’t know why you’d want to do this) by crossing an F1 heterozygote back to one of the original inbred strains (a backcross). You end up with only a section of the genome heterozygous

Selfing: C. elegans, Arabidopsis
- inbreed through repeated selfing
- start from a heterozygous individual
- the fraction of the genome that is heterozygous is 1/2^N after N generations
- 1/2^6 = 0.0156, so ~6 generations is sufficient for 98% homozygosity
- results in homozygous mosaics of the two founder chromosomes: recombinant inbred lines (RILs)

Making recombinant inbred lines: brother-sister mating
- most animals and plants can’t self-fertilise but can be inbred by repeated brother-sister mating
- slower, since up to 4 alleles segregate at each locus
- fraction of heterozygotes ~0.809017^N after N generations
- 0.809017^20 = 0.014, so 20 generations is theoretically sufficient for >98% homozygosity
- in practice, most outcrossing species don’t inbreed easily.
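The two decay rates quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the simple geometric approximation used in the notes (heterozygosity multiplied by a constant rate each generation); the function names are made up for illustration, and 0.809017 is taken to be (1 + sqrt(5)) / 4.

```python
import math

# Per-generation multipliers for the heterozygous fraction (geometric approximation).
SELFING_RATE = 0.5                     # selfing halves heterozygosity each generation
SIB_RATE = (1 + math.sqrt(5)) / 4      # ~0.809017 for brother-sister mating


def remaining_heterozygosity(rate, generations):
    """Fraction of the genome still heterozygous after the given number of generations."""
    return rate ** generations


def generations_needed(rate, target):
    """Smallest number of generations that brings heterozygosity below the target."""
    n = 0
    while remaining_heterozygosity(rate, n) >= target:
        n += 1
    return n


if __name__ == "__main__":
    print(remaining_heterozygosity(SELFING_RATE, 6))   # the 1/2^6 figure
    print(remaining_heterozygosity(SIB_RATE, 20))      # the 0.809017^20 figure
    print(generations_needed(SELFING_RATE, 0.02))      # selfing, <2% heterozygous
    print(generations_needed(SIB_RATE, 0.02))          # sib mating, <2% heterozygous
```

Under this approximation, sib mating needs roughly three times as many generations as selfing to reach the same level of homozygosity, which is why it is described as slower.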
- many lines die out (fertility decreases with inbreeding), and lethal homozygous loci must be maintained as heterozygotes, e.g. the mouse agouti locus

Recombinant inbred lines (RILs)
- panels of RILs exist for many model organisms (mice, rats, worms, flies, plants)
- genotyping costs are minimal as each RIL is only genotyped once
- many panels of RILs are available from stock centres
- different researchers use the same RILs to answer different questions and share results in a common database, e.g. the BXD panel of mouse RILs, with data collated by GeneNetwork.org
- precision and utility of genetic mapping in RILs is broadly similar to that in F2s, except that dominance is undetectable as all genotypes are homozygous
- the amount of recombination in RILs is around 1 or 2 recombinants per chromosome, so mapping resolution isn’t great, but there is a lot of power to detect eQTLs. A good way to find the approximate region of the genome where genes are causing a particular trait, but lousy for identifying the particular gene

3) Genetic mapping

Genetic mapping with inbred strains
- a major use of model organisms is for genetic mapping
- typical experimental designs include:
  - comparison between existing inbred strains (GWAS)
  - crosses between inbred strains
  - recombinant inbred lines

Statistical revision for QTL mapping

- Yi is the phenotype, where i indexes the individual or strain (could take the average of multiple measurements)
- µ1 is the underlying mean; the sample could contain mice with different backgrounds or environments
- the chance of detecting a QTL depends on how strongly the trait is associated with the SNP and on the amount of variance explained by the variant
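The equation these symbols belong to did not survive in the notes. As an assumption, a standard single-SNP association model that fits the description is sketched below; the symbols beta, g_i and e_i are not from the original, and the mean is written as plain mu.

```latex
Y_i = \mu + \beta\, g_i + e_i, \qquad e_i \sim N(0, \sigma^2)

\text{fraction of variance explained} = \frac{\beta^2 \,\mathrm{Var}(g)}{\mathrm{Var}(Y)}
```

Here g_i would be the SNP genotype (allele dosage) of individual or strain i and beta the effect of the variant. Testing beta = 0 is the association test, and its power grows with the fraction of variance explained.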

- if the individuals are unequally related then the p-values for association with each SNP will tend to be inflated.

- a mixed model can be used to correct for inflation:
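The formula that followed this line is missing from the notes. A common form of the mixed model used in GWAS, given here as an assumption rather than the lecture's exact notation, is:

```latex
\mathbf{Y} = \mu \mathbf{1} + \beta\, \mathbf{g} + \mathbf{u} + \mathbf{e},
\qquad \mathbf{u} \sim N(\mathbf{0}, \sigma_g^2 K),
\qquad \mathbf{e} \sim N(\mathbf{0}, \sigma_e^2 I)
```

In this sketch K is a genetic relationship matrix between individuals, so the random effect u absorbs phenotypic similarity due to unequal relatedness and the test of beta = 0 for each SNP is no longer inflated by it.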

Example: mapping QTLs in natural accessions of Arabidopsis
- naturally occurring accessions of Arabidopsis:
  - are inbred and highly variable (1 SNP per 200 bp)
  - have extensive population structure
  - are under strong natural selection
- ~200 inbred naturally occurring accessions collected worldwide were phenotyped for ~100 traits
- accessions were genotyped at 250k SNPs
- GWAS performed, controlling for unequal relatedness
- high mapping resolution due to historical recombination
- false positive associations controlled for using mixed models (e.g. Atwell et al., 2010, Nature)
- correcting for population structure removes false positives
- Arabidopsis has 5 chromosomes and a particularly high mapping resolution (e.g. mapping to a region with only 1 or 2 genes)

QTL mapping using intercrosses and RILs
- relatively little recombination occurs during:
  - 1 generation (approx. one crossover per chromosome arm)
  - inbreeding to make a RIL
- genotyping with ~200 SNPs that segregate between the two founders is sufficient to map QTLs
- map QTLs by linear regression on SNP dosages:
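The regression formula itself is missing from the notes at this point. As an illustration only, the sketch below fits phenotype = mu + beta * dosage by ordinary least squares for one SNP, using simulated RIL-like data. The function name, the simulated effect size and the sample size are all assumptions made for the example, not values from the lecture.

```python
import random


def snp_regression(dosages, phenotypes):
    """Least-squares fit of phenotype = mu + beta * dosage for a single SNP.

    Returns (mu, beta, r2), where r2 is the fraction of phenotypic variance
    explained by the SNP. Assumes the SNP is polymorphic in the sample and
    the phenotype is not constant (otherwise a denominator is zero).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx
    mu = mean_y - beta * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return mu, beta, r2


# Simulated example (hypothetical numbers): 200 RILs, each homozygous for one
# founder allele, so the dosage is 0 or 2 with frequency ~0.5.
random.seed(1)
dosages = [random.choice([0, 2]) for _ in range(200)]
phenotypes = [10.0 + 0.5 * d + random.gauss(0.0, 1.0) for d in dosages]

mu_hat, beta_hat, r2 = snp_regression(dosages, phenotypes)
print(mu_hat, beta_hat, r2)
```

A genome scan would repeat this fit at each of the ~200 segregating SNPs and look for positions where the slope differs significantly from zero.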

- an improvement in power is obtained by interval mapping. This entails imputing what the genotypes would have been between the genotyped SNPs

- a mixed model is unnecessary, since the progeny are equally related
- QTLs are localised to within about 1/3 of a chromosome

- high power to detect QTLs as allele frequencies are ~0.5

4) Populations derived from multiple inbred strains
- recombinant inbred lines are derived from 2 inbred strains; these populations are derived from multiple strains
- in contrast to populations descended from two founder strains, descent from multiple inbred strains can reveal new haplotype-based phenomena

QTL mapping

5) QTL mapping in mice

Origins of inbred strains of mice
- mice probably originated in India
- the house mouse probably spread when humans migrated out of India
- 3 main subspecies of mice
- Keane et al., 2011 looked at the number of SNP differences between each strain and the reference genome. Most strains differ from the reference strain at around 4 million SNPs, similar to humans.

PRDM9
- the mouse subspecies:
  - Mus musculus domesticus (Western Europe, e.g. C57BL/6J)
  - Mus musculus musculus (Eastern Europe and Asia, e.g. PWD)
  - diverged 350,000-500,000 years ago
- PWD females crossed with B6 males have sterile male offspring; males from the reciprocal cross and all female hybrids are fertile
- hybrid sterility depends on interactions between the DNA-binding zinc finger gene Prdm9 on Chr17, Hstx1 on ChrX and unknown loci
- different subspecies of mice carry different Prdm9 alleles
- Prdm9 defines recombination hotspots in humans and mice
- different Prdm9 alleles recognise different DNA binding sites and in turn define different hotspots
- consequently, hotspot locations differ between subspecies

Mouse populations derived from multiple inbred strains

- funnel mating: starting with multiple inbred lines
- circular breeding: literally moving the males from cage to cage
- semi-random breeding: buy commercial mice whose genetics are not well known

QTL mapping susceptibility to aspergillosis in the Collaborative Cross

- shows the usefulness of the Collaborative Cross
- genome scan of susceptibility to Aspergillus, which can be dangerous if you’re immunocompromised

- horizontal lines correspond to different thresholds of genome-wide significance; the top line is 5%
- can map QTLs nicely with this population

Rat heterogeneous stock
- 8 inbred founders
- rotational breeding for 50 generations
- genotyped and phenotyped for many measures, e.g. healing, anxiety, blood biochemistry
- haplotype reconstruction: haplotypes of the offspring are reconstructed as blocks of the original 8 founders, using SNPs and HMMs
- could then test whether an observable trait was associated with a QTL, i.e. do animals have both the haplotype and the trait. Lots of genetic background: a QTL would mostly increase or decrease a phenotype rather than being its sole cause
- 8 founder strains sequenced at the Hubrecht Institute, Netherlands
- SOLiD platform used, 22x coverage
- 7.4M SNPs, 0.7M indels, 0.33M structural variants
- imputed in the HS animals using the haplotype mosaics
- GWAS with a mixed model to control for family structure
- 355 QTLs identified for 118 phenotypes at a 10% false discovery rate
- only 22 QTLs explain more than 15% of variance
- only 10 were major QTLs
- genetic relationship matrix: a way of visualising heterogeneous stocks
- measuring genetic similarity by identity by state (IBS): AA and AA = 2, AA and TT = 0, etc.
- did a GWAS on heterogeneous stock (HS) rats for a diabetes-related trait and got some hits
- HS rats still have missing heritability, like humans
- found Ctnnd2 when looking for anxiety variants: on chromosome 2, associated with behaviour and expressed in the brain
- found some sequence variants for heart weight, which congregate around Shank2

Identifying causative variants
- 66% of QTLs cannot be explained by single variants
- likely explanation: multiple causal variants
- e.g. they looked at glucose tolerance and found that the haplotype was more significant than any imputed sequence variants.
Therefore the QTL is probably haplotype-based, since no single variant can explain the variation
- around half of the QTLs we map aren’t explained by a single causal variant

Epigenetics
- can mean:
  - reversible impacts of the environment on the genome, e.g. DNA methylation silencing gene expression
  - parental imprinting, e.g. where certain genes are always expressed from either the maternal or the paternal allele
- in neither case does DNA sequence variation cause the phenotypic variation
- Arabidopsis can be used to investigate the impact of differential DNA methylation on complex traits

Arabidopsis epiRILs
- epiRILs: a population of Arabidopsis epigenetic recombinant inbred lines that have almost identical DNA sequences but segregate many differences in DNA methylation
- to derive this population, a homozygote for the ddm1-2 mutation is crossed with a near-isogenic WT
- the ddm1-2 mutation leads to a loss of DNA methylation and silencing over transposable elements (TEs)

- some DNA methylation and expression changes induced by ddm1-2 are inherited independently of the mutation

- therefore recombinant inbred lines formed from a WT x ddm1-2 cross have little segregating DNA variation but have variable methylation patterns, comprising alternating blocks of high and low methylation depending on the ancestry of the block

epiQTL mapping in Arabidopsis epiRILs
- genotype the lines, but measuring methylation rather than sequence: the "genotype" is the methylation state at a given place
- map as if measuring genotypes, so can map QTLs
- methylation is also affected by environment, so want to do it partly based on season/climate; can then map QTLs
- example paper: Cortijo et al., 2014

Gene knockouts
- major model organisms have libraries of gene knockouts covering most of the gene space
- knockouts can be ordered from stock centres
- convenient for worms (frozen), Arabidopsis (seeds) and yeasts
- usually cheaper and faster than making knockouts from scratch, e.g. by CRISPR/Cas9, but this may change
- using the same gene knockout for different experiments means that the phenotype data can be accumulated in public databases
- in the mouse, a systematic phenotyping effort is under way (Hrabe de Angelis et al., 2015)

Reverse and forward genetics
- forward genetics aims to identify the genes responsible for a given phenotype: going from phenotype to gene
- GWAS and QTL mapping are forms of forward genetics that examine natural variation
- mutagenesis is another form of forward genetics, where random mutations are induced by radiation or chemical means and the causal mutations are then discovered, e.g. by whole-genome sequencing
- makes the implicit assumption that DNA variations can be assigned to genes
- reverse genetics aims to discover the function of a gene from the phenotypic effects of specific engineered gene sequences, i.e. knock it out and see what happens

Gene knockouts
- are the effects of knocking out single genes whilst keeping the genetic background fixed comparable to those of inbred strains, in which millions of noncoding differences segregate?
- classical inbred strains of mice differ at 4 million SNPs
- wild-derived strains differ at around 17 million SNPs, including over 200 “natural” knockouts
- inbred strains are subject to selection, removing deleterious variants
- 30% of all mouse knockouts decrease bodyweight

Why not just use humans?

UK Biobank
- recruited 500,000 people aged 40-70 from across the UK
- subjects have undergone measures, provided blood, urine and saliva samples for future analysis, given detailed information about themselves and agreed to have their health followed
- genetic data, which includes genotyped and imputed data (73 million SNPs, indels and large structural variants) on 150,000 participants, was made available in May 2015
- genotypes will be made available from all 500,000 participants in Q2 2016
- access to genotypes, phenotypes and samples is assessed by a UK Biobank committee

Human cell lines
- HGDP-CEPH Human Genome Diversity Cell Line Panel
- from 1050 individuals in 52 world populations

- banked at the Fondation Jean Dausset-CEPH in Paris
- can be used for cell-based assays, e.g. the ENCODE project

- ethically impossible to perform many experiments on humans
- animal experimentation on vertebrates and cephalopods is tightly regulated in the UK
- scientifically impossible to replicate experiments on humans under different conditions whilst keeping genetics constant

- many scientific questions are best answered using simplified models, or where genetic manipulation is easy

- relevance of a particular model to a human phenotype varies
- care must be taken to choose the right model, e.g. worms and flies don’t use DNA methylation...

