Molecular clock

From Wikipedia, the free encyclopedia

The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA or RNA, or amino acid sequences for proteins.

Early discovery and genetic equidistance

The notion of a so-called "molecular clock" was first attributed to Émile Zuckerkandl and Linus Pauling who, in 1962, noticed that the number of amino acid differences in hemoglobin between different lineages changes roughly linearly with time, as estimated from fossil evidence.[1] They generalized this observation to assert that the rate of evolutionary change of any specified protein was approximately constant over time and over different lineages (known as the molecular clock hypothesis).

The genetic equidistance phenomenon was first noted in 1963 by Emanuel Margoliash, who wrote: "It appears that the number of residue differences between cytochrome c of any two species is mostly conditioned by the time elapsed since the lines of evolution leading to these two species originally diverged. If this is correct, the cytochrome c of all mammals should be equally different from the cytochrome c of all birds. Since fish diverges from the main stem of vertebrate evolution earlier than either birds or mammals, the cytochrome c of both mammals and birds should be equally different from the cytochrome c of fish. Similarly, all vertebrate cytochrome c should be equally different from the yeast protein."[2] For example, the difference between the cytochrome c of a carp and that of a frog, turtle, chicken, rabbit, and horse is a very constant 13% to 14%. Similarly, the difference between the cytochrome c of a bacterium and that of yeast, wheat, moth, tuna, pigeon, and horse ranges from 64% to 69%. Together with the work of Émile Zuckerkandl and Linus Pauling, the genetic equidistance result led directly to the formal postulation of the molecular clock hypothesis in the early 1960s.[3]

Similarly, Vincent Sarich and Allan Wilson in 1967 demonstrated that molecular differences among modern Primates in albumin proteins showed that approximately constant rates of change had occurred in all the lineages they assessed.[4] The basic logic of their analysis involved recognizing that if one species lineage had evolved more quickly than a sister species lineage since their common ancestor, then the molecular differences between an outgroup (more distantly related) species and the faster-evolving species should be larger (since more molecular changes would have accumulated on that lineage) than the molecular differences between the outgroup species and the slower-evolving species. This method is known as the relative rate test. Sarich and Wilson's paper reported, for example, that human (Homo sapiens) and chimpanzee (Pan troglodytes) albumin immunological cross-reactions suggested they were about equally different from Ceboidea (New World Monkey) species (within experimental error). This meant that they had both accumulated approximately equal changes in albumin since their shared common ancestor. This pattern was also found for all the primate comparisons they tested. When calibrated with the few well-documented fossil branch points (such as no Primate fossils of modern aspect found before the K-T boundary), this led Sarich and Wilson to argue that the human-chimp divergence probably occurred only ~4–6 million years ago.[5]
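The logic of the relative rate test can be sketched in a few lines of Python. This is only an illustration: the function names and the distance values are hypothetical, and the three-point formula used to split the distances assumes they are additive along the tree, which real immunological or sequence distances only approximate.

```python
def relative_rate_difference(d_a_out: float, d_b_out: float) -> float:
    """Difference between the outgroup distances of sister lineages A and B.

    A value near zero is what a clock-like process predicts; a positive value
    suggests that more change accumulated on the lineage leading to A.
    """
    return d_a_out - d_b_out


def lineage_changes(d_ab: float, d_a_out: float, d_b_out: float) -> tuple[float, float]:
    """Change on each sister lineage since the common ancestor of A and B.

    Uses the three-point formula, which assumes additive distances.
    """
    change_a = (d_ab + d_a_out - d_b_out) / 2
    change_b = (d_ab + d_b_out - d_a_out) / 2
    return change_a, change_b


# Hypothetical distances (arbitrary units), not values from Sarich and Wilson.
clock_like = lineage_changes(d_ab=0.10, d_a_out=0.30, d_b_out=0.30)
faster_a = lineage_changes(d_ab=0.10, d_a_out=0.34, d_b_out=0.30)

print(clock_like)  # both lineages carry roughly the same amount of change
print(faster_a)    # lineage A carries more change than lineage B
```

In the first case the two outgroup distances match, so the change is split evenly between the sister lineages; in the second, the excess outgroup distance is attributed entirely to lineage A.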

Relationship with neutral theory

The observation of a clock-like rate of molecular change was originally purely phenomenological. Later, the work of Motoo Kimura[6] developed the neutral theory of molecular evolution, which predicted a molecular clock. Let there be $N$ individuals, and to keep this calculation simple, let the individuals be haploid (i.e. have one copy of each gene). Let the rate of neutral mutations (i.e. mutations with no effect on fitness) in a new individual be $\mu$. The probability that a given new mutation will become fixed in the population is then $1/N$, since each copy of the gene is as good as any other. Every generation, each individual can have new mutations, so there are $N\mu$ new neutral mutations in the population as a whole. That means that each generation, $N\mu \cdot (1/N) = \mu$ new neutral mutations will become fixed. If most changes seen during molecular evolution are neutral, then fixations in a population will accumulate at a clock-rate that is equal to the rate of neutral mutations in an individual.
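The argument above can be written as a single line. Here $\mu$ denotes the neutral mutation rate per individual per generation and $N$ the population size, as in the text; the symbol $k$ for the rate of fixation per generation is introduced here only for illustration.

```latex
% Requires amsmath for \text and \underbrace labels.
k \;=\; \underbrace{N\mu}_{\text{new neutral mutations per generation}}
  \;\times\;
  \underbrace{\frac{1}{N}}_{\text{fixation probability of each}}
  \;=\; \mu
```

The population size cancels, so under these assumptions the clock rate does not depend on $N$. The same cancellation occurs for diploids, where $2N\mu$ new mutations each fix with probability $1/(2N)$.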

Calibration

To use molecular clocks to estimate divergence times, molecular clocks need to be "calibrated". This is because molecular data alone do not contain any information on absolute times. For viral phylogenetics and ancient DNA studies (two areas of evolutionary biology where it is possible to sample sequences over an evolutionary timescale), the dates of the intermediate samples can be used to calibrate the molecular clock. However, most phylogenies require that the molecular clock be calibrated using independent evidence about dates, such as the fossil record.[7] There are two general methods for calibrating the molecular clock using fossils: node calibration and tip calibration.[8]
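A minimal sketch of why calibration is needed, assuming a strict clock and a single dated divergence: a genetic distance only becomes a time once it is divided by a rate, and the rate has to come from an independently dated split. The function names and all numbers below are hypothetical.

```python
def calibrate_rate(distance: float, divergence_time_myr: float) -> float:
    """Per-lineage rate (substitutions per site per Myr) from one dated split.

    The factor of 2 reflects that change accumulates along both lineages
    after they diverge.
    """
    if divergence_time_myr <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_myr)


def divergence_time(distance: float, rate: float) -> float:
    """Divergence time in Myr implied by a distance under a strict clock."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Hypothetical calibration: two taxa differ by 0.10 substitutions per site
# and independent evidence dates their split at 50 Myr.
rate = calibrate_rate(distance=0.10, divergence_time_myr=50.0)

# The calibrated rate then dates another pair that differs by 0.02.
print(divergence_time(distance=0.02, rate=rate))
```

With these made-up values the calibrated rate is about 0.001 substitutions per site per Myr per lineage, which places the second split at roughly 10 Myr. Real analyses replace the single fixed rate with the statistical models described in the following sections.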

Node calibration

Sometimes referred to as node dating, node calibration is a method for time-scaling phylogenetic trees by specifying time constraints for one or more nodes in the tree. Early methods of clock calibration only used a single fossil constraint (e.g. non-parametric rate smoothing),[9] but newer methods (BEAST[10] and r8s[11]) allow for the use of multiple fossils to calibrate molecular clocks. The oldest fossil of a clade is used to constrain the minimum possible age for the node representing the most recent common ancestor of the clade. However, due to incomplete fossil preservation and other factors, clades are typically older than their oldest fossils.[8] In order to account for this, nodes are allowed to be older than the minimum constraint in node calibration analyses. However, determining how much older the node is allowed to be is challenging. There are a number of strategies for deriving the maximum bound for the age of a clade, including those based on birth-death models, fossil stratigraphic distribution analyses, or taphonomic controls.[12] Alternatively, instead of a maximum and a minimum, a probability density can be used to represent the uncertainty about the age of the clade. These calibration densities can take the shape of standard probability densities (e.g. normal, lognormal, exponential, gamma) that can be used to express the uncertainty associated with divergence time estimates.[10] Determining the shape and parameters of the probability distribution is not trivial, but there are methods that use not only the oldest fossil but a larger sample of the fossil record of clades to estimate calibration densities empirically.[13] Studies have shown that increasing the number of fossil constraints increases the accuracy of divergence time estimation.[14]
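One way to picture a calibration density is as a lognormal distribution shifted so that it starts at the age of the oldest fossil: ages younger than the fossil get zero probability, and older ages become progressively less likely. The sketch below is an assumption-laden illustration using only the standard library; the function name, parameter names, and values are hypothetical and do not reproduce the parameterization of BEAST or any other package.

```python
import math


def offset_lognormal_logpdf(age: float, min_age: float, mu: float, sigma: float) -> float:
    """Log-density of a node age under a lognormal prior offset by min_age.

    min_age is the hard minimum from the oldest fossil; mu and sigma are the
    mean and standard deviation of log(age - min_age).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    excess = age - min_age
    if excess <= 0:
        return -math.inf  # the node cannot be younger than its oldest fossil
    z = (math.log(excess) - mu) / sigma
    return -math.log(excess * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


# Hypothetical clade whose oldest fossil is 50 Myr old.
for candidate_age in (49.0, 51.0, 60.0, 100.0):
    print(candidate_age, offset_lognormal_logpdf(candidate_age, min_age=50.0, mu=1.0, sigma=1.0))
```

The choice of `mu` and `sigma` controls how far beyond the fossil age the prior places appreciable weight, which is the part the text describes as not trivial to determine.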

Tip calibration

Sometimes referred to as tip dating, tip calibration is a method of molecular clock calibration in which fossils are treated as taxa and placed on the tips of the tree. This is achieved by creating a matrix that includes a molecular dataset for the extant taxa along with a morphological dataset for both the extinct and the extant taxa.[12] Unlike node calibration, this method reconstructs the tree topology and places the fossils simultaneously. Molecular and morphological models work together simultaneously, allowing morphology to inform the placement of fossils.[8] Tip calibration makes use of all relevant fossil taxa during clock calibration, rather than relying on only the oldest fossil of each clade. This method does not rely on the interpretation of negative evidence to infer maximum clade ages.[12]

Expansion calibration

Demographic changes in populations can be detected as fluctuations in historical coalescent effective population size from a sample of extant genetic variation in the population using coalescent theory.[15][16][17] Ancient population expansions that are well documented and dated in the geological record can be used to calibrate a rate of molecular evolution in a manner similar to node calibration. However, instead of calibrating from the known age of a node, expansion calibration uses a two-epoch model of constant population size followed by population growth, with the time of transition between epochs being the parameter of interest for calibration.[18][19] Expansion calibration works at shorter, intraspecific timescales in comparison to node calibration, because expansions can only be detected after the most recent common ancestor of the species in question. Expansion dating has been used to show that molecular clock rates can be inflated at short timescales (< 1 Myr)[18] due to incomplete fixation of alleles, as discussed below.[20][21]

Total evidence dating

This approach to tip calibration goes a step further by simultaneously estimating fossil placement, topology, and the evolutionary timescale. In this method, the age of a fossil can inform its phylogenetic position in addition to morphology. By allowing all aspects of tree reconstruction to occur simultaneously, the risk of biased results is decreased.[8] This approach has been improved upon by pairing it with different models. One current method of molecular clock calibration is total evidence dating paired with the fossilized birth-death (FBD) model and a model of morphological evolution.[22] The FBD model is novel in that it allows for "sampled ancestors", which are fossil taxa that are the direct ancestor of a living taxon or lineage. This allows fossils to be placed on a branch above an extant organism, rather than being confined to the tips.[23]

Methods

Bayesian methods can provide more appropriate estimates of divergence times, especially if large datasets, such as those yielded by phylogenomics, are employed.[24]

Non-constant rate of molecular clock

Sometimes only a single divergence date can be estimated from fossils, with all other dates inferred from that. Other sets of species have abundant fossils available, allowing the hypothesis of constant divergence rates to be tested. DNA sequences experiencing low levels of negative selection showed divergence rates of 0.7–0.8% per Myr in bacteria, mammals, invertebrates, and plants.[25] In the same study, genomic regions experiencing very high negative or purifying selection (encoding rRNA) were considerably slower (1% per 50 Myr, i.e. 0.02% per Myr).

In addition to such variation in rate with genomic position, since the early 1990s variation among taxa has proven fertile ground for research too,[26] even over comparatively short periods of evolutionary time (for example mockingbirds[27]). Tube-nosed seabirds have molecular clocks that on average run at half the speed of many other birds,[28] possibly due to long generation times, and many turtles have a molecular clock running at one-eighth the speed it does in small mammals, or even slower.[29] Effects of small population size are also likely to confound molecular clock analyses. Researchers such as Francisco J. Ayala have more fundamentally challenged the molecular clock hypothesis.[30][31][32] According to Ayala's 1999 study, five factors combine to limit the application of molecular clock models:

  • Changing generation times (If the rate of new mutations depends at least partly on the number of generations rather than the number of years)
  • Population size (Genetic drift is stronger in small populations, and so more mutations are effectively neutral)
  • Species-specific differences (due to differing metabolism, ecology, evolutionary history, ...)
  • Change in function of the protein studied (can be avoided in closely related species by utilizing non-coding DNA sequences or emphasizing silent mutations)
  • Changes in the intensity of natural selection.

[Figure: phylogram showing three groups, one of which has strikingly longer branches than the other two.] Woody bamboos (tribes Arundinarieae and Bambuseae) have long generation times and lower mutation rates, as expressed by short branches in the phylogenetic tree, than the fast-evolving herbaceous bamboos (Olyreae).

Molecular clock users have developed workaround solutions using a number of statistical approaches including maximum likelihood techniques and later Bayesian modeling. In particular, models that take into account rate variation across lineages have been proposed in order to obtain better estimates of divergence times. These models are called relaxed molecular clocks[33] because they represent an intermediate position between the 'strict' molecular clock hypothesis and Joseph Felsenstein's many-rates model[34] and are made possible through MCMC techniques that explore a weighted range of tree topologies and simultaneously estimate parameters of the chosen substitution model. It must be remembered that divergence dates inferred using a molecular clock are based on statistical inference and not on direct evidence.

The molecular clock runs into particular challenges at very short and very long timescales. At long timescales, the problem is saturation. When enough time has passed, many sites have undergone more than one change, but it is impossible to detect more than one. This means that the observed number of changes is no longer linear with time, but instead flattens out. Even at intermediate genetic distances, with phylogenetic data still sufficient to estimate topology, signal for the overall scale of the tree can be weak under complex likelihood models, leading to highly uncertain molecular clock estimates.[35]
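Saturation can be illustrated with the Jukes-Cantor model of nucleotide substitution. That model is not discussed in the text above; it is used here only because it is the simplest one with a closed-form relationship between the true number of substitutions per site and the proportion of sites observed to differ. The function names are hypothetical.

```python
import math


def expected_observed_difference(true_distance: float) -> float:
    """Expected proportion of differing sites under Jukes-Cantor.

    true_distance is the actual number of substitutions per site, counting
    repeated changes at the same site.
    """
    return 0.75 * (1.0 - math.exp(-4.0 * true_distance / 3.0))


def jukes_cantor_distance(observed_difference: float) -> float:
    """Corrected distance from an observed proportion of differing sites."""
    if not 0.0 <= observed_difference < 0.75:
        raise ValueError("observed difference must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * observed_difference / 3.0)


# The observed difference tracks the true distance at first, then flattens
# out towards 0.75 as multiple changes pile up at the same sites.
for d in (0.01, 0.1, 0.5, 1.0, 2.0, 5.0):
    print(d, round(expected_observed_difference(d), 4))
```

Near the plateau, a small change in the observed difference corresponds to a very large change in the corrected distance, which is one reason date estimates become unreliable for deep divergences.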

At very short timescales, many differences between samples do not represent fixation of different sequences in the different populations. Instead, they represent alternative alleles that were both present as part of a polymorphism in the common ancestor. The inclusion of differences that have not yet become fixed leads to a potentially dramatic inflation of the apparent rate of the molecular clock at very short timescales.[21][36]

Uses

The molecular clock technique is an important tool in molecular systematics, macroevolution, and phylogenetic comparative methods. Estimation of the dates of phylogenetic events, including those not documented by fossils, such as the divergences between living taxa, has allowed the study of macroevolutionary processes in organisms that had limited fossil records. Phylogenetic comparative methods rely heavily on calibrated phylogenies.

References

  1. Zuckerkandl E, Pauling (1962). "Molecular disease, evolution, and genic heterogeneity". In Kasha, M., Pullman, B (eds.). Horizons in Biochemistry. Academic Press, New York. pp. 189–225.
  2. Margoliash E (October 1963). "Primary Structure and Evolution of Cytochrome C". Proceedings of the National Academy of Sciences of the United States of America. 50 (4): 672–679. Bibcode:1963PNAS...50..672M. doi:10.1073/pnas.50.4.672. PMC 221244. PMID 14077496.
  3. Kumar S (August 2005). "Molecular clocks: four decades of evolution". Nature Reviews. Genetics. 6 (8): 654–662. doi:10.1038/nrg1659. PMID 16136655. S2CID 14261833.
  4. Sarich VM, Wilson AC (July 1967). "Rates of albumin evolution in primates". Proceedings of the National Academy of Sciences of the United States of America. 58 (1): 142–148. Bibcode:1967PNAS...58..142S. doi:10.1073/pnas.58.1.142. PMC 335609. PMID 4962458.
  5. Sarich VM, Wilson AC (December 1967). "Immunological time scale for hominid evolution". Science. 158 (3805): 1200–1203. Bibcode:1967Sci...158.1200S. doi:10.1126/science.158.3805.1200. JSTOR 1722843. PMID 4964406. S2CID 7349579.
  6. Kimura M (February 1968). "Evolutionary rate at the molecular level". Nature. 217 (5129): 624–626. Bibcode:1968Natur.217..624K. doi:10.1038/217624a0. PMID 5637732. S2CID 4161261.
  7. Benton MJ, Donoghue PC (January 2007). "Paleontological evidence to date the tree of life". Molecular Biology and Evolution. 24 (1): 26–53. doi:10.1093/molbev/msl150. PMID 17047029.
  8. Donoghue PC, Yang Z (July 2016). "The evolution of methods for establishing evolutionary timescales". Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 371 (1699): 20160020. doi:10.1098/rstb.2016.0020. PMC 4920342. PMID 27325838.
  9. Sanderson M (1997). "A nonparametric approach to estimating divergence times in the absence of rate constancy". Molecular Biology and Evolution. 14 (12): 1218–1231. doi:10.1093/oxfordjournals.molbev.a025731. S2CID 17647010.
  10. Drummond AJ, Suchard MA, Xie D, Rambaut A (August 2012). "Bayesian phylogenetics with BEAUti and the BEAST 1.7". Molecular Biology and Evolution. 29 (8): 1969–1973. doi:10.1093/molbev/mss075. PMC 3408070. PMID 22367748.
  11. Sanderson MJ (January 2003). "r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock". Bioinformatics. 19 (2): 301–302. doi:10.1093/bioinformatics/19.2.301. PMID 12538260.
  12. O'Reilly JE, Dos Reis M, Donoghue PC (November 2015). "Dating Tips for Divergence-Time Estimation". Trends in Genetics. 31 (11): 637–650. doi:10.1016/j.tig.2015.08.001. hdl:1983/ba7bbcf4-1d51-4b74-a800-9948edb3bbe6. PMID 26439502.
  13. Claramunt, S (2022). "CladeDate: Calibration information generator for divergence time estimation". Methods in Ecology and Evolution. 13 (11). Wiley: 2331–2338. Bibcode:2022MEcEv..13.2331C. doi:10.1111/2041-210x.13977. ISSN 2041-210X. S2CID 252353611.
  14. Zheng Y, Wiens JJ (April 2015). "Do missing data influence the accuracy of divergence-time estimation with BEAST?". Molecular Phylogenetics and Evolution. 85 (1): 41–49. doi:10.1016/j.ympev.2015.02.002. PMID 25681677. S2CID 3895351.
  15. Rogers AR, Harpending H (May 1992). "Population growth makes waves in the distribution of pairwise genetic differences". Molecular Biology and Evolution. 9 (3): 552–569. doi:10.1093/oxfordjournals.molbev.a040727. PMID 1316531.
  16. Shapiro B, Drummond AJ, Rambaut A, Wilson MC, Matheus PE, Sher AV, et al. (November 2004). "Rise and fall of the Beringian steppe bison". Science. 306 (5701): 1561–1565. Bibcode:2004Sci...306.1561S. doi:10.1126/science.1101074. PMID 15567864. S2CID 27134675.
  17. Li H, Durbin R (July 2011). "Inference of human population history from individual whole-genome sequences". Nature. 475 (7357): 493–496. doi:10.1038/nature10231. PMC 3154645. PMID 21753753.
  18. Crandall ED, Sbrocco EJ, Deboer TS, Barber PH, Carpenter KE (February 2012). "Expansion dating: calibrating molecular clocks in marine species from expansions onto the Sunda Shelf Following the Last Glacial Maximum". Molecular Biology and Evolution. 29 (2): 707–719. doi:10.1093/molbev/msr227. PMID 21926069.
  19. Hoareau TB (May 2016). "Late Glacial Demographic Expansion Motivates a Clock Overhaul for Population Genetics". Systematic Biology. 65 (3): 449–464. doi:10.1093/sysbio/syv120. hdl:2263/53371. PMID 26683588.
  20. Ho SY, Tong KJ, Foster CS, Ritchie AM, Lo N, Crisp MD (September 2015). "Biogeographic calibrations for the molecular clock". Biology Letters. 11 (9): 20150194. doi:10.1098/rsbl.2015.0194. PMC 4614420. PMID 26333662.
  21. Ho SY, Phillips MJ, Cooper A, Drummond AJ (July 2005). "Time dependency of molecular rate estimates and systematic overestimation of recent divergence times". Molecular Biology and Evolution. 22 (7): 1561–1568. doi:10.1093/molbev/msi145. PMID 15814826.
  22. Heath TA, Huelsenbeck JP, Stadler T (July 2014). "The fossilized birth-death process for coherent calibration of divergence-time estimates". Proceedings of the National Academy of Sciences of the United States of America. 111 (29): E2957–E2966. arXiv:1310.2968. Bibcode:2014PNAS..111E2957H. doi:10.1073/pnas.1319091111. PMC 4115571. PMID 25009181.
  23. Gavryushkina A, Heath TA, Ksepka DT, Stadler T, Welch D, Drummond AJ (January 2017). "Bayesian Total-Evidence Dating Reveals the Recent Crown Radiation of Penguins". Systematic Biology. 66 (1): 57–73. arXiv:1506.04797. doi:10.1093/sysbio/syw060. PMC 5410945. PMID 28173531.
  24. dos Reis M, Inoue J, Hasegawa M, Asher RJ, Donoghue PC, Yang Z (September 2012). "Phylogenomic datasets provide both precision and accuracy in estimating the timescale of placental mammal phylogeny". Proceedings. Biological Sciences. 279 (1742): 3491–3500. doi:10.1098/rspb.2012.0683. PMC 3396900. PMID 22628470.
  25. Ochman H, Wilson AC (1987). "Evolution in bacteria: evidence for a universal substitution rate in cellular genomes". Journal of Molecular Evolution. 26 (1–2): 74–86. Bibcode:1987JMolE..26...74O. doi:10.1007/BF02111283. PMID 3125340. S2CID 8260277.
  26. Douzery EJ, Delsuc F, Stanhope MJ, Huchon D (2003). "Local molecular clocks in three nuclear genes: divergence times for rodents and other mammals and incompatibility among fossil calibrations". Journal of Molecular Evolution. 57 (Suppl 1): S201–S213. Bibcode:2003JMolE..57S.201D. CiteSeerX 10.1.1.535.897. doi:10.1007/s00239-003-0028-x. PMID 15008417. S2CID 23887665.
  27. Hunt JS, Bermingham E, Ricklefs RE (2001). "Molecular systematics and biogeography of Antillean thrashers, tremblers, and mockingbirds (Aves: Mimidae)". Auk. 118 (1): 35–55. doi:10.1642/0004-8038(2001)118[0035:MSABOA]2.0.CO;2. ISSN 0004-8038. S2CID 51797284.
  28. Rheindt, F. E. & Austin, J. (2005). "Major analytical and conceptual shortcomings in a recent taxonomic revision of the Procellariiformes – A reply to Penhallurick and Wink (2004)" (PDF). Emu. 105 (2): 181–186. Bibcode:2005EmuAO.105..181R. doi:10.1071/MU04039. S2CID 20390465.
  29. Avise JC, Bowen BW, Lamb T, Meylan AB, Bermingham E (May 1992). "Mitochondrial DNA evolution at a turtle's pace: evidence for low genetic variability and reduced microevolutionary rate in the Testudines". Molecular Biology and Evolution. 9 (3): 457–473. doi:10.1093/oxfordjournals.molbev.a040735. PMID 1584014.
  30. Ayala FJ (January 1999). "Molecular clock mirages". BioEssays. 21 (1): 71–75. doi:10.1002/(SICI)1521-1878(199901)21:1<71::AID-BIES9>3.0.CO;2-B. PMID 10070256. Archived from the original on 16 December 2012.
  31. Schwartz, J. H. & Maresca, B. (2006). "Do Molecular Clocks Run at All? A Critique of Molecular Systematics". Biological Theory. 1 (4): 357–371. CiteSeerX 10.1.1.534.4502. doi:10.1162/biot.2006.1.4.357. S2CID 28166727.
  32. Pascual-García A, Arenas M, Bastolla U (November 2019). "The Molecular Clock in the Evolution of Protein Structures". Systematic Biology. 68 (6): 987–1002. doi:10.1093/sysbio/syz022. hdl:20.500.11850/373053. PMID 31111152.
  33. Drummond AJ, Ho SY, Phillips MJ, Rambaut A (May 2006). "Relaxed phylogenetics and dating with confidence". PLOS Biology. 4 (5): e88. doi:10.1371/journal.pbio.0040088. PMC 1395354. PMID 16683862.
  34. Felsenstein J (2001). "Taking variation of evolutionary rates between sites into account in inferring phylogenies". Journal of Molecular Evolution. 53 (4–5): 447–455. Bibcode:2001JMolE..53..447F. doi:10.1007/s002390010234. PMID 11675604. S2CID 9791493.
  35. Marshall, D. C., et al. (2016). "Inflation of molecular clock rates and dates: molecular phylogenetics, biogeography, and diversification of a global cicada radiation from Australasia (Hemiptera: Cicadidae: Cicadettini)". Systematic Biology. 65 (1): 16–34.
  36. Peterson GI, Masel J (November 2009). "Quantitative prediction of molecular clock and ka/ks at short timescales". Molecular Biology and Evolution. 26 (11): 2595–2603. doi:10.1093/molbev/msp175. PMC 2912466. PMID 19661199.
