Phylogenetic tree

From Wikipedia, the free encyclopedia
A phylogenetic tree, phylogeny or evolutionary tree is a graphical representation which shows the evolutionary history between a set of species or taxa during a specific time.[1][2] In other words, it is a branching diagram or a tree showing the evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical or genetic characteristics. In evolutionary biology, all life on Earth is theoretically part of a single phylogenetic tree, indicating common ancestry. Phylogenetics is the study of phylogenetic trees. The main challenge is to find a phylogenetic tree representing optimal evolutionary ancestry between a set of species or taxa. Computational phylogenetics (also phylogeny inference) focuses on the algorithms involved in finding the optimal phylogenetic tree in the phylogenetic landscape.[1][2]

Phylogenetic trees may be rooted or unrooted. In a rooted phylogenetic tree, each node with descendants represents the inferred most recent common ancestor of those descendants,[3] and the edge lengths in some trees may be interpreted as time estimates. Each node is called a taxonomic unit. Internal nodes are generally called hypothetical taxonomic units, as they cannot be directly observed. Trees are useful in fields of biology such as bioinformatics, systematics, and phylogenetics. Unrooted trees illustrate only the relatedness of the leaf nodes and do not require the ancestral root to be known or inferred.

History

The idea of a tree of life arose from ancient notions of a ladder-like progression from lower into higher forms of life (such as in the Great Chain of Being). Early representations of "branching" phylogenetic trees include a "paleontological chart" showing the geological relationships among plants and animals in the book Elementary Geology, by Edward Hitchcock (first edition: 1840).

Charles Darwin featured a diagrammatic evolutionary "tree" in his 1859 book On the Origin of Species. Over a century later, evolutionary biologists still use tree diagrams to depict evolution because such diagrams effectively convey the concept that speciation occurs through the adaptive and semirandom splitting of lineages.

The term phylogenetic, or phylogeny, derives from the two Ancient Greek words φῦλον (phûlon), meaning "race, lineage", and γένεσις (génesis), meaning "origin, source".[4][5]

Properties

Rooted tree

Rooted phylogenetic tree optimized for blind people. The lowest point of the tree is the root, which symbolizes the universal common ancestor to all living beings. The tree branches out into three main groups: Bacteria (left branch, letters a to i), Archaea (middle branch, letters j to p) and Eukaryota (right branch, letters q to z). Each letter corresponds to a group of organisms, listed below this description. These letters and the description should be converted to Braille font, and printed using a Braille printer. The figure can be 3D printed by copying the png file and using Cura or other software to generate the Gcode for 3D printing.

A rooted phylogenetic tree (see two graphics at top) is a directed tree with a unique node, the root, corresponding to the (usually imputed) most recent common ancestor of all the entities at the leaves of the tree. The root node does not have a parent node, but serves as the ancestor of all other nodes in the tree. The root is therefore a node of degree 2, while other internal nodes have a minimum degree of 3 (where "degree" here refers to the total number of incoming and outgoing edges).[citation needed]

The most common method for rooting trees is the use of an uncontroversial outgroup: close enough to allow inference from trait data or molecular sequencing, but far enough to be a clear outgroup. Another method is midpoint rooting, or a tree can also be rooted by using a non-stationary substitution model.[6]

Unrooted tree

An unrooted phylogenetic tree for myosin, a superfamily of proteins[7]

Unrooted trees illustrate the relatedness of the leaf nodes without making assumptions about ancestry. They do not require the ancestral root to be known or inferred.[8] Unrooted trees can always be generated from rooted ones by simply omitting the root. By contrast, inferring the root of an unrooted tree requires some means of identifying ancestry. This is normally done by including an outgroup in the input data so that the root is necessarily between the outgroup and the rest of the taxa in the tree, or by introducing additional assumptions about the relative rates of evolution on each branch, such as an application of the molecular clock hypothesis.[9]
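The step of omitting the root can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple encoding of a rooted tree, the function name rooted_to_unrooted and the generated internal node names (n0, n1, ...) are hypothetical choices made for this example, not part of any standard library.

```python
from itertools import count


def rooted_to_unrooted(tree):
    """Return an undirected adjacency map for a rooted tree.

    Assumed encoding: a leaf is a string, an internal node is a tuple of
    children. The degree-2 root is dropped by joining its two children
    with a single edge.
    """
    ids = count()
    adjacency = {}

    def add_edge(u, v):
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    def build(node):
        if isinstance(node, str):
            return node
        name = f"n{next(ids)}"  # hypothetical label for an internal node
        for child in node:
            add_edge(name, build(child))
        return name

    if isinstance(tree, str) or len(tree) != 2:
        raise ValueError("expected a root with exactly two children")
    left, right = build(tree[0]), build(tree[1])
    add_edge(left, right)  # the root itself never appears in the result
    return adjacency


unrooted = rooted_to_unrooted((("A", "B"), ("C", "D")))
```

In the resulting map every internal node has three neighbors and every leaf has one, and nothing records where the root used to be. That is why going in the opposite direction needs extra information such as an outgroup.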

Bifurcating versus multifurcating

Both rooted and unrooted trees can be either bifurcating or multifurcating. A rooted bifurcating tree has exactly two descendants arising from each interior node (that is, it forms a binary tree), and an unrooted bifurcating tree takes the form of an unrooted binary tree, a free tree with exactly three neighbors at each internal node. In contrast, a rooted multifurcating tree may have more than two children at some nodes and an unrooted multifurcating tree may have more than three neighbors at some nodes.[citation needed]
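The two definitions translate directly into checks on a tree data structure. The sketch below assumes two hypothetical encodings chosen for illustration: a rooted tree as nested tuples with string leaves, and an unrooted tree as a dictionary mapping each node to the set of its neighbors.

```python
def is_bifurcating_rooted(tree):
    """True if every internal node has exactly two children.

    Assumed encoding: a leaf is a string, an internal node is a tuple.
    """
    if isinstance(tree, str):
        return True
    return len(tree) == 2 and all(is_bifurcating_rooted(child) for child in tree)


def is_bifurcating_unrooted(adjacency):
    """True if every node is a leaf (1 neighbor) or has exactly 3 neighbors."""
    return all(len(neighbors) in (1, 3) for neighbors in adjacency.values())


binary_tree = (("A", "B"), ("C", "D"))
polytomy = ("A", "B", "C")  # one node with three children

quartet = {
    "x": {"A", "B", "y"}, "y": {"C", "D", "x"},
    "A": {"x"}, "B": {"x"}, "C": {"y"}, "D": {"y"},
}
star = {"x": {"A", "B", "C", "D"}, "A": {"x"}, "B": {"x"}, "C": {"x"}, "D": {"x"}}
```

Here binary_tree and quartet pass their respective checks, while polytomy and star are multifurcating.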

Labeled versus unlabeled

Both rooted and unrooted trees can be either labeled or unlabeled. A labeled tree has specific values assigned to its leaves, while an unlabeled tree, sometimes called a tree shape, defines a topology only. Some sequence-based trees built from a small genomic locus, such as Phylotree,[10] feature internal nodes labeled with inferred ancestral haplotypes.

Enumerating trees

Increase in the total number of phylogenetic trees as a function of the number of labeled leaves: unrooted binary trees (blue diamonds), rooted binary trees (red circles), and rooted multifurcating or binary trees (green triangles). The Y-axis scale is logarithmic.

The number of possible trees for a given number of leaf nodes depends on the specific type of tree, but there are always more labeled than unlabeled trees, more multifurcating than bifurcating trees, and more rooted than unrooted trees. The last distinction is the most biologically relevant; it arises because there are many places on an unrooted tree to put the root. For bifurcating labeled trees, the total number of rooted trees is:

$$(2n-3)!! = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$$

for $n \ge 2$, where $n$ represents the number of leaf nodes.[11]

For bifurcating labeled trees, the total number of unrooted trees is:[11]

$$(2n-5)!! = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$$

for $n \ge 3$.

Among labeled bifurcating trees, the number of unrooted trees with $n$ leaves is equal to the number of rooted trees with $n-1$ leaves.[2]

The number of rooted trees grows quickly as a function of the number of tips. For 10 tips, there are more than $34 \times 10^{6}$ possible bifurcating trees, and the number of multifurcating trees rises faster, with ca. 7 times as many of the latter as of the former.

Counting trees.[11]

Labeled leaves  Binary unrooted trees  Binary rooted trees  Multifurcating rooted trees  All possible rooted trees
1               1                      1                    0                            1
2               1                      1                    0                            1
3               1                      3                    1                            4
4               3                      15                   11                           26
5               15                     105                  131                          236
6               105                    945                  1,807                        2,752
7               945                    10,395               28,813                       39,208
8               10,395                 135,135              524,897                      660,032
9               135,135                2,027,025            10,791,887                   12,818,912
10              2,027,025              34,459,425           247,678,399                  282,137,824
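The counts in the table can be reproduced with a short Python sketch. The function names are hypothetical. The binary counts use the products of odd integers $(2n-3)!!$ and $(2n-5)!!$; the last column uses a recurrence over the number of internal nodes that is commonly attributed to Felsenstein (1978),[11] which is stated here as an assumption rather than quoted from the source.

```python
from math import prod


def odd_double_factorial(k):
    """Product 1 * 3 * 5 * ... * k for odd k; equals 1 when k < 1."""
    return prod(range(1, k + 1, 2))


def rooted_binary(n):
    """Rooted bifurcating labeled trees on n leaves: (2n-3)!!."""
    return odd_double_factorial(2 * n - 3)


def unrooted_binary(n):
    """Unrooted bifurcating labeled trees on n leaves: (2n-5)!!."""
    return odd_double_factorial(2 * n - 5)


def all_rooted(n):
    """Rooted labeled trees on n leaves, bifurcating or multifurcating.

    row[m] holds the number of trees with m internal nodes. Adding a leaf
    either attaches it to one of the m existing internal nodes, or creates
    a new internal node on one of the (leaves + m - 2) possible positions.
    """
    if n == 1:
        return 1
    row = {1: 1}  # two leaves: a single tree with one internal node
    for leaves in range(3, n + 1):
        row = {
            m: m * row.get(m, 0) + (leaves + m - 2) * row.get(m - 1, 0)
            for m in range(1, leaves)
        }
    return sum(row.values())


def multifurcating_rooted(n):
    """Rooted trees with at least one node that has more than two children."""
    return all_rooted(n) - rooted_binary(n)
```

For example, rooted_binary(10) gives 34,459,425 and multifurcating_rooted(10) gives 247,678,399, roughly 7 times as many, which matches the last row of the table.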

Special tree types

Dendrogram of the phylogeny of some dog breeds

Dendrogram

A dendrogram is a general name for a tree, whether phylogenetic or not, and hence also for the diagrammatic representation of a phylogenetic tree.[12]

Cladogram

A cladogram only represents a branching pattern; i.e., its branch lengths do not represent time or relative amount of character change, and its internal nodes do not represent ancestors.[13]

A chronogram of Lepidoptera.[14] In this phylogenetic tree type, branch lengths are proportional to geological time.

Phylogram

A phylogram is a phylogenetic tree that has branch lengths proportional to the amount of character change.[15]

A chronogram is a phylogenetic tree that explicitly represents time through its branch lengths.[16]

Dahlgrenogram

A Dahlgrenogram is a diagram representing a cross section of a phylogenetic tree.[citation needed]

Phylogenetic network

A phylogenetic network is not strictly speaking a tree, but rather a more general graph, or a directed acyclic graph in the case of rooted networks. Phylogenetic networks are used to overcome some of the limitations inherent to trees.

Spindle diagram

A spindle diagram, showing the evolution of the vertebrates at class level, width of spindles indicating number of families. Spindle diagrams are often used in evolutionary taxonomy.

A spindle diagram, or bubble diagram, is often called a romerogram, after its popularisation by the American palaeontologist Alfred Romer.[17] It represents taxonomic diversity (horizontal width) against geological time (vertical axis) in order to reflect the variation of abundance of various taxa through time. A spindle diagram is not an evolutionary tree:[18] the taxonomic spindles obscure the actual relationships of the parent taxon to the daughter taxon[17] and have the disadvantage of involving the paraphyly of the parental group.[19] This type of diagram is no longer used in the form originally proposed.[19]

Coral of life

The Coral of Life

Darwin[20] also mentioned that the coral may be a more suitable metaphor than the tree. Indeed, phylogenetic corals are useful for portraying past and present life, and they have some advantages over trees (anastomoses allowed, etc.).[19]

Construction

Phylogenetic trees composed with a nontrivial number of input sequences are constructed using computational phylogenetics methods. Distance-matrix methods such as neighbor-joining or UPGMA, which calculate genetic distance from multiple sequence alignments, are simplest to implement, but do not invoke an evolutionary model. Many sequence alignment methods such as ClustalW also create trees by using the simpler algorithms (i.e. those based on distance) of tree construction. Maximum parsimony is another simple method of estimating phylogenetic trees, but implies an implicit model of evolution (i.e. parsimony). More advanced methods use the optimality criterion of maximum likelihood, often within a Bayesian framework, and apply an explicit model of evolution to phylogenetic tree estimation.[2] Identifying the optimal tree using many of these techniques is NP-hard,[2] so heuristic search and optimization methods are used in combination with tree-scoring functions to identify a reasonably good tree that fits the data.
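To show why distance-matrix methods are considered simple to implement, here is a minimal Python sketch of UPGMA clustering. It assumes the pairwise distances have already been computed from an alignment; the function name upgma, the input layout (a dictionary keyed by frozenset pairs) and the example distances are hypothetical, and branch lengths are left out to keep the sketch short.

```python
from itertools import combinations


def upgma(names, distances):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    names:     list of taxon labels
    distances: dict mapping frozenset({a, b}) to the distance between a and b
    """
    sizes = {name: 1 for name in names}  # current cluster -> number of leaves
    d = dict(distances)
    while len(sizes) > 1:
        # Join the two clusters that are currently closest.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for other in sizes:
            # Size-weighted average, i.e. the mean over all leaf pairs.
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Made-up distances for four taxa, for illustration only.
example = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree = upgma(["A", "B", "C", "D"], example)
```

With these distances A and B are joined first, then C, then D. The procedure is greedy and uses no explicit model of sequence evolution, in contrast to the likelihood-based methods described above.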

Tree-building methods can be assessed on the basis of several criteria:[21]

  • efficiency (how long does it take to compute the answer, how much memory does it need?)
  • power (does it make good use of the data, or is information being wasted?)
  • consistency (will it converge on the same answer repeatedly, if each time given different data for the same model problem?)
  • robustness (does it cope well with violations of the assumptions of the underlying model?)
  • falsifiability (does it alert us when it is not good to use, i.e. when assumptions are violated?)

Tree-building techniques have also gained the attention of mathematicians. Trees can also be built using T-theory.[22]

File formats

Trees can be encoded in a number of different formats, all of which must represent the nested structure of a tree. They may or may not encode branch lengths and other features. Standardized formats are critical for distributing and sharing trees without relying on graphics output that is hard to import into existing software. Commonly used formats include Newick and Nexus.
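As an illustration of how a text format can carry nested structure and optional branch lengths, the following Python sketch parses a simplified form of the Newick notation, in which parentheses group the children of a node and a colon introduces a branch length. The function names and the (label, length, children) tuple layout are hypothetical, and quoted labels, comments and embedded whitespace are deliberately not handled.

```python
def parse_newick(text):
    """Parse a simple Newick string into (label, length, children) tuples."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string ends with ';'")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume '('
            children.append(node())
            while text[pos] == ",":
                pos += 1  # consume ','
                children.append(node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ')'
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        label = text[start:pos]
        length = None  # stays None when no branch length is encoded
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (label, length, children)

    root = node()
    if pos != len(text) - 1:
        raise ValueError(f"unexpected text at position {pos}")
    return root


def leaf_labels(node):
    """Collect leaf labels from left to right."""
    label, _, children = node
    if not children:
        return [label]
    return [name for child in children for name in leaf_labels(child)]


parsed = parse_newick("((A:0.1,B:0.2):0.3,C:0.4);")
```

The same parser accepts a string without branch lengths, such as "((A,B),C);", in which case every length is None.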

Limitations of phylogenetic analysis

Although phylogenetic trees produced on the basis of sequenced genes or genomic data in different species can provide evolutionary insight, these analyses have important limitations. Most importantly, the trees that they generate are not necessarily correct: they do not necessarily accurately represent the evolutionary history of the included taxa. As with any scientific result, they are subject to falsification by further study (e.g., gathering of additional data, analyzing the existing data with improved methods). The data on which they are based may be noisy;[23] the analysis can be confounded by genetic recombination,[24] horizontal gene transfer,[25] hybridisation between species that were not nearest neighbors on the tree before hybridisation takes place, convergent evolution, and conserved sequences.

Also, there are problems in basing an analysis on a single type of character, such as a single gene or protein or only on morphological analysis, because such trees constructed from another unrelated data source often differ from the first, and therefore great care is needed in inferring phylogenetic relationships among species. This is most true of genetic material that is subject to lateral gene transfer and recombination, where different haplotype blocks can have different histories. In these types of analysis, the output tree of a phylogenetic analysis of a single gene is an estimate of the gene's phylogeny (i.e. a gene tree) and not the phylogeny of the taxa (i.e. species tree) from which these characters were sampled, though ideally, both should be very close. For this reason, serious phylogenetic studies generally use a combination of genes that come from different genomic sources (e.g., from mitochondrial or plastid vs. nuclear genomes),[26] or genes that would be expected to evolve under different selective regimes, so that homoplasy (false homology) would be unlikely to result from natural selection.

When extinct species are included as terminal nodes in an analysis (rather than, for example, to constrain internal nodes), they are considered not to represent direct ancestors of any extant species. Extinct species do not typically contain high-quality DNA.

The range of useful DNA materials has expanded with advances in extraction and sequencing technologies. Development of technologies able to infer sequences from smaller fragments, or from spatial patterns of DNA degradation products, would further expand the range of DNA considered useful.

Phylogenetic trees can also be inferred from a range of other data types, including morphology, the presence or absence of particular types of genes, insertion and deletion events, and any other observation thought to contain an evolutionary signal.

Phylogenetic networks are used when bifurcating trees are not suitable, due to these complications which suggest a more reticulate evolutionary history of the organisms sampled.

References

  1. Khalafvand, Tyler (2015). Finding Structure in the Phylogeny Search Space. Dalhousie University.
  2. Felsenstein, J. (2004). Inferring Phylogenies. Sinauer Associates: Sunderland, MA.
  3. Kinene, T.; Wainaina, J.; Maina, S.; Boykin, L. (21 April 2016). "Rooting Trees, Methods for". Encyclopedia of Evolutionary Biology: 489–493. doi:10.1016/B978-0-12-800049-6.00215-8. ISBN 9780128004265. PMC 7149615.
  4. Bailly, Anatole (1981-01-01). Abrégé du dictionnaire grec français. Paris: Hachette. ISBN 978-2010035289. OCLC 461974285.
  5. Bailly, Anatole. "Greek-french dictionary online". tabularium.be. Archived from the original on April 21, 2014. Retrieved March 2, 2018.
  6. Dang, Cuong Cao; Minh, Bui Quang; McShea, Hanon; Masel, Joanna; James, Jennifer Eleanor; Vinh, Le Sy; Lanfear, Robert (9 February 2022). "nQMaker: Estimating Time Nonreversible Amino Acid Substitution Models". Systematic Biology. 71 (5): 1110–1123. doi:10.1093/sysbio/syac007. PMC 9366462. PMID 35139203.
  7. Hodge T, Cope M (1 October 2000). "A myosin family tree". J Cell Sci. 113 (19): 3353–4. doi:10.1242/jcs.113.19.3353. PMID 10984423. Archived from the original on 30 September 2007.
  8. ""Tree" Facts: Rooted versus Unrooted Trees". Archived from the original on 2014-04-14. Retrieved 2014-05-26.
  9. W. Ford Doolittle (2000). "Uprooting the Tree of Life". Scientific American. 282 (2): 90–95. Bibcode:2000SciAm.282b..90D. doi:10.1038/scientificamerican0200-90. PMID 10710791.
  10. van Oven, Mannis; Kayser, Manfred (2009). "Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation". Human Mutation. 30 (2): E386–E394. doi:10.1002/humu.20921. PMID 18853457. S2CID 27566749.
  11. Felsenstein, Joseph (1978-03-01). "The Number of Evolutionary Trees". Systematic Biology. 27 (1): 27–33. doi:10.2307/2412810. ISSN 1063-5157. JSTOR 2412810.
  12. Fox, Emily. "The dendrogram". coursea. Archived from the original on 28 September 2017. Retrieved 28 September 2017.
  13. Mayr, Ernst (1974). "Cladistic analysis or cladistic classification?". Journal of Zoological Systematics and Evolutionary Research. 12: 94–128. doi:10.1111/j.1439-0469.1974.tb00160.x.
  14. Labandeira, C. C.; Dilcher, D. L.; Davis, D. R.; Wagner, D. L. (1994-12-06). "Ninety-seven million years of angiosperm-insect association: paleobiological insights into the meaning of coevolution". Proceedings of the National Academy of Sciences. 91 (25): 12278–12282. Bibcode:1994PNAS...9112278L. doi:10.1073/pnas.91.25.12278. ISSN 0027-8424. PMC 45420. PMID 11607501.
  15. Soares, Antonio; Râbelo, Ricardo; Delbem, Alexandre (2017). "Optimization based on phylogram analysis". Expert Systems with Applications. 78: 32–50. doi:10.1016/j.eswa.2017.02.012. ISSN 0957-4174.
  16. Santamaria, R.; Theron, R. (2009-05-26). "Treevolution: visual analysis of phylogenetic trees". Bioinformatics. 25 (15): 1970–1971. doi:10.1093/bioinformatics/btp333. PMID 19470585.
  17. "Evolutionary systematics: Spindle Diagrams". Palaeos. 2014-11-10. Retrieved 2019-11-07.
  18. "Trees, Bubbles, and Hooves". A Three-Pound Monkey Brain - Biology, programming, linguistics, phylogeny, systematics... 2007-11-21. Retrieved 2019-11-07.
  19. Podani, János (2019-06-01). "The Coral of Life". Evolutionary Biology. 46 (2): 123–144. Bibcode:2019EvBio..46..123P. doi:10.1007/s11692-019-09474-w. hdl:10831/46308. ISSN 1934-2845.
  20. Darwin, Charles (1837). Notebook B. p. 25.
  21. Penny, D.; Hendy, M. D.; Steel, M. A. (1992). "Progress with methods for constructing evolutionary trees". Trends in Ecology and Evolution. 7 (3): 73–79. doi:10.1016/0169-5347(92)90244-6. PMID 21235960.
  22. Dress, A.; Huber, K. T.; Moulton, V. (2001). "Metric Spaces in Pure and Applied Mathematics". Documenta Mathematica. LSU 2001: 121–139.
  23. Townsend JP, Su Z, Tekle Y (2012). "Phylogenetic Signal and Noise: Predicting the Power of a Data Set to Resolve Phylogeny". Systematic Biology. 61 (5): 835–849. doi:10.1093/sysbio/sys036. PMID 22389443.
  24. Arenas M, Posada D (2010). "The effect of recombination on the reconstruction of ancestral sequences". Genetics. 184 (4): 1133–1139. doi:10.1534/genetics.109.113423. PMC 2865913. PMID 20124027.
  25. Woese C (2002). "On the evolution of cells". Proc Natl Acad Sci USA. 99 (13): 8742–7. Bibcode:2002PNAS...99.8742W. doi:10.1073/pnas.132266999. PMC 124369. PMID 12077305.
  26. Parhi, J.; Tripathy, P.S.; Priyadarshi, H.; Mandal, S.C.; Pandey, P.K. (2019). "Diagnosis of mitogenome for robust phylogeny: A case of Cypriniformes fish group". Gene. 713: 143967. doi:10.1016/j.gene.2019.143967. PMID 31279710. S2CID 195828782.

Further reading

  • Schuh, R. T. and A. V. Z. Brower. 2009. Biological Systematics: principles and applications (2nd edn.) ISBN 978-0-8014-4799-0
  • Manuel Lima, The Book of Trees: Visualizing Branches of Knowledge, 2014, Princeton Architectural Press, New York.
  • MEGA, free software for drawing phylogenetic trees.
  • Gontier, N. 2011. "Depicting the Tree of Life: the Philosophical and Historical Roots of Evolutionary Tree Diagrams." Evolution, Education, Outreach 4: 515–538.
  • Jan Sapp, The New Foundations of Evolution: On the Tree of Life, 2009, Oxford University Press, New York.
