Subsequence
This article has multiple issues.Please helpimprove itor discuss these issues on thetalk page.(Learn how and when to remove these messages)
|
Inmathematics,asubsequenceof a givensequenceis a sequence that can be derived from the given sequence by deleting some or no elements without changing the order of the remaining elements. For example, the sequenceis a subsequence ofobtained after removal of elementsandThe relation of one sequence being the subsequence of another is apreorder.
Subsequences can contain consecutive elements which were not consecutive in the original sequence. A subsequence which consists of a consecutive run of elements from the original sequence, such asfromis asubstring.The substring is a refinement of the subsequence.
The list of all subsequences for the word "apple"would be"a","ap","al","ae","app","apl","ape","ale","appl","appe","aple","apple","p","pp","pl","pe","ppl","ppe","ple","pple","l","le","e",""(empty string).
Common subsequence
[edit]Given two sequencesanda sequenceis said to be acommon subsequenceofandifis a subsequence of bothandFor example, if thenis said to be a common subsequence ofand
This wouldnotbe thelongest common subsequence,sinceonly has length 3, and the common subsequencehas length 4. The longest common subsequence ofandis
Applications
[edit]Subsequences have applications tocomputer science,[1]especially in the discipline ofbioinformatics,where computers are used to compare, analyze, and storeDNA,RNA,andproteinsequences.
Take two sequences of DNA containing 37 elements, say:
- SEQ1= ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA
- SEQ2= CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA
The longest common subsequence of sequences 1 and 2 is:
- LCS(SEQ1,SEQ2)=CGTTCGGCTATGCTTCTACTTATTCTA
This can be illustrated by highlighting the 27 elements of the longest common subsequence into the initial sequences:
- SEQ1= ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA
- SEQ2=CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA
Another way to show this is toalignthe two sequences, that is, to position elements of the longest common subsequence in a same column (indicated by the vertical bar) and to introduce a special character (here, a dash) for padding of arisen empty subsequences:
- SEQ1= ACGGTGTCGTGCTAT-G--C-TGATGCTGA--CT-T-ATATG-CTA-
- | || ||| ||||| | | | | || | || | || | |||
- SEQ2= -C-GT-TCG-GCTATCGTACGT--T-CT-ATTCTATGAT-T-TCTAA
Subsequences are used to determine how similar the two strands of DNA are, using the DNA bases:adenine,guanine,cytosineandthymine.
Theorems
[edit]- Every infinite sequence ofreal numbershas an infinitemonotonesubsequence (This is a lemma used in theproof of the Bolzano–Weierstrass theorem).
- Every infinitebounded sequenceinhas aconvergentsubsequence (This is theBolzano–Weierstrass theorem).
- For allintegersandevery finite sequence of length at leastcontains a monotonically increasing subsequence of lengthora monotonically decreasing subsequence of length(This is theErdős–Szekeres theorem).
- A metric spaceis compact if every sequence inhas a convergent subsequence whose limit is in.
See also
[edit]- Subsequential limit– The limit of some subsequence
- Limit superior and limit inferior– Bounds of a sequence
- Longest increasing subsequence problem– Computer science problem
Notes
[edit]- ^In computer science,stringis often used as a synonym forsequence,but it is important to note thatsubstringandsubsequenceare not synonyms. Substrings areconsecutiveparts of a string, while subsequences need not be. This means that a substring of a string is always a subsequence of the string, but a subsequence of a string is not always a substring of the string, see:Gusfield, Dan (1999) [1997].Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology.USA: Cambridge University Press. p. 4.ISBN0-521-58519-8.
This article incorporates material from subsequence onPlanetMath,which is licensed under theCreative Commons Attribution/Share-Alike License.