Subsequence

In mathematics, a subsequence of a given sequence is a sequence that can be derived from the given sequence by deleting some or no elements without changing the order of the remaining elements. For example, the sequence $\langle A,B,D\rangle$ is a subsequence of $\langle A,B,C,D,E,F\rangle$ obtained after removal of elements $C,$ $E,$ and $F.$ The relation of one sequence being the subsequence of another is a preorder.
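The definition can be turned into a short test. The following is a minimal sketch in Python; the function name `is_subsequence` is an illustrative choice, not something taken from the article. It scans the given sequence once and checks that the elements of the candidate appear in the same order.

```python
def is_subsequence(candidate, sequence):
    """Return True if `candidate` can be obtained from `sequence` by deletions only."""
    matched = 0  # number of elements of `candidate` found so far, in order
    for element in sequence:
        if matched < len(candidate) and element == candidate[matched]:
            matched += 1
    return matched == len(candidate)


print(is_subsequence("ABD", "ABCDEF"))  # True: delete C, E and F
print(is_subsequence("DBA", "ABCDEF"))  # False: the order would have to change
```

Deleting "no elements" is allowed, so every sequence is a subsequence of itself, and the empty sequence is a subsequence of every sequence.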

Subsequences can contain consecutive elements which were not consecutive in the original sequence. A subsequence which consists of a consecutive run of elements from the original sequence, such as $\langle B,C,D\rangle$ from $\langle A,B,C,D,E,F\rangle,$ is a substring. The substring is a refinement of the subsequence.
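The difference between the two notions can be made concrete with a small sketch. The helper names `is_subsequence` and `is_substring` are assumptions made for this illustration only.

```python
def is_subsequence(candidate, sequence):
    """True if `candidate` appears in `sequence` in order, gaps allowed."""
    remaining = iter(sequence)
    # `x in remaining` consumes the iterator up to the first match,
    # so later elements can only be found further to the right.
    return all(x in remaining for x in candidate)


def is_substring(candidate, sequence):
    """True if `candidate` appears in `sequence` as one consecutive run."""
    m = len(candidate)
    return any(
        tuple(sequence[i:i + m]) == tuple(candidate)
        for i in range(len(sequence) - m + 1)
    )


original = "ABCDEF"
for candidate in ("BCD", "ABD"):
    print(candidate, is_subsequence(candidate, original), is_substring(candidate, original))
# BCD True True
# ABD True False
```

Every substring passes both tests, while a subsequence such as "ABD" passes only the first.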

The list of all distinct subsequences of the word "apple" would be "a", "ap", "al", "ae", "app", "apl", "ape", "ale", "appl", "appe", "aple", "apple", "p", "pp", "pl", "pe", "ppl", "ppe", "ple", "pple", "l", "le", "e", and "" (the empty string).
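Such a list can be generated by brute force for short words. The sketch below (the function name `distinct_subsequences` is an assumption for illustration) keeps every possible choice of positions and collects the resulting strings in a set, so that duplicates caused by the repeated letter "p" are counted once.

```python
from itertools import combinations


def distinct_subsequences(word):
    """Return the set of distinct subsequences of `word`, including the empty string."""
    found = set()
    for k in range(len(word) + 1):
        # combinations() yields positions in increasing order, so order is preserved.
        for positions in combinations(range(len(word)), k):
            found.add("".join(word[i] for i in positions))
    return found


subs = distinct_subsequences("apple")
print(len(subs))  # 24
print(sorted(subs, key=lambda s: (len(s), s)))
```

There are 2^5 = 32 choices of positions in a five-letter word, but only 24 distinct results here because the two occurrences of "p" are interchangeable. This enumeration grows exponentially and is only practical for short inputs.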

Common subsequence

Given two sequences $X$ and $Y,$ a sequence $Z$ is said to be a common subsequence of $X$ and $Y$ if $Z$ is a subsequence of both $X$ and $Y.$ For example, if $X=\langle A,C,B,D,E,G,C,E,D,B,G\rangle,$ $Y=\langle B,E,G,J,C,F,E,K,B\rangle,$ and $Z=\langle B,E,E\rangle,$ then $Z$ is a common subsequence of $X$ and $Y.$

This would not be the longest common subsequence, since $Z$ only has length 3, and the common subsequence $\langle B,E,E,B\rangle$ has length 4. The longest common subsequence of $X$ and $Y$ is $\langle B,E,G,C,E,B\rangle,$ which has length 6.
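A longest common subsequence can be computed with the standard dynamic-programming approach. The article does not prescribe an algorithm, so the following is only a sketch of one common technique; the function name `longest_common_subsequence` is an assumption, and when several longest common subsequences exist it returns just one of them.

```python
def longest_common_subsequence(x, y):
    """Return one longest common subsequence of x and y as a list."""
    n, m = len(x), len(y)
    # table[i][j] holds the LCS length of the suffixes x[i:] and y[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if x[i] == y[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table to recover one optimal subsequence.
    result, i, j = [], 0, 0
    while i < n and j < m:
        if x[i] == y[j]:
            result.append(x[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


X = "ACBDEGCEDBG"
Y = "BEGJCFEKB"
lcs = longest_common_subsequence(X, Y)
print("".join(lcs), len(lcs))  # BEGCEB 6
```

The table has (|X|+1)(|Y|+1) entries, so time and memory both grow with the product of the two lengths.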

Applications

Subsequences have applications to computer science,[1] especially in the discipline of bioinformatics, where computers are used to compare, analyze, and store DNA, RNA, and protein sequences.

Take two sequences of DNA containing 37 elements, say:

SEQ₁ = ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA

SEQ₂ = CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA

The longest common subsequence of sequences 1 and 2 is:

LCS(SEQ₁, SEQ₂) = CGTTCGGCTATGCTTCTACTTATTCTA

This can be illustrated by marking the 27 elements of the longest common subsequence within the initial sequences (here, matched elements are shown in uppercase and the remaining elements in lowercase):

SEQ₁ = aCgGTgTCGtGCTATGCTgaTgCTgACTTATaTgCTA

SEQ₂ = CGTTCGGCTATcGtaCgTTCTAttCTaTgATTtCTAa

Another way to show this is to align the two sequences, that is, to place matching elements of the longest common subsequence in the same column (indicated by a vertical bar) and to insert a special character (here, a dash) as padding wherever one sequence has no element to pair with the other:

SEQ₁ = ACGGTGTCGTGCTAT-G--C-TGATGCTGA--CT-T-ATATG-CTA-
        | || ||| ||||| |  | |  | || |  || | || |  |||
SEQ₂ = -C-GT-TCG-GCTATCGTACGT--T-CT-ATTCTATGAT-T-TCTAA
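The alignment can be checked mechanically: reading off the columns in which both rows hold the same element should reproduce the common subsequence, and stripping the dashes should give back the original sequences. The sketch below does this for the two padded rows shown above; the helper name `matched_columns` is an assumption made for this illustration.

```python
def matched_columns(row1, row2, gap="-"):
    """Return the elements in columns where both rows hold the same non-gap element."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    return "".join(a for a, b in zip(row1, row2) if a == b and a != gap)


row1 = "ACGGTGTCGTGCTAT-G--C-TGATGCTGA--CT-T-ATATG-CTA-"
row2 = "-C-GT-TCG-GCTATCGTACGT--T-CT-ATTCTATGAT-T-TCTAA"

common = matched_columns(row1, row2)
print(common)       # CGTTCGGCTATGCTTCTACTTATTCTA
print(len(common))  # 27

# Removing the padding recovers the two original 37-element sequences.
seq1 = row1.replace("-", "")
seq2 = row2.replace("-", "")
print(len(seq1), len(seq2))  # 37 37
```

This check confirms only that the alignment is consistent with the stated common subsequence; it does not by itself prove that no longer common subsequence exists.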

Subsequences are used to determine how similar two strands of DNA are by comparing their sequences of DNA bases: adenine, guanine, cytosine, and thymine.

Theorems

Every infinite sequence of real numbers has an infinite monotone subsequence (this is a lemma used in the proof of the Bolzano–Weierstrass theorem).
Every infinite bounded sequence in $\mathbb {R} ^{n}$ has a convergent subsequence (this is the Bolzano–Weierstrass theorem).
For all integers $r$ and $s,$ every finite sequence of length at least $(r-1)(s-1)+1$ contains a monotonically increasing subsequence of length $r$ or a monotonically decreasing subsequence of length $s$ (this is the Erdős–Szekeres theorem).
A metric space $(X,d)$ is compact if every sequence in $X$ has a convergent subsequence whose limit is in $X.$
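The Erdős–Szekeres statement can be checked exhaustively for small parameters. The sketch below assumes sequences of distinct values and strict monotonicity, and the function name `longest_monotone` is an illustrative choice. It tests every ordering of $(r-1)(s-1)+1 = 5$ distinct numbers for $r = s = 3.$

```python
from itertools import permutations


def longest_monotone(seq, increasing=True):
    """Length of the longest strictly increasing (or decreasing) subsequence, O(n^2)."""
    best = []  # best[i] = length of the longest such subsequence ending at seq[i]
    for i, value in enumerate(seq):
        longest_before = 0
        for j in range(i):
            extends = seq[j] < value if increasing else seq[j] > value
            if extends:
                longest_before = max(longest_before, best[j])
        best.append(longest_before + 1)
    return max(best, default=0)


r, s = 3, 3
n = (r - 1) * (s - 1) + 1  # 5 distinct elements
holds = all(
    longest_monotone(p, increasing=True) >= r
    or longest_monotone(p, increasing=False) >= s
    for p in permutations(range(n))
)
print(holds)  # True
```

A finite check of this kind illustrates the theorem for one choice of $r$ and $s$ but is not a proof.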

Notes

[1] In computer science, string is often used as a synonym for sequence, but it is important to note that substring and subsequence are not synonyms. Substrings are consecutive parts of a string, while subsequences need not be. This means that a substring of a string is always a subsequence of the string, but a subsequence of a string is not always a substring of the string. See: Gusfield, Dan (1999) [1997]. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. USA: Cambridge University Press. p. 4. ISBN 0-521-58519-8.

This article incorporates material from subsequence on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.
