Rank (linear algebra)

In linear algebra, the rank of a matrix $A$ is the dimension of the vector space generated (or spanned) by its columns.^[1]^[2]^[3] This corresponds to the maximal number of linearly independent columns of $A$. This, in turn, is identical to the dimension of the vector space spanned by its rows.^[4] Rank is thus a measure of the "nondegenerateness" of the system of linear equations and linear transformation encoded by $A$. There are multiple equivalent definitions of rank. A matrix's rank is one of its most fundamental characteristics.

The rank is commonly denoted by $\operatorname{rank}(A)$ or $\operatorname{rk}(A)$;^[2] sometimes the parentheses are not written, as in $\operatorname{rank} A$.^[i]

Main definitions

In this section, we give some definitions of the rank of a matrix. Many definitions are possible; see Alternative definitions for several of these.

The column rank of $A$ is the dimension of the column space of $A$, while the row rank of $A$ is the dimension of the row space of $A$.

A fundamental result in linear algebra is that the column rank and the row rank are always equal. (Three proofs of this result are given in § Proofs that column rank = row rank, below.) This number (i.e., the number of linearly independent rows or columns) is simply called the rank of $A$.

A matrix is said to have full rank if its rank equals the largest possible for a matrix of the same dimensions, which is the lesser of the number of rows and columns. A matrix is said to be rank-deficient if it does not have full rank. The rank deficiency of a matrix is the difference between the lesser of the number of rows and columns, and the rank.

The rank of a linear map or operator $\Phi$ is defined as the dimension of its image:^[5]^[6]^[7]^[8] $\operatorname {rank} (\Phi ):=\dim(\operatorname {img} (\Phi ))$ where $\dim$ is the dimension of a vector space, and $\operatorname {img}$ is the image of a map.

Examples

The matrix ${\begin{bmatrix}1&0&1\\0&1&1\\0&1&1\end{bmatrix}}$ has rank 2: the first two columns are linearly independent, so the rank is at least 2, but since the third is a linear combination of the first two (the first column plus the second), the three columns are linearly dependent so the rank must be less than 3.

The matrix $A={\begin{bmatrix}1&1&0&2\\-1&-1&0&-2\end{bmatrix}}$ has rank 1: there are nonzero columns, so the rank is positive, but any pair of columns is linearly dependent. Similarly, the transpose $A^{\mathrm {T} }={\begin{bmatrix}1&-1\\1&-1\\0&0\\2&-2\end{bmatrix}}$ of $A$ has rank 1. Indeed, since the column vectors of $A$ are the row vectors of the transpose of $A$, the statement that the column rank of a matrix equals its row rank is equivalent to the statement that the rank of a matrix is equal to the rank of its transpose, i.e., $\operatorname{rank}(A) = \operatorname{rank}(A^{\mathrm{T}})$.

Computing the rank of a matrix

Rank from row echelon forms

A common approach to finding the rank of a matrix is to reduce it to a simpler form, generally row echelon form, by elementary row operations. Row operations do not change the row space (hence do not change the row rank), and, being invertible, map the column space to an isomorphic space (hence do not change the column rank). Once in row echelon form, the rank is clearly the same for both row rank and column rank, and equals the number of pivots (or basic columns) and also the number of non-zero rows.

For example, the matrix $A$ given by $A={\begin{bmatrix}1&2&1\\-2&-3&1\\3&5&0\end{bmatrix}}$ can be put in reduced row-echelon form by using the following elementary row operations: ${\begin{aligned}{\begin{bmatrix}1&2&1\\-2&-3&1\\3&5&0\end{bmatrix}}&\xrightarrow {2R_{1}+R_{2}\to R_{2}} {\begin{bmatrix}1&2&1\\0&1&3\\3&5&0\end{bmatrix}}\xrightarrow {-3R_{1}+R_{3}\to R_{3}} {\begin{bmatrix}1&2&1\\0&1&3\\0&-1&-3\end{bmatrix}}\\&\xrightarrow {R_{2}+R_{3}\to R_{3}} \,\,{\begin{bmatrix}1&2&1\\0&1&3\\0&0&0\end{bmatrix}}\xrightarrow {-2R_{2}+R_{1}\to R_{1}} {\begin{bmatrix}1&0&-5\\0&1&3\\0&0&0\end{bmatrix}}~.\end{aligned}}$ The final matrix (in reduced row echelon form) has two non-zero rows and thus the rank of matrix $A$ is 2.
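The reduction above can be reproduced mechanically. The following is a minimal sketch, assuming exact rational arithmetic through Python's `fractions.Fraction`; the function name `rref` is an illustrative choice, not a standard library routine. It returns the reduced row echelon form together with the pivot columns, so the rank is the number of pivots.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, pivot column indices), computed exactly."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivots = []
    r = 0  # index of the row that will receive the next pivot
    for c in range(n_cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot_row = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # no pivot in this column
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]  # scale the pivot to 1
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return rows, pivots

A = [[1, 2, 1], [-2, -3, 1], [3, 5, 0]]
reduced, pivots = rref(A)
rank_of_A = len(pivots)  # number of pivots = number of non-zero rows
```

For the matrix $A$ of the example, `reduced` agrees with the final matrix of the displayed reduction and `rank_of_A` is 2.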

Computation

When applied to floating point computations on computers, basic Gaussian elimination (LU decomposition) can be unreliable, and a rank-revealing decomposition should be used instead. An effective alternative is the singular value decomposition (SVD), but there are other less computationally expensive choices, such as QR decomposition with pivoting (so-called rank-revealing QR factorization), which are still more numerically robust than Gaussian elimination. Numerical determination of rank requires a criterion for deciding when a value, such as a singular value from the SVD, should be treated as zero, a practical choice which depends on both the matrix and the application.
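The role of the zero threshold can be illustrated with a small sketch. It deliberately uses plain elimination with partial pivoting rather than an SVD or a rank-revealing QR factorization, only so that it runs with the standard library; the function name `numerical_rank` and the default tolerance `1e-9` are assumptions made for illustration, not recommended values.

```python
def numerical_rank(matrix, tol=1e-9):
    """Estimate the rank in floating point, treating |value| <= tol as zero.

    Illustrative only: elimination with partial pivoting is not a
    rank-revealing decomposition, and tol is an arbitrary choice.
    """
    rows = [[float(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for c in range(n_cols):
        if rank == n_rows:
            break
        # Partial pivoting: take the largest remaining entry in column c.
        pivot_row = max(range(rank, n_rows), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot_row][c]) <= tol:
            continue  # column is numerically zero below the current row
        rows[rank], rows[pivot_row] = rows[pivot_row], rows[rank]
        for i in range(rank + 1, n_rows):
            factor = rows[i][c] / rows[rank][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

nearly_singular = [[1.0, 0.0], [0.0, 1e-12]]
rank_strict = numerical_rank(nearly_singular, tol=0.0)    # only exact zeros count
rank_loose = numerical_rank(nearly_singular, tol=1e-9)    # tiny entries count as zero
```

Here `rank_strict` is 2 and `rank_loose` is 1: the same matrix is reported as full rank or rank-deficient depending on the threshold, which is why the choice has to be made with the matrix and the application in mind.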

Proofs that column rank = row rank

Proof using row reduction

The fact that the column and row ranks of any matrix are equal is fundamental in linear algebra. Many proofs have been given. One of the most elementary ones has been sketched in § Rank from row echelon forms. Here is a variant of this proof:

It is straightforward to show that neither the row rank nor the column rank is changed by an elementary row operation. As Gaussian elimination proceeds by elementary row operations, the reduced row echelon form of a matrix has the same row rank and the same column rank as the original matrix. Further elementary column operations allow putting the matrix in the form of an identity matrix possibly bordered by rows and columns of zeros. Again, this changes neither the row rank nor the column rank. It is immediate that both the row and column ranks of this resulting matrix are equal to the number of its nonzero entries.

We present two other proofs of this result. The first uses only basic properties of linear combinations of vectors, and is valid over any field. The proof is based upon Wardlaw (2005).^[9] The second uses orthogonality and is valid for matrices over the real numbers; it is based upon Mackiw (1995).^[4] Both proofs can be found in the book by Banerjee and Roy (2014).^[10]

Proof using linear combinations

Let $A$ be an $m \times n$ matrix. Let the column rank of $A$ be $r$, and let $\mathbf{c}_1, \ldots, \mathbf{c}_r$ be any basis for the column space of $A$. Place these as the columns of an $m \times r$ matrix $C$. Every column of $A$ can be expressed as a linear combination of the $r$ columns in $C$. This means that there is an $r \times n$ matrix $R$ such that $A = CR$. $R$ is the matrix whose $i$th column is formed from the coefficients giving the $i$th column of $A$ as a linear combination of the $r$ columns of $C$. In other words, $R$ holds the coordinates of the columns of $A$ with respect to the chosen basis of the column space (the columns of $C$). Now, each row of $A$ is given by a linear combination of the $r$ rows of $R$, the coefficients being the entries of the corresponding row of $C$. Therefore, the rows of $R$ form a spanning set of the row space of $A$ and, by the Steinitz exchange lemma, the row rank of $A$ cannot exceed $r$. This proves that the row rank of $A$ is less than or equal to the column rank of $A$. This result can be applied to any matrix, so apply the result to the transpose of $A$. Since the row rank of the transpose of $A$ is the column rank of $A$ and the column rank of the transpose of $A$ is the row rank of $A$, this establishes the reverse inequality and we obtain the equality of the row rank and the column rank of $A$. (Also see Rank factorization.)
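The factorization $A = CR$ used in this proof can be made concrete. The sketch below makes one particular choice, which the proof does not require: $C$ is built from the pivot columns of $A$ and $R$ from the non-zero rows of its reduced row echelon form. The helper names `rref`, `rank_factorization` and `matmul` are illustrative, arithmetic is exact via `fractions.Fraction`, and a nonzero input matrix is assumed.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, pivot column indices), computed exactly."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots

def rank_factorization(matrix):
    """Return (C, R) with A = C R, C of size m x r and R of size r x n (A nonzero)."""
    reduced, pivots = rref(matrix)
    C = [[Fraction(row[j]) for j in pivots] for row in matrix]  # pivot columns of A
    R = reduced[:len(pivots)]                                   # non-zero rows of the RREF
    return C, R

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 1], [-2, -3, 1], [3, 5, 0]]
C, R = rank_factorization(A)
product = matmul(C, R)  # reproduces A entry by entry
```

With this choice, column $i$ of `R` lists the coefficients that express column $i$ of `A` in terms of the columns of `C`, exactly the role $R$ plays in the argument above.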

Proof using orthogonality

Let $A$ be an $m \times n$ matrix with entries in the real numbers whose row rank is $r$. Therefore, the dimension of the row space of $A$ is $r$. Let $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_r$ be a basis of the row space of $A$. We claim that the vectors $A\mathbf{x}_1, A\mathbf{x}_2, \dots, A\mathbf{x}_r$ are linearly independent. To see why, consider a linear homogeneous relation involving these vectors with scalar coefficients $c_1, c_2, \dots, c_r$: $0=c_{1}A\mathbf {x} _{1}+c_{2}A\mathbf {x} _{2}+\cdots +c_{r}A\mathbf {x} _{r}=A(c_{1}\mathbf {x} _{1}+c_{2}\mathbf {x} _{2}+\cdots +c_{r}\mathbf {x} _{r})=A\mathbf {v},$ where $\mathbf{v} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \dots + c_r\mathbf{x}_r$. We make two observations: (a) $\mathbf{v}$ is a linear combination of vectors in the row space of $A$, which implies that $\mathbf{v}$ belongs to the row space of $A$, and (b) since $A\mathbf{v} = 0$, the vector $\mathbf{v}$ is orthogonal to every row vector of $A$ and, hence, is orthogonal to every vector in the row space of $A$. The facts (a) and (b) together imply that $\mathbf{v}$ is orthogonal to itself, which proves that $\mathbf{v} = 0$ or, by the definition of $\mathbf{v}$, $c_{1}\mathbf {x} _{1}+c_{2}\mathbf {x} _{2}+\cdots +c_{r}\mathbf {x} _{r}=0.$ But recall that the $\mathbf{x}_i$ were chosen as a basis of the row space of $A$ and so are linearly independent. This implies that $c_1 = c_2 = \dots = c_r = 0$. It follows that $A\mathbf{x}_1, A\mathbf{x}_2, \dots, A\mathbf{x}_r$ are linearly independent.

Now, each $A\mathbf{x}_i$ is a vector in the column space of $A$. So, $A\mathbf{x}_1, A\mathbf{x}_2, \dots, A\mathbf{x}_r$ is a set of $r$ linearly independent vectors in the column space of $A$ and, hence, the dimension of the column space of $A$ (i.e., the column rank of $A$) must be at least as big as $r$. This proves that the row rank of $A$ is no larger than the column rank of $A$. Now apply this result to the transpose of $A$ to get the reverse inequality and conclude as in the previous proof.

Alternative definitions

In all the definitions in this section, the matrix $A$ is taken to be an $m \times n$ matrix over an arbitrary field $F$.

Dimension of image

Given the matrix $A$, there is an associated linear mapping $f:F^{n}\to F^{m}$ defined by $f(x)=Ax.$ The rank of $A$ is the dimension of the image of $f$. This definition has the advantage that it can be applied to any linear map without need for a specific matrix.

Rank in terms of nullity

Given the same linear mapping $f$ as above, the rank is $n$ minus the dimension of the kernel of $f$. The rank–nullity theorem states that this definition is equivalent to the preceding one.
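The relation between rank and kernel dimension can be checked on a small case. The sketch below is one possible way to do so, assuming exact arithmetic with `fractions.Fraction`: it reads a basis of the kernel off the reduced row echelon form (one basis vector per non-pivot column). The names `rref` and `kernel_basis` are illustrative.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, pivot column indices), computed exactly."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots

def kernel_basis(matrix):
    """Return a basis of {x : A x = 0}, one vector per free (non-pivot) column."""
    reduced, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for k, p in enumerate(pivots):
            v[p] = -reduced[k][free]  # solve the pivot variable from row k
        basis.append(v)
    return basis

A = [[1, 1, 0, 2], [-1, -1, 0, -2]]   # the rank-1 example, with n = 4 columns
rank_A = len(rref(A)[1])
nullity_A = len(kernel_basis(A))
```

For this matrix `rank_A` is 1 and `nullity_A` is 3, so the two add up to the number of columns, 4.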

Column rank – dimension of column space

The rank of $A$ is the maximal number of linearly independent columns $\mathbf {c} _{1},\mathbf {c} _{2},\dots,\mathbf {c} _{k}$ of $A$; this is the dimension of the column space of $A$ (the column space being the subspace of $F^{m}$ generated by the columns of $A$, which is in fact just the image of the linear map $f$ associated to $A$).

Row rank – dimension of row space

The rank of $A$ is the maximal number of linearly independent rows of $A$; this is the dimension of the row space of $A$.

Decomposition rank

The rank of $A$ is the smallest integer $k$ such that $A$ can be factored as $A=CR$, where $C$ is an $m \times k$ matrix and $R$ is a $k \times n$ matrix. In fact, for all integers $k$, the following are equivalent:

(1) the column rank of $A$ is less than or equal to $k$,
(2) there exist $k$ columns $\mathbf {c} _{1},\ldots,\mathbf {c} _{k}$ of size $m$ such that every column of $A$ is a linear combination of $\mathbf {c} _{1},\ldots,\mathbf {c} _{k}$,
(3) there exist an $m\times k$ matrix $C$ and a $k\times n$ matrix $R$ such that $A=CR$ (when $k$ is the rank, this is a rank factorization of $A$),
(4) there exist $k$ rows $\mathbf {r} _{1},\ldots,\mathbf {r} _{k}$ of size $n$ such that every row of $A$ is a linear combination of $\mathbf {r} _{1},\ldots,\mathbf {r} _{k}$,
(5) the row rank of $A$ is less than or equal to $k$.

Indeed, the following equivalences are obvious: $(1)\Leftrightarrow (2)\Leftrightarrow (3)\Leftrightarrow (4)\Leftrightarrow (5)$. For example, to prove (3) from (2), take $C$ to be the matrix whose columns are $\mathbf {c} _{1},\ldots,\mathbf {c} _{k}$ from (2). To prove (2) from (3), take $\mathbf {c} _{1},\ldots,\mathbf {c} _{k}$ to be the columns of $C$.

It follows from the equivalence $(1)\Leftrightarrow (5)$ that the row rank is equal to the column rank.

As in the case of the "dimension of image" characterization, this can be generalized to a definition of the rank of any linear map: the rank of a linear map $f : V \to W$ is the minimal dimension $k$ of an intermediate space $X$ such that $f$ can be written as the composition of a map $V \to X$ and a map $X \to W$. Unfortunately, this definition does not suggest an efficient manner to compute the rank (for which it is better to use one of the alternative definitions). See rank factorization for details.

Rank in terms of singular values

The rank of $A$ equals the number of non-zero singular values, which is the same as the number of non-zero diagonal elements in $\Sigma$ in the singular value decomposition $A=U\Sigma V^{*}$.

Determinantal rank – size of largest non-vanishing minor

The rank of $A$ is the largest order of any non-zero minor in $A$. (The order of a minor is the side-length of the square sub-matrix of which it is the determinant.) Like the decomposition rank characterization, this does not give an efficient way of computing the rank, but it is useful theoretically: a single non-zero minor witnesses a lower bound (namely its order) for the rank of the matrix, which can be useful (for example) to prove that certain operations do not lower the rank of a matrix.

A non-vanishing $p$-minor ($p \times p$ submatrix with non-zero determinant) shows that the rows and columns of that submatrix are linearly independent, and thus those rows and columns of the full matrix are linearly independent (in the full matrix), so the row and column rank are at least as large as the determinantal rank; however, the converse is less straightforward. The equivalence of determinantal rank and column rank is a strengthening of the statement that if the span of $n$ vectors has dimension $p$, then $p$ of those vectors span the space (equivalently, that one can choose a spanning set that is a subset of the vectors): the equivalence implies that a subset of the rows and a subset of the columns simultaneously define an invertible submatrix (equivalently, if the span of $n$ vectors has dimension $p$, then $p$ of these vectors span the space and there is a set of $p$ coordinates on which they are linearly independent).
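The definition by minors translates directly into a brute-force search, which also makes its inefficiency visible: every choice of $p$ rows and $p$ columns is a candidate. The following sketch assumes small matrices with exact rational entries; the names `det` and `determinantal_rank` are illustrative, and the Laplace expansion is used only for brevity.

```python
from fractions import Fraction
from itertools import combinations

def det(square):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not square:
        return Fraction(1)  # determinant of the empty 0 x 0 matrix
    total = Fraction(0)
    for j, entry in enumerate(square[0]):
        minor = [row[:j] + row[j + 1:] for row in square[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def determinantal_rank(matrix):
    """Largest p such that some p x p submatrix has non-zero determinant."""
    m, n = len(matrix), len(matrix[0])
    for p in range(min(m, n), 0, -1):
        for row_idx in combinations(range(m), p):
            for col_idx in combinations(range(n), p):
                sub = [[Fraction(matrix[i][j]) for j in col_idx] for i in row_idx]
                if det(sub) != 0:
                    return p  # a single non-zero minor witnesses rank >= p
    return 0

rank_first = determinantal_rank([[1, 0, 1], [0, 1, 1], [0, 1, 1]])
rank_second = determinantal_rank([[1, 1, 0, 2], [-1, -1, 0, -2]])
```

On the two matrices of the Examples section this gives 2 and 1 respectively, matching the ranks found there by inspecting columns.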

Tensor rank – minimum number of simple tensors

The rank of $A$ is the smallest number $k$ such that $A$ can be written as a sum of $k$ rank 1 matrices, where a matrix is defined to have rank 1 if and only if it can be written as a nonzero product $c\cdot r$ of a column vector $c$ and a row vector $r$. This notion of rank is called tensor rank; it can be generalized in the separable models interpretation of the singular value decomposition.
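As a small worked instance, the rank-2 matrix from the Examples section splits into two products of a column vector and a row vector (the vectors chosen here are only one of many possible choices): ${\begin{bmatrix}1&0&1\\0&1&1\\0&1&1\end{bmatrix}}={\begin{bmatrix}1\\0\\0\end{bmatrix}}{\begin{bmatrix}1&0&1\end{bmatrix}}+{\begin{bmatrix}0\\1\\1\end{bmatrix}}{\begin{bmatrix}0&1&1\end{bmatrix}}.$ A single such product would not suffice, since it has rank 1 while the matrix has rank 2.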

Properties

We assume that $A$ is an $m \times n$ matrix, and we define the linear map $f$ by $f (x) = A x$ as above.

The rank of an $m \times n$ matrix is a nonnegative integer and cannot be greater than either $m$ or $n$. That is, $\operatorname {rank} (A)\leq \min(m,n).$ A matrix that has rank $\min(m, n)$ is said to have full rank; otherwise, the matrix is rank deficient.
Only a zero matrix has rank zero.
$f$ is injective (or "one-to-one") if and only if $A$ has rank $n$ (in this case, we say that $A$ has full column rank).
$f$ is surjective (or "onto") if and only if $A$ has rank $m$ (in this case, we say that $A$ has full row rank).
If $A$ is a square matrix (i.e., $m = n$), then $A$ is invertible if and only if $A$ has rank $n$ (that is, $A$ has full rank).
If $B$ is any $n \times k$ matrix, then $\operatorname {rank} (AB)\leq \min(\operatorname {rank} (A),\operatorname {rank} (B)).$
If $B$ is an $n \times k$ matrix of rank $n$, then $\operatorname {rank} (AB)=\operatorname {rank} (A).$
If $C$ is an $l \times m$ matrix of rank $m$, then $\operatorname {rank} (CA)=\operatorname {rank} (A).$
The rank of $A$ is equal to $r$ if and only if there exists an invertible $m \times m$ matrix $X$ and an invertible $n \times n$ matrix $Y$ such that $XAY={\begin{bmatrix}I_{r}&0\\0&0\\\end{bmatrix}},$ where $I_{r}$ denotes the $r \times r$ identity matrix.
Sylvester's rank inequality: if $A$ is an $m \times n$ matrix and $B$ is $n \times k$, then^[ii] $\operatorname {rank} (A)+\operatorname {rank} (B)-n\leq \operatorname {rank} (AB).$ This is a special case of the next inequality.
The inequality due to Frobenius: if $AB$, $ABC$ and $BC$ are defined, then^[iii] $\operatorname {rank} (AB)+\operatorname {rank} (BC)\leq \operatorname {rank} (B)+\operatorname {rank} (ABC).$
Subadditivity: $\operatorname {rank} (A+B)\leq \operatorname {rank} (A)+\operatorname {rank} (B)$ when $A$ and $B$ are of the same dimension. As a consequence, a rank-$k$ matrix can be written as the sum of $k$ rank-1 matrices, but not fewer.
The rank of a matrix plus the nullity of the matrix equals the number of columns of the matrix. (This is the rank–nullity theorem.)
If $A$ is a matrix over the real numbers then the rank of $A$ and the rank of its corresponding Gram matrix are equal. Thus, for real matrices $\operatorname {rank} (A^{\mathrm {T} }A)=\operatorname {rank} (AA^{\mathrm {T} })=\operatorname {rank} (A)=\operatorname {rank} (A^{\mathrm {T} }).$ This can be shown by proving equality of their null spaces. The null space of the Gram matrix is given by vectors $\mathbf{x}$ for which $A^{\mathrm {T} }A\mathbf {x} =0.$ If this condition is fulfilled, we also have $0=\mathbf {x} ^{\mathrm {T} }A^{\mathrm {T} }A\mathbf {x} =\left\|A\mathbf {x} \right\|^{2},$ so $A\mathbf{x} = 0$. Conversely, $A\mathbf{x} = 0$ implies $A^{\mathrm {T} }A\mathbf {x} =0$. Hence $A$ and $A^{\mathrm {T} }A$ have the same null space and the same number of columns, and therefore the same rank by the rank–nullity theorem.^[11]
If $A$ is a matrix over the complex numbers and ${\overline {A}}$ denotes the complex conjugate of $A$ and $A^{*}$ the conjugate transpose of $A$ (i.e., the adjoint of $A$), then $\operatorname {rank} (A)=\operatorname {rank} ({\overline {A}})=\operatorname {rank} (A^{\mathrm {T} })=\operatorname {rank} (A^{*})=\operatorname {rank} (A^{*}A)=\operatorname {rank} (AA^{*}).$
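Several of the properties listed above can be spot-checked on concrete matrices. The sketch below is not a proof: it evaluates the inequalities for one hand-picked set of small rational matrices (the matrices `A`, `B`, `C`, `D` and all helper names are assumptions chosen for illustration), using exact arithmetic.

```python
from fractions import Fraction

def rank(matrix):
    """Exact rank by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1, 0, 1], [0, 1, 1]]      # 2 x 3, so n = 3
B = [[1, 0], [0, 1], [1, 1]]    # 3 x 2
C = [[1], [-1]]                 # 2 x 1
D = [[1, 0, 0], [0, 0, 0]]      # 2 x 3, same shape as A
n = 3

AB, BC = matmul(A, B), matmul(B, C)
ABC = matmul(AB, C)
checks = {
    "rank(AB) <= min(rank A, rank B)": rank(AB) <= min(rank(A), rank(B)),
    "Sylvester": rank(A) + rank(B) - n <= rank(AB),
    "Frobenius": rank(AB) + rank(BC) <= rank(B) + rank(ABC),
    "subadditivity": rank(add(A, D)) <= rank(A) + rank(D),
    "transpose": rank(A) == rank(transpose(A)),
    "Gram matrix": rank(matmul(transpose(A), A)) == rank(A),
}
```

Every entry of `checks` evaluates to `True` for these matrices; trying other inputs is a quick way to build intuition for when the inequalities are tight.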

Applications

One useful application of calculating the rank of a matrix is the computation of the number of solutions of a system of linear equations. According to the Rouché–Capelli theorem, the system is inconsistent if the rank of the augmented matrix is greater than the rank of the coefficient matrix. If on the other hand, the ranks of these two matrices are equal, then the system must have at least one solution. The solution is unique if and only if the rank equals the number of variables. Otherwise the general solution has $k$ free parameters where $k$ is the difference between the number of variables and the rank. In this case (and assuming the system of equations is in the real or complex numbers) the system of equations has infinitely many solutions.

In control theory, the rank of a matrix can be used to determine whether a linear system is controllable, or observable.

In the field of communication complexity, the rank of the communication matrix of a function gives bounds on the amount of communication needed for two parties to compute the function.
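The case distinction of the Rouché–Capelli theorem can be written as a short routine. This is a minimal sketch assuming exact rational coefficients; the function names `rank` and `classify_system` and the returned labels are illustrative choices.

```python
from fractions import Fraction

def rank(matrix):
    """Exact rank by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def classify_system(coefficients, constants):
    """Classify A x = b; return (label, number of free parameters or None)."""
    augmented = [list(row) + [b] for row, b in zip(coefficients, constants)]
    rank_coef = rank(coefficients)
    rank_aug = rank(augmented)
    n_vars = len(coefficients[0])
    if rank_aug > rank_coef:
        return "inconsistent", None          # no solution at all
    if rank_coef == n_vars:
        return "unique", 0                   # exactly one solution
    return "infinitely many", n_vars - rank_coef

unique_case = classify_system([[1, 1], [1, -1]], [2, 0])      # x + y = 2, x - y = 0
free_case = classify_system([[1, 1], [2, 2]], [1, 2])         # second row is twice the first
no_solution_case = classify_system([[1, 1], [1, 1]], [1, 2])  # x + y = 1 and x + y = 2
```

The three sample systems fall into the three cases: `("unique", 0)`, `("infinitely many", 1)` and `("inconsistent", None)`.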
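One standard rank test of this kind, not spelled out in the text above, is the Kalman rank condition: a linear time-invariant system $\dot{x} = Ax + Bu$ with an $n \times n$ matrix $A$ is controllable exactly when the block matrix $[B, AB, \dots, A^{n-1}B]$ has rank $n$. The sketch below assumes that setting and exact rational entries; all function names are illustrative.

```python
from fractions import Fraction

def rank(matrix):
    """Exact rank by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as a single matrix with n rows."""
    n = len(A)
    blocks, block = [], [list(row) for row in B]
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)  # next power of A applied to B
    # Concatenate the blocks side by side, row by row.
    return [[entry for blk in blocks for entry in blk[i]] for i in range(n)]

def is_controllable(A, B):
    return rank(controllability_matrix(A, B)) == len(A)

# Double integrator: the input drives the second state, which drives the first.
double_integrator = is_controllable([[0, 1], [0, 0]], [[0], [1]])
# Decoupled system whose second state is never influenced by the input.
decoupled = is_controllable([[1, 0], [0, 2]], [[1], [0]])
```

Here `double_integrator` is `True` and `decoupled` is `False`; observability can be tested the same way on the transposed pair, a duality that is likewise assumed here rather than taken from the text.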

Generalization

There are different generalizations of the concept of rank to matrices over arbitrary rings, where column rank, row rank, dimension of column space, and dimension of row space of a matrix may be different from the others or may not exist.

Thinking of matrices as tensors, the tensor rank generalizes to arbitrary tensors; for tensors of order greater than 2 (matrices are order 2 tensors), rank is very hard to compute, unlike for matrices.

There is a notion of rank for smooth maps between smooth manifolds. It is equal to the linear rank of the derivative.

Matrices as tensors

Matrix rank should not be confused with tensor order, which is called tensor rank. Tensor order is the number of indices required to write a tensor, and thus matrices all have tensor order 2. More precisely, matrices are tensors of type (1,1), having one row index and one column index, also called covariant order 1 and contravariant order 1; see Tensor (intrinsic definition) for details.

The tensor rank of a matrix can also mean the minimum number of simple tensors necessary to express the matrix as a linear combination, and this definition does agree with matrix rank as discussed here.

Notes

^ Alternative notation includes $\rho (\Phi )$ from Katznelson & Katznelson (2008, p. 52, § 2.5.1) and Halmos (1974, p. 90, § 50).
^ Proof: Apply the rank–nullity theorem to the inequality $\dim \ker(AB)\leq \dim \ker(A)+\dim \ker(B).$
^ Proof. The map $C:\ker(ABC)/\ker(BC)\to \ker(AB)/\ker(B)$ is well-defined and injective. We thus obtain the inequality in terms of dimensions of kernel, which can then be converted to the inequality in terms of ranks by the rank–nullity theorem. Alternatively, if $M$ is a linear subspace then $\dim(AM)\leq \dim(M)$; apply this inequality to the subspace defined by the orthogonal complement of the image of $BC$ in the image of $B$, whose dimension is $\operatorname {rank} (B)-\operatorname {rank} (BC)$; its image under $A$ has dimension $\operatorname {rank} (AB)-\operatorname {rank} (ABC)$.

References

^ Axler (2015) pp. 111-112, §§ 3.115, 3.119
^ Roman (2005) p. 48, § 1.16
^ Bourbaki, Algebra, ch. II, § 10.12, p. 359
^ Mackiw, G. (1995), "A Note on the Equality of the Column and Row Rank of a Matrix", Mathematics Magazine, 68 (4): 285–286, doi:10.1080/0025570X.1995.11996337
^ Hefferon (2020) p. 200, ch. 3, Definition 2.1
^ Katznelson & Katznelson (2008) p. 52, § 2.5.1
^ Valenza (1993) p. 71, § 4.3
^ Halmos (1974) p. 90, § 50
^ Wardlaw, William P. (2005), "Row Rank Equals Column Rank", Mathematics Magazine, 78 (4): 316–318, doi:10.1080/0025570X.2005.11953349, S2CID 218542661
^ Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388
^ Mirsky, Leonid (1955). An introduction to linear algebra. Dover Publications. ISBN 978-0-486-66434-7.

Sources

Axler, Sheldon (2015). Linear Algebra Done Right. Undergraduate Texts in Mathematics (3rd ed.). Springer. ISBN 978-3-319-11079-0.
Halmos, Paul Richard (1974) [1958]. Finite-Dimensional Vector Spaces. Undergraduate Texts in Mathematics (2nd ed.). Springer. ISBN 0-387-90093-4.
Hefferon, Jim (2020). Linear Algebra (4th ed.). Orthogonal Publishing L3C. ISBN 978-1-944325-11-4.
Katznelson, Yitzhak; Katznelson, Yonatan R. (2008). A (Terse) Introduction to Linear Algebra. American Mathematical Society. ISBN 978-0-8218-4419-9.
Roman, Steven (2005). Advanced Linear Algebra. Undergraduate Texts in Mathematics (2nd ed.). Springer. ISBN 0-387-24766-1.
Valenza, Robert J. (1993) [1951]. Linear Algebra: An Introduction to Abstract Mathematics. Undergraduate Texts in Mathematics (3rd ed.). Springer. ISBN 3-540-94099-5.
