Pólya urn model

In statistics, a Pólya urn model (also known as a Pólya urn scheme or simply as Pólya's urn), named after George Pólya, is a family of urn models that can be used to interpret many commonly used statistical models.

The model represents objects of interest (such as atoms, people, or cars) as colored balls in an urn. In the basic Pólya urn model, the experimenter puts $x$ white and $y$ black balls into an urn. At each step, one ball is drawn uniformly at random from the urn and its color is observed; the ball is then returned to the urn, and an additional ball of the same color is added.
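
The drawing rule just described can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `polya_urn` and the `seed` parameter are assumptions introduced here.

```python
import random

def polya_urn(x, y, n, seed=None):
    """Simulate n draws from a basic Pólya urn with x white and y black balls.

    Returns the drawn colors ("W" or "B") and the final (white, black) counts.
    """
    rng = random.Random(seed)
    white, black = x, y
    draws = []
    for _ in range(n):
        # A white ball is drawn with probability white / (white + black).
        if rng.random() < white / (white + black):
            draws.append("W")
            white += 1  # return the ball and add one more white ball
        else:
            draws.append("B")
            black += 1  # return the ball and add one more black ball
    return draws, (white, black)

draws, (white, black) = polya_urn(x=1, y=1, n=10, seed=0)
print("".join(draws), white, black)
```

Each draw adds exactly one ball, so after $n$ draws the urn always holds $x+y+n$ balls.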

If, by random chance, more black balls than white balls are drawn in the first few draws, black balls become more likely to be drawn later, and similarly for white balls. Thus the urn has a self-reinforcing property ("the rich get richer"). It is the opposite of sampling without replacement, where every time a particular value is observed, it is less likely to be observed again, whereas in a Pólya urn model, an observed value is more likely to be observed again. In a Pólya urn model, successive acts of measurement over time have less and less effect on future measurements, whereas in sampling without replacement, the opposite is true: after a certain number of measurements of a particular value, that value will never be seen again.

It is also different from sampling with replacement, where the ball is returned to the urn but no new balls are added. In that case, there is neither self-reinforcement nor its opposite: the probability of each color stays the same from draw to draw.
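
As a small worked comparison (an illustration added here, assuming an urn with $x\geq 1$ white and $y\geq 1$ black balls), suppose the first draw is white. The probability that the second draw is also white under the three schemes is then:

```latex
\begin{aligned}
\text{Pólya urn:} \quad & \frac{x+1}{x+y+1} > \frac{x}{x+y}, \\
\text{sampling with replacement:} \quad & \frac{x}{x+y}, \\
\text{sampling without replacement:} \quad & \frac{x-1}{x+y-1} < \frac{x}{x+y}.
\end{aligned}
```

For example, with $x=2$ and $y=3$ the three values are $1/2$, $2/5$ and $1/4$.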

Basic results

Questions of interest are the evolution of the urn population and the sequence of colors of the balls drawn out.

After $n$ draws, the probability that the urn contains $x+n_{1}$ white balls and $y+n_{2}$ black balls (for $0\leq n_{1},n_{2}\leq n$ with $n_{1}+n_{2}=n$) is ${\binom {n}{n_{1}}}{\frac {x^{\overline {n_{1}}}\,y^{\overline {n_{2}}}}{(x+y)^{\overline {n}}}}$, where the overbar denotes the rising factorial, $a^{\overline {m}}=a(a+1)\cdots (a+m-1)$. This can be proved by drawing the Pascal's triangle of all possible configurations.

In particular, starting with one white and one black ball (i.e., $x=y=1$), every possible number of white balls in the urn after $n$ draws, $n_{1}+1$ with $1\leq n_{1}+1\leq n+1$, is equally likely, with probability ${\frac {1}{n+1}}$.
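
The formula and the uniform special case can be checked with exact rational arithmetic. The following is a sketch under the assumption of integer $x$ and $y$; the helper names `rising_factorial` and `urn_state_probability` are introduced here for illustration.

```python
from fractions import Fraction
from math import comb

def rising_factorial(a, m):
    """Return a * (a + 1) * ... * (a + m - 1); the empty product (m = 0) is 1."""
    result = 1
    for j in range(m):
        result *= a + j
    return result

def urn_state_probability(x, y, n1, n2):
    """Probability that n1 white and n2 black balls have been added after n1 + n2 draws."""
    n = n1 + n2
    numerator = comb(n, n1) * rising_factorial(x, n1) * rising_factorial(y, n2)
    return Fraction(numerator, rising_factorial(x + y, n))

n = 5
probs = [urn_state_probability(1, 1, n1, n - n1) for n1 in range(n + 1)]
print(probs)  # each entry is Fraction(1, 6)
```

Using `Fraction` avoids floating-point rounding, so the equality with $1/(n+1)$ holds exactly.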

More generally, if the urn starts with $a_{i}$ balls of color $i$, with $i=1,2,\dots ,k$, then after $n$ draws, the probability that the urn contains $a_{i}+n_{i}$ balls of each color $i$ is ${\binom {n}{n_{1},\dots ,n_{k}}}{\frac {\prod _{i=1}^{k}a_{i}^{\overline {n_{i}}}}{\left(\sum _{i=1}^{k}a_{i}\right)^{\overline {n}}}}$, where ${\binom {n}{n_{1},\dots ,n_{k}}}$ is the multinomial coefficient.

Conditional on the urn ending up with $(a_{i}+n_{i})$ balls of color $i$ after $n$ draws, there are ${\binom {n}{n_{1},\cdots,n_{k}}}$ different trajectories that could have led to such an end-state. The conditional probability of each trajectory is the same: ${\binom {n}{n_{1},\cdots,n_{k}}}^{-1}$ .
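
A short check of the multicolor formula and of the equal-probability claim for trajectories is sketched below. It assumes integer initial counts; the function names and the example urn `[2, 1, 3]` are chosen here purely for illustration.

```python
from fractions import Fraction
from math import factorial

def rising_factorial(a, m):
    """Return a * (a + 1) * ... * (a + m - 1); the empty product (m = 0) is 1."""
    result = 1
    for j in range(m):
        result *= a + j
    return result

def multinomial(counts):
    """Multinomial coefficient n! / (n_1! ... n_k!) with n = sum(counts)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def trajectory_probability(initial, colors):
    """Probability of drawing the given sequence of color indices, step by step."""
    urn = list(initial)
    prob = Fraction(1)
    for c in colors:
        prob *= Fraction(urn[c], sum(urn))
        urn[c] += 1  # return the ball and add one of the same color
    return prob

def end_state_probability(initial, added):
    """Probability that added[i] balls of color i have been added after sum(added) draws."""
    numerator = multinomial(added)
    for a, m in zip(initial, added):
        numerator *= rising_factorial(a, m)
    return Fraction(numerator, rising_factorial(sum(initial), sum(added)))

initial = [2, 1, 3]
added = [1, 2, 1]
state_p = end_state_probability(initial, added)
one_path = trajectory_probability(initial, [0, 1, 1, 2])
print(state_p, one_path, one_path / state_p)  # prints: 1/21 1/252 1/12
```

The ratio $1/12$ is the reciprocal of the multinomial coefficient ${\binom {4}{1,2,1}}=12$, as the text states.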

Interpretation

One reason for interest in this rather elaborate urn model (i.e. with duplication and then replacement of each ball drawn) is that it provides an example in which the count of balls in the urn (initially $x$ white and $y$ black) is not concealed, yet which approximates the correct updating of subjective probabilities in a different case, in which the original urn content is concealed and ordinary sampling with replacement is conducted (without the Pólya ball-duplication). Because of the simple "sampling with replacement" scheme in this second case, the urn content is static, but this greater simplicity is offset by the assumption that the urn content is unknown to the observer. A Bayesian analysis of the observer's uncertainty about the urn's initial content can be made, using a particular choice of (conjugate) prior distribution.

Specifically, suppose that an observer knows that the urn contains only identical balls, each coloured either black or white, but knows neither the absolute number of balls present nor the proportion of each colour. Suppose that they hold prior beliefs about these unknowns: for them, the probability distribution of the urn content is well approximated by some prior distribution for the total number of balls in the urn, and a beta prior distribution with parameters $(x,y)$ for the initial proportion of these which are white, this proportion being (for them) considered approximately independent of the total number. Then the process of outcomes of a succession of draws from the urn (with replacement but without the duplication) has approximately the same probability law as the above Pólya scheme, in which the actual urn content was not hidden from them.

The approximation error arises because an urn containing a known finite number $m$ of balls cannot have an exactly beta-distributed unknown proportion of white balls: the possible values of that proportion are confined to multiples of $1/m$, rather than ranging over the whole continuous unit interval as an exactly beta-distributed proportion would. This slightly informal account is provided for motivation, and can be made more mathematically precise.
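
The correspondence can be illustrated by simulation. The sketch below compares the number of white draws from a Pólya urn with the number from an idealized hidden urn whose white proportion is exactly beta-distributed (so the finite-urn approximation error discussed above is ignored). The function names, the seed, and the choice $x=y=1$, $n=4$ are assumptions made for this illustration.

```python
import random
from collections import Counter

def polya_white_count(x, y, n, rng):
    """Number of white draws in n draws from a Pólya urn with x white, y black balls."""
    white, black, count = x, y, 0
    for _ in range(n):
        if rng.random() < white / (white + black):
            white += 1
            count += 1
        else:
            black += 1
    return count

def hidden_urn_white_count(x, y, n, rng):
    """Number of white draws when the white proportion is drawn once from Beta(x, y)
    and the n draws are then made with replacement (no duplication)."""
    theta = rng.betavariate(x, y)
    return sum(rng.random() < theta for _ in range(n))

rng = random.Random(0)
trials, n = 20000, 4
polya = Counter(polya_white_count(1, 1, n, rng) for _ in range(trials))
hidden = Counter(hidden_urn_white_count(1, 1, n, rng) for _ in range(trials))
for k in range(n + 1):
    # Both empirical frequencies should be close to 1 / (n + 1) = 0.2.
    print(k, polya[k] / trials, hidden[k] / trials)
```

Up to Monte Carlo noise, both columns should match, which is the sense in which the two schemes share a probability law.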

This basic Pólya urn model has been generalized in many ways.

Distributions related to the Pólya urn

Beta-binomial distribution: The distribution of the number of successful draws (trials), e.g. the number of white balls extracted, given $n$ draws from a Pólya urn.
Beta negative binomial distribution: The distribution of the number of white balls observed until a fixed number of black balls are observed.
Dirichlet-multinomial distribution (also known as the multivariate Pólya distribution): The distribution over the number of balls of each color, given $n$ draws from a Pólya urn where there are $k$ different colors instead of only two.
Dirichlet negative multinomial distribution: The distribution over the number of balls of each color until a fixed number of balls of a designated stopping color are observed.
Martingales, the beta-binomial distribution and the beta distribution: Let $w$ and $b$ be the numbers of white and black balls initially in the urn, and $w+n_{w}$ the number of white balls in the urn after $n$ draws. Then the sequence of proportions ${\frac {w+n_{w}}{w+b+n}}$ for $n=1,2,3,\dots$ is a martingale. For each fixed $n$, the count $n_{w}$ follows a beta-binomial distribution, so the proportion is a shifted and rescaled version of it; as $n\to \infty$, the proportion converges to a random limit that follows a beta distribution.
Dirichlet process, Chinese restaurant process, Hoppe urn: Imagine a modified Pólya urn scheme as follows. We start with an urn with $\alpha$ black balls. When drawing a ball from the urn, if we draw a black ball, we put the ball back along with a new ball of a new non-black color randomly generated from a uniform distribution over an infinite set of available colours, and consider the newly generated color to be the "value" of the draw. Otherwise, we put the ball back along with another ball of the same color, as in the standard Pólya urn scheme. The colors of an infinite sequence of draws from this modified Pólya urn scheme follow a Chinese restaurant process. If, instead of generating a new color, we draw a random value from a given base distribution and use that value to label the ball, the labels of an infinite sequence of draws follow a Dirichlet process.[1]
Moran model: An urn model used to model genetic drift in theoretical population genetics. It is closely similar to the Pólya urn model except that, in addition to adding a new ball of the same color, a randomly drawn ball is removed from the urn. The number of balls in the urn thus remains constant. Continued sampling then leads ultimately to an urn with all balls of one color, the probability of each color being the proportion of that color in the original urn. There are variants of the Moran model that insist that the ball removed from the urn be a different ball from the one originally sampled in that step, and variants that remove a ball immediately after the new ball is placed in the urn, so that the new ball is one of the balls available to be removed. This makes a small difference in the time taken to reach the state in which all balls are the same color. The Moran process models genetic drift in a population with overlapping generations.
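
The modified (Hoppe) urn scheme from the list above can be sketched as follows. This is an illustrative sketch: new colors are represented by consecutive integer labels rather than by random draws from a color space, $\alpha$ is treated as the weight of the black ball (so it need not be an integer), and the name `hoppe_urn` is an assumption introduced here.

```python
import random

def hoppe_urn(alpha, n, seed=None):
    """Simulate n draws from the modified urn whose black ball(s) have total weight alpha.

    Returns a list of integer color labels; a fresh label is created whenever
    black is drawn, and otherwise the drawn color gains one more ball.
    """
    rng = random.Random(seed)
    counts = []  # counts[c] = number of balls of non-black color c
    draws = []
    for _ in range(n):
        # Index 0 stands for black (weight alpha); index c + 1 for color c.
        idx = rng.choices(range(len(counts) + 1), weights=[alpha] + counts)[0]
        if idx == 0:
            counts.append(1)               # new color enters the urn
            draws.append(len(counts) - 1)  # its label is the value of the draw
        else:
            counts[idx - 1] += 1           # reinforce an existing color
            draws.append(idx - 1)
    return draws

print(hoppe_urn(alpha=1.0, n=20, seed=0))
```

The first draw always creates color `0`, because the urn initially contains only black; larger values of `alpha` make new colors appear more often.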

Exchangeability

Pólya's urn is a quintessential example of an exchangeable process.

Suppose we have an urn containing $\gamma$ white balls and $\alpha$ black balls. We proceed to draw balls at random from the urn. On the $i$-th draw, we define a random variable $X_{i}$ by $X_{i}=1$ if the ball is black and $X_{i}=0$ otherwise. We then return the ball to the urn, together with an additional ball of the same colour. For a given $i$, if $X_{j}=1$ for many $j<i$, then it is more likely that $X_{i}=1$, because more black balls have been added to the urn. Therefore, these variables are not independent of each other.

The sequence $X_{1},X_{2},X_{3},\dots$ does, however, exhibit the weaker property of exchangeability.[2] Recall that a (finite or infinite) sequence of random variables is called exchangeable if its joint distribution is invariant under permutations of indices.

To show exchangeability of the sequence $X_{1},X_{2},X_{3},\dots$, assume that $n$ balls are picked from the urn, and that out of these $n$ balls, $k$ are black and $n-k$ are white. On the first draw the number of balls in the urn is $\gamma +\alpha$; on the second draw it is $\gamma +\alpha +1$, and so on. On the $i$-th draw, the number of balls is $\gamma +\alpha +i-1$. The probability that we draw all $k$ black balls first, and then all $n-k$ white balls, is given by

$\mathbb {P} \left(X_{1}=1,\dots ,X_{k}=1,X_{k+1}=0,\dots ,X_{n}=0\right)={\frac {\alpha }{\gamma +\alpha }}\times {\frac {\alpha +1}{\gamma +\alpha +1}}\times \cdots \times {\frac {\alpha +k-1}{\gamma +\alpha +k-1}}\times {\frac {\gamma }{\gamma +\alpha +k}}\times {\frac {\gamma +1}{\gamma +\alpha +k+1}}\times \cdots \times {\frac {\gamma +n-k-1}{\gamma +\alpha +n-1}}$

Now we must show that if the order of black and white balls is permuted, there is no change to the probability. As in the expression above, even after permuting the draws, the $i$-th denominator will always be $\gamma +\alpha +i-1$, since this is the number of balls in the urn at that round.

If the $j$-th black ball is drawn in round $t$, the conditional probability that $X_{t}=1$, given the earlier draws, is ${\frac {\alpha +j-1}{\gamma +\alpha +t-1}}$, i.e. the numerator is $\alpha +j-1$. The same argument gives the probability for white balls. Therefore, for any sequence $x_{1},x_{2},\dots ,x_{n}$ in which $1$ occurs $k$ times and $0$ occurs $n-k$ times (i.e. a sequence with $k$ black balls and $n-k$ white balls drawn in some order), the final probability is equal to the following expression, where we take advantage of the commutativity of multiplication in the numerator:

${\begin{aligned}\mathbb {P} (X_{1}=x_{1},X_{2}=x_{2},\dots ,X_{n}=x_{n})&={\frac {\prod _{i=1}^{k}\left(\alpha +i-1\right)\times \prod _{i=1}^{n-k}\left(\gamma +i-1\right)}{\prod _{i=1}^{n}\left(\gamma +\alpha +i-1\right)}}\\&={\frac {\left(\alpha +k-1\right)!\times \left(\gamma +n-k-1\right)!\times \left(\alpha +\gamma -1\right)!}{\left(\alpha -1\right)!\times \left(\gamma -1\right)!\times \left(\alpha +\gamma +n-1\right)!}}\end{aligned}}$

This probability does not depend on the order in which black and white balls are seen; it depends only on the total number of white balls and the total number of black balls.[2]
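
The order-independence can be verified directly for a small case by computing the probability of every ordering step by step. This is an illustrative check, assuming integer $\alpha$ and $\gamma$; the function name `sequence_probability` and the example values are chosen here.

```python
from fractions import Fraction
from itertools import permutations

def sequence_probability(alpha, gamma, xs):
    """Probability of the draw sequence xs (1 = black, 0 = white), starting from
    alpha black and gamma white balls."""
    black, white = alpha, gamma
    prob = Fraction(1)
    for x in xs:
        total = black + white
        if x == 1:
            prob *= Fraction(black, total)
            black += 1  # a black ball is added after a black draw
        else:
            prob *= Fraction(white, total)
            white += 1  # a white ball is added after a white draw
    return prob

alpha, gamma = 2, 3
base = (1, 1, 0, 0, 0)  # k = 2 black draws among n = 5
probs = {order: sequence_probability(alpha, gamma, order)
         for order in set(permutations(base))}
print(len(probs), set(probs.values()))  # prints: 10 {Fraction(1, 42)}
```

All ten distinct orderings share the probability $1/42$, which agrees with the factorial expression above for $\alpha =2$, $\gamma =3$, $n=5$, $k=2$.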

According to De Finetti's theorem, there must be a unique prior distribution such that the joint distribution of observing the sequence is a Bayesian mixture of the Bernoulli probabilities. It can be shown that this prior distribution is a beta distribution with parameters $(\alpha ,\gamma )$, with density $\beta \left(\cdot ;\,\alpha ,\,\gamma \right)$. In De Finetti's theorem, if we replace $\pi (\cdot )$ with $\beta \left(\cdot ;\,\alpha ,\,\gamma \right)$, then we get the previous equation:[2]

${\begin{aligned}p(X_{1}=x_{1},X_{2}=x_{2},\dots ,X_{n}=x_{n})&=\int _{0}^{1}\theta ^{\sum _{i=1}^{n}x_{i}}\left(1-\theta \right)^{n-\sum _{i=1}^{n}x_{i}}\,\beta \left(\theta ;\alpha ,\gamma \right)\,d\theta \\&=\int _{0}^{1}\theta ^{\sum _{i=1}^{n}x_{i}}\left(1-\theta \right)^{n-\sum _{i=1}^{n}x_{i}}\,{\dfrac {(\alpha +\gamma -1)!}{(\alpha -1)!\,(\gamma -1)!}}\,\theta ^{\alpha -1}(1-\theta )^{\gamma -1}\,d\theta \\&=\int _{0}^{1}\theta ^{\alpha -1+\sum _{i=1}^{n}x_{i}}\left(1-\theta \right)^{\gamma -1+n-\sum _{i=1}^{n}x_{i}}\,{\dfrac {(\alpha +\gamma -1)!}{(\alpha -1)!\,(\gamma -1)!}}\,d\theta \\&=\int _{0}^{1}\theta ^{\alpha +k-1}\left(1-\theta \right)^{\gamma +n-k-1}\,{\dfrac {(\alpha +\gamma -1)!}{(\alpha -1)!\,(\gamma -1)!}}\,d\theta \\&={\dfrac {(\alpha +\gamma -1)!}{(\alpha -1)!\,(\gamma -1)!}}\int _{0}^{1}\theta ^{\alpha +k-1}\left(1-\theta \right)^{\gamma +n-k-1}\,d\theta \\&={\dfrac {(\alpha +\gamma -1)!}{(\alpha -1)!\,(\gamma -1)!}}\,{\dfrac {\Gamma (\alpha +k)\,\Gamma (\gamma +n-k)}{\Gamma (\alpha +\gamma +n)}}\\&={\dfrac {\left(\alpha +k-1\right)!\times \left(\gamma +n-k-1\right)!\times \left(\alpha +\gamma -1\right)!}{\left(\alpha -1\right)!\times \left(\gamma -1\right)!\times \left(\alpha +\gamma +n-1\right)!}}\end{aligned}}$

In this equation, $k=\sum _{i=1}^{n}x_{i}$.
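
The agreement between the beta mixture and the urn probability can also be checked numerically. The sketch below evaluates the integral in closed form as a ratio of beta functions, $B(\alpha +k,\gamma +n-k)/B(\alpha ,\gamma )$, using log-gamma values, and compares it with the step-by-step urn product; the function names are assumptions introduced for this illustration.

```python
from math import exp, isclose, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def mixture_probability(alpha, gamma, n, k):
    """Integral of theta^k (1 - theta)^(n - k) against a Beta(alpha, gamma) density."""
    return exp(log_beta(alpha + k, gamma + n - k) - log_beta(alpha, gamma))

def sequential_probability(alpha, gamma, n, k):
    """Urn probability of k black draws followed by n - k white draws."""
    prob = 1.0
    for i in range(k):
        prob *= (alpha + i) / (alpha + gamma + i)
    for i in range(n - k):
        prob *= (gamma + i) / (alpha + gamma + k + i)
    return prob

alpha, gamma, n, k = 2, 3, 5, 2
print(mixture_probability(alpha, gamma, n, k), sequential_probability(alpha, gamma, n, k))
print(isclose(mixture_probability(alpha, gamma, n, k),
              sequential_probability(alpha, gamma, n, k), rel_tol=1e-9))
```

Because the gamma function is used instead of factorials, the same comparison can be run with non-integer $\alpha$ and $\gamma$.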

References

[1] Hoppe, Fred (1984). "Pólya-like urns and the Ewens' sampling formula". Journal of Mathematical Biology. 20: 91. doi:10.1007/BF00275863. hdl:2027.42/46944. S2CID 122994288.
[2] Hoppe, Fred M. (1984). "Polya-like urns and the Ewens' sampling formula". Journal of Mathematical Biology. 20 (1): 91–94. doi:10.1007/bf00275863. hdl:2027.42/46944. ISSN 0303-6812. S2CID 122994288.

Bibliography

N.L. Johnson and S. Kotz (1977). "Urn Models and Their Application." John Wiley.
Hosam Mahmoud (2008). "Pólya Urn Models." Chapman and Hall/CRC. ISBN 978-1420059830.
