In this post we will look at random partitions. A partition $\pi$ of $\mathbb{N}$ is a set of disjoint subsets $\{\pi_i\}_{i=1}^\infty$ of $\mathbb{N}$ such that $\bigcup_i \pi_i=\mathbb{N}$. We arrange these sets $\pi_i$ in the order of their least element, so that $\inf \pi_1 < \inf \pi_2< \dots$, and if there are only finitely many subsets, we trail the sequence with emptysets, e.g. $(\{1\}, \mathbb{N}\backslash \{1\},\emptyset,\dots)$. Denote the set of partitions of $\mathbb{N}$ by $\mathcal{P}_\infty$.

Notice that each $\pi \in \mathcal{P}_\infty$ induces an equivalence relation $\buildrel \pi \over \sim$ on $\mathbb{N}$ by $i \buildrel \pi \over \sim j$ if and only if $i$ and $j$ belong to the same block of $\pi$. Now let $\sigma$ be a permutation of $\mathbb{N}$, and we define a partition $\sigma \pi$ by saying that $i$ and $j$ are in the same block of $\sigma \pi$ if and only if $\sigma(i) \buildrel \pi \over \sim \sigma(j)$.

Suppose now that $\pi$ is a random partition. We say that $\pi$ is exchangeable if for each permutation $\sigma$ which changes finitely many elements, we have that $\sigma\pi$ has the same distribution as $\pi$.

There is a wonderful theorem by Kingman which will follow shortly but for now let us look at some basic properties of random exchangeable partitions.

Firstly I will show that if there are any singletons, then there are infinitely many of them a.s. Suppose that for some $i$, we have that under $\pi$ this $i$ is a singleton with probability $p>0$. Now pick any $j \in \mathbb{N}$, and consider the permutation $\sigma^{i,j}$ which swaps $i$ and $j$ and fixes everything else. By the virtue of exchangeability we see that as $\sigma^{i,j}\pi$ has the same law as $\pi$, now $j$ has probability $p>0$ of being a singleton. Thus every natural number has probability $p>0$ of being a singleton, which by the Borel-Cantelli lemma we have that there are infinity many singletons. The singletons of a partition are referred to as dust.

Another interesting area is to look at the asymptotic frequency of a block $\pi_i$ of $\pi$ which is defined by

$|\pi_i|=\displaystyle \lim_{n\rightarrow \infty} \frac{\# (\pi_i \cap \{1,2,\dots,n\})}{n}$

where $\# B$ is the number of elements of a set $B \subset \mathbb{N}$. We will soon see that these quantities exist but for now if they exist let us denote the decreasing sequence of them by $|\pi|^\downarrow$ so that $|\pi|^\downarrow_1$ is the largest asymptotic frequency that any blocks of $\pi$ posses.

It also turns out there is an elegant way to construct an exchangeable random partition using the paintbox method. Let $\mathcal{P}_\mathbf{m}:=\{\mathbf{s} \in [0,1]^\mathbb{N}: s_1 \geq s_2 \geq \dots \, ; \sum_{i=1}^\infty s_i \leq 1\}$ which are called the mass partitions. The parameter $s_0:=1-\sum_{i=1}^\infty s_i$ will play the role of dust, so it is useful to allow for strict inequality when defining mass partitions.

Now pick an $\mathbf{s}\in\mathcal{P}_\mathbf{m}$ and slice up the interval $[0,1]$ into segments of length $s_1, s_2,\dots$ and so forth, i.e. we have intervals $[0,s_1),[s_1,s_1+s_2), \dots$ and if $s_0>0$, label the remaining interval dust (if $s_0>0$, then this partitioning mechanism doesn’t cover the whole of $[0,1]$). Suppose now we start dropping i.i.d. uniform random variables $\{U_i\}_{i=1}^\infty$ on to $[0,1]$, we say that

$i \sim j$ if and only if $U_i$ and $U_j$ fall in to the same segment constructed above which is not the dust segment.

So why is this partition exchangeable? Well if $\sigma$ is a permutation which fixes all but finitely many points, then as the sequence $\{U_i\}_{i=1}^\infty$ is i.i.d., the probability that $U_i$ and $U_j$  fall in the same block for $i \neq j$ is the same as any other pair. We denote by $\rho_\mathbf{s}$ the law of the exchangeable random partition obtained by $\mathbf{s} \in\mathcal{P}_\mathbf{m}$. First let us prove an easy lemma about the asymptotic frequencies of paintbox partitions.

Lemma:

Let $\mathbf{s} \in \mathcal{P}_\mathbf{m}$ and denote by $\pi^\mathbf{s}$ the random partition with law $\rho_\mathbf{s}$, obtained by the paintbox construction above. Then the asymptotic frequencies of $\pi^\mathbf{s}$ exist a.s. and moreover $|\pi^\mathbf{s}|^\downarrow= \mathbf{s}$.

Proof:

Consider a set $A=[\sum_{j=1}^{k-1} s_j,\sum_{j=1}^k s_{j})$ for some $k \in\mathbb{N}$, and let $\{U_i\}_{i=1}^\infty$ be sequence of i.i.d. uniform random variables. Then the sequence of random variables $(\mathbf{1}_{\{U_i \in A\}})$ are i.i.d. with mean $|A|=s_k$. Thus the block which is generated by this interval has asymptotic frequency given by

$\displaystyle \lim_{n\rightarrow \infty} \frac{1}{n}\displaystyle\sum_{i=1}^n\mathbf{1}_{\{U_i \in A\}} = |A|$

by the strong law of large numbers.

$\square$

A natural question to ask would be that if we are given a random $\pi \in \mathcal{P}_\infty$, can we find an $\mathbf{s}\in \mathcal{P}_\mathbf{m}$ such that $\pi$ is distributed $\rho_\mathbf{s}$? It turns out that it is not that simple, however Kingman proved the following result.

Theorem: (Kingman’s Correspondence)

Let $\pi$ be an exchangeable random partition, then $\pi$ possesses asymptotic frequencies and moreover

$\mathbb{P}(\pi\in \cdot)=\int_{\mathcal{P}_\mathbf{m}} \mathbb{P}(|\pi|^\downarrow \in d\mathbf{s})\rho_\mathbf{s}(\cdot)$.

This theorem tells us that an exchangeable random partition is a mixture of paintboxes. The original proof of it by Kingman relied on a martingale argument but we shall present a proof first given by Aldous. This proof relies on de Finetti’s theorem on exchangeable random variables.

Theorem: (de Finetti)

Let $\{\xi_i\}_{i=1}^\infty$ be a sequence of exchangeable random variables, i.e. for every permutation $\sigma$, the joint distribution of $(\xi_1,\xi_2,\dots)$ is the same as $(\xi_{\sigma(1)},\xi_{\sigma(2)},\dots)$. Then there exists a random probability measure $\mu$, such that conditionally on $\mu$, the random variables $\xi_1,\xi_2,\dots$ are i.i.d. with law $\mu$.

This notion of exchangeable random variables was introduced to weaken the assumptions of being i.i.d. One can immediately see that if $\xi_i$ are i.i.d. then they are exchangeable and the theorem is trivial (the random measure is not random). The proof for the general case can be found in most textbooks and we shall skip it here. The proof for Kingman’s correspondence I give here is from Bertoin (see further reading below).

Proof of Kingman’s Correspondence:

For any partition $\pi \in \mathcal{P}_\infty$ we say that $b:\mathbb{N}\rightarrow\mathbb{N}$ is a selection map if it maps each element of a block of $\pi$ to the same point in that block. For example $b(i)$ is the infimum of the block containing $i$. Let $\{U_i\}_{i=1}^\infty$ be a sequence of i.i.d. uniform random variables, independent of everything else and define $\xi_i:=U_{b(i)}$. Then we claim that $\{\xi_i\}_{i=1}^\infty$ is an exchangeable sequence of random variables. So let $\sigma$ be a permutation and $b':=\sigma^{-1}\circ b \circ\sigma$.

On the one hand $b'$ is a selection map for $\sigma \pi$ and on the other we have that $\xi_{\sigma(i)}=U_{b(\sigma(i))}=U'_{b'(i)}$ where $U'_i=U_{\sigma(i)}$. The random variables $\{U'_i\}_{i=1}^\infty$ are i.i.d. uniforms, independent of $\sigma\pi$ and $b'$ and so have the joint laws of $(\{U_i\}_{i=1}^\infty,\pi)$ and $(\{U'_i\}_{i=1}^\infty,\sigma\pi)$ are the same. Thus the sequence $\xi_1,\dots$ is exchangeable.

By de Finetti’s theorem there exists a random measure $\mu$ on $[0,1]$ such that conditionally on $\mu$, $\xi_1,\dots$ are i.i.d. with law $\mu$. Define

$q(s):=\inf\{x \in [0,1]: \mu([0,x])\geq s\}$.

Let $\{V_i\}_{i=1}^\infty$ be an i.i.d. sequence of uniform random variables on $[0,1]$, which is independent of everything else. Then conditionally on $\mu$, the law of $(q(V_1),\dots)$ corresponds with the law of $(\xi_1,\dots)$. Now $i \buildrel \pi \over \sim j$ if and only if $\xi_i=\xi_j$, which has the same probability as $q(V_i)=q(V_j)$. This only happens when $V_i$ and $V_j$ fall in to the same flat component of $q$. More precisely let partition $[0,1]$ in to intervals $I$ where for each $x,y \in I$ we have $q(x)=q(y)$ and label the remaining intervals dust. Notice now that conditionally on $\mu$

$i \buildrel \pi \over \sim j$ if and only if $V_i$ and $V_j$ fall in to the same interval which is not the dust interval

which is a paintbox construction.

$\square$