You are currently browsing Bati’s articles.


In this post we will be looking at probabilistic coupling, which means putting two random variables (or processes) on the same probability space. This turns out to be a surprisingly powerful tool in probability and this post will hopefully brainwash you into agreeing with me.

Consider first this baby problem. You and I flip coins; let X_n denote your n-th coin flip and Y_n mine. Suppose that I have probability p_1 of landing heads and you have probability p_2 \geq p_1 of landing heads. Let

T_Z:=\inf\{n\geq 3: Z_nZ_{n-1}Z_{n-2}=HHH\}

be the first time the sequence Z sees three heads in a row. Now obviously you know that \mathbb{E}[T_X]\leq \mathbb{E}[T_Y], but can you prove it?

Of course here it is possible to compute both of the expectations and show this directly, but this is rather messy and long. Instead this will serve as a baby example of coupling.
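To give a flavour of the idea, here is a minimal sketch of one possible coupling, assuming X is the coin with heads probability p_2 and Y the coin with heads probability p_1. Both coins are driven by the same uniform random numbers, so whenever Y shows heads so does X, and hence T_X \leq T_Y on every path. The names `first_hhh` and `coupled_times` and the finite `horizon` are illustrative choices of mine, not part of the original argument.

```python
import random

def first_hhh(flips):
    """Return the 1-based index of the flip that first completes HHH, or None."""
    run = 0
    for n, is_head in enumerate(flips, start=1):
        run = run + 1 if is_head else 0
        if run == 3:
            return n
    return None  # no three heads in a row within the simulated flips

def coupled_times(p1, p2, horizon=10_000, rng=None):
    """Simulate (T_X, T_Y) with both coins driven by the same uniforms."""
    if rng is None:
        rng = random.Random()
    uniforms = [rng.random() for _ in range(horizon)]
    x = [u < p2 for u in uniforms]  # heads with probability p2
    y = [u < p1 for u in uniforms]  # heads with probability p1 <= p2
    return first_hhh(x), first_hhh(y)

if __name__ == "__main__":
    t_x, t_y = coupled_times(0.4, 0.6)
    print(t_x, t_y)  # on every run, t_x is at most t_y
```

Since p_1 \leq p_2, the event u < p_1 implies u < p_2, which is exactly why the inequality holds pathwise and not just in expectation.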


Ever wonder what happens when you send your card details over the internet? How exactly does public-key encryption work? I will write a brief, non-technical introduction to these concepts and the mathematical background behind them, containing some sample code and plenty of examples.

Before we describe the RSA algorithm, there is one important mathematical concept, which is prime numbers and their factorization. Recall that a number p is prime if no number divides p other than itself and 1. For technical reasons we exclude the number 1 from being prime. So let's see some examples. Is 3 prime? Well, yes, no number other than 3 and 1 divides it. Is 24 prime? No, because 24 = 12 \times 2.

There is a fundamental theorem in number theory which says that every integer n > 1 can be written as a product of prime numbers in exactly one way (up to the order of the factors), i.e. n = p_1^{\alpha_1} \dots p_k^{\alpha_k} where p_1, \dots, p_k are distinct primes. So again, a few examples cannot hurt. Take the number 24. We know that 24 = 12 \times 2, but now 12 is not prime so 12 = 6 \times 2 = 3 \times 2 \times 2. Hence 24 = 3 \times 2 \times 2 \times 2 = 3 \times 2^3. That’s what a prime factorisation is, and what the theorem says is pretty basic: 3 \times 2^3 is the only way to write 24 as a product of primes.
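The repeated splitting done by hand above can be sketched in a few lines of Python. This is plain trial division, chosen only because it mirrors the worked example; the function name `prime_factorisation` is my own, and this is not how large numbers are factored in practice.

```python
def prime_factorisation(n):
    """Return {prime: exponent} for an integer n >= 2, using trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = {}
    d = 2
    while d * d <= n:
        # Divide out d as many times as it goes into n.
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever is left has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factorisation(24))  # {2: 3, 3: 1}
```

The output reads as 24 = 2^3 \times 3^1, matching the factorisation found by hand.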

At this point I can state the fundamental idea behind RSA:

Factoring a number into prime factors is much harder than checking if a number is prime!

But why is this true? There are technical reasons for this but I prefer to think along the following lines. Computers are much like humans, so imagine if a human is given the task of factorising numbers and checking if numbers are prime.


The trolley dilemma is summed up in two parts as follows. Suppose that a trolley is running down a hill at a fast speed, heading towards five people at the bottom of the street. When it reaches them it will surely kill all of them. You notice that there is a switch next to you that could direct the trolley to a side path where there is one man standing, and if you pull it, it will be that one man who dies. Would you do it?

Most people would answer this question in the affirmative. Let us call this the switch scenario. The second scenario is that a trolley is again running down a hill at fast speed, aimed at five people at the bottom which it will surely kill. However this time you are standing on a bridge with a fat man next to you. If you push the fat man off the bridge the trolley will stop but kill the fat man. Would you do it?

If you haven’t read my previous post, I would suggest doing so before this. We denote by \mathcal{P}_\infty the partitions of \mathbb{N}. The thing to keep in mind here is that we want to think of a coalescent process as a history of lineage. Suppose we start with the trivial partition (\{1\},\{2\},\dots) and think of each block \{i\} as a member of some population. A coalescent process \Pi=(\Pi(t) :t \geq 0) on \mathcal{P}_\infty essentially defines ancestries, in the sense that if i and j belong to the same block of \Pi(t) for some t\geq 0, then we think of that block as the common ancestor of i and j.

With this in mind, define the operator Coag:\mathcal{P}_\infty \times \mathcal{P}_\infty \rightarrow \mathcal{P}_\infty by

Coag(\pi,\pi')_i=\bigcup_{j \in \pi'_i}\pi_j.

With some conditions, we can define the same operator on \mathcal{P}_{[n]}, the partitions of [n]:=\{1,\dots,n\}. So for example if \pi=(\{1,3,5\},\{2\},\{4\}) and \pi'=(\{1,3\},\{2\}), then Coag(\pi,\pi')=(\{1,3,4,5\},\{2\}). The partition \pi' tells us in this case to merge the first and third block and leave the second block alone.
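The finite example can be checked with a small sketch. Here a partition is represented as a list of Python sets in order of least element, and I assume (as one reading of the "conditions" mentioned above) that \pi' is a partition of the block indices 1, ..., k of \pi with no empty blocks. The function name `coag` is just a label for this illustration.

```python
def coag(pi, pi_prime):
    """Merge the blocks of pi whose 1-based indices lie in the same block of pi_prime."""
    merged = []
    for index_block in pi_prime:
        block = set()
        for j in index_block:
            block |= pi[j - 1]  # pi_j, with blocks numbered from 1
        merged.append(block)
    # Keep the convention that blocks are listed by their least element.
    return sorted(merged, key=min)

pi = [{1, 3, 5}, {2}, {4}]
pi_prime = [{1, 3}, {2}]
print([sorted(block) for block in coag(pi, pi_prime)])  # [[1, 3, 4, 5], [2]]
```

The first block of \pi' is \{1,3\}, so the first and third blocks of \pi are merged, while the second block is left alone.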


In this post we will look at random partitions. A partition \pi of \mathbb{N} is a collection of disjoint subsets \{\pi_i\}_{i=1}^\infty of \mathbb{N} such that \bigcup_i \pi_i=\mathbb{N}. We arrange these sets \pi_i in the order of their least element, so that \inf \pi_1 < \inf \pi_2< \dots, and if there are only finitely many subsets, we trail the sequence with empty sets, e.g. (\{1\}, \mathbb{N}\backslash \{1\},\emptyset,\dots). Denote the set of partitions of \mathbb{N} by \mathcal{P}_\infty.

Notice that each \pi \in \mathcal{P}_\infty induces an equivalence relation \buildrel \pi \over \sim on \mathbb{N} by i \buildrel \pi \over \sim j if and only if i and j belong to the same block of \pi. Now let \sigma be a permutation of \mathbb{N}, and we define a partition \sigma \pi by saying that i and j are in the same block of \sigma \pi if and only if \sigma(i) \buildrel \pi \over \sim \sigma(j).
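To make the action of \sigma concrete, here is a small sketch on the finite set \{1,\dots,n\} rather than on \mathbb{N}. The permutation is stored as a dictionary i -> \sigma(i), and the name `permute_partition` is hypothetical. It simply groups together those i whose images \sigma(i) fall in the same block of \pi.

```python
def permute_partition(sigma, pi):
    """Blocks of sigma*pi: i and j share a block iff sigma(i), sigma(j) share a block of pi."""
    # Map each element to the index of the block of pi containing it.
    block_of = {x: k for k, block in enumerate(pi) for x in block}
    groups = {}
    for i in sigma:
        groups.setdefault(block_of[sigma[i]], set()).add(i)
    # List the blocks in order of their least element.
    return sorted(groups.values(), key=min)

pi = [{1, 2}, {3}]
sigma = {1: 1, 2: 3, 3: 2}  # swaps 2 and 3
print([sorted(block) for block in permute_partition(sigma, pi)])  # [[1, 3], [2]]
```

Here \sigma(1)=1 and \sigma(3)=2 lie in the same block of \pi, so 1 and 3 end up together in \sigma\pi.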

Suppose now that \pi is a random partition. We say that \pi is exchangeable if for each permutation \sigma which moves only finitely many elements, we have that \sigma\pi has the same distribution as \pi.

There is a wonderful theorem by Kingman which will follow shortly but for now let us look at some basic properties of random exchangeable partitions.


There is a remarkably nice proof of the Lebesgue decomposition theorem (described below) by von Neumann. This leads immediately to the Radon-Nikodym theorem.


If \mu and \nu are two finite measures on (\Omega,\mathcal{F}) then there exists a non-negative (w.r.t. both measures) measurable function f and a \mu-null set B such that

\nu(A)=\int_A f \, d\mu+ \nu(A \cap B)

for each A \in \mathcal{F}.


Let \pi:=\mu+\nu and consider the operator

T(f):=\int f\, d\nu.

The fundamental theorem of algebra states that \mathbb{C} is algebraically closed, that is:


For any non-constant polynomial p with coefficients in \mathbb{C}, there exists a z\in \mathbb{C} such that p(z)=0.


Let B=(B_t: t \geq 0) be a Brownian motion on \mathbb{C} and suppose for a contradiction that a non-constant polynomial p does not have any zeros. Let f:=1/p, then f is analytic and tends to 0 at infinity. Pick \alpha < \beta such that \{Re f \leq \alpha\} and \{Re f \geq \beta\} each contain an open set, which can be done due to the fact that f is continuous and non-constant.

Now f(B_t) is a continuous local martingale (by using Ito’s formula) and moreover it is bounded. Hence by the martingale convergence theorem we have that f(B_t) \rightarrow f(B)_\infty a.s. and in L^1.

This last statement is contradicted by the fact that Brownian motion is recurrent on the complex plane; in particular, it visits \{Re f \leq \alpha\} and \{Re f \geq \beta\} infinitely many times, which gives that

\liminf Re f(B_t) \leq \alpha < \beta \leq \limsup Re f(B_t) a.s.

directly contradicting the martingale convergence theorem.

I found this little gem in Rogers and Williams.

The optional stopping theorem (OST) gives that if X is a martingale and T is a bounded stopping time, then \mathbb{E}[X_T]=\mathbb{E}[X_0].

Now take a Brownian motion B=(B_t:t \geq 0), which is a martingale, and the stopping time T=\inf\{t>0:B_t=1\}. Obviously B_T=1 and OST says that 1=\mathbb{E}[B_T]=\mathbb{E}[B_0]=0.

So what went wrong? Well quite simply, I did not check that T was actually bounded. Why is it not bounded you say? No? Ok, well I’m going to tell you anyway; it is not bounded because there are many paths that trail below 0 and take arbitrarily long to hit 1. To properly show this, one must first calculate the distribution of the supremum of the Brownian motion up to time t, then check that the probability that it is less than 1 is positive for every t.
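A discrete analogue makes the failure concrete. The sketch below replaces Brownian motion with a simple symmetric random walk S (my substitution, not part of the lecture), lets T be the first time S hits 1, and computes exactly both \mathbb{E}[S_{T \wedge n}] and P(T > n). The function name `stopped_walk` is hypothetical.

```python
from fractions import Fraction

def stopped_walk(n):
    """Return (E[S_{T ∧ n}], P(T > n)) for a simple symmetric walk stopped on hitting 1."""
    half = Fraction(1, 2)
    alive = {0: Fraction(1)}  # probability mass at positions not yet stopped
    stopped = Fraction(0)     # probability mass already absorbed at 1
    for _ in range(n):
        nxt = {}
        for pos, mass in alive.items():
            for step in (-1, 1):
                new = pos + step
                if new == 1:
                    stopped += mass * half
                else:
                    nxt[new] = nxt.get(new, Fraction(0)) + mass * half
        alive = nxt
    mean = stopped + sum(pos * mass for pos, mass in alive.items())
    return mean, 1 - stopped

for n in (1, 10, 100):
    mean, tail = stopped_walk(n)
    print(n, mean, float(tail))
```

For the bounded stopping time T \wedge n the mean stays at 0, exactly as the OST promises, while P(T > n) is strictly positive for every n, so no fixed n bounds T and the OST cannot be applied to T itself.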

A rather nice reminder to always check that the stopping times are bounded before applying the OST. This was provided in a lecture by Nathanael Berestycki.