## Wednesday, 11 March 2015

### Lecture 24

The overhead slides I used for our last lecture are here (best downloaded and viewed with a PDF reader, rather than as a web page).

I showed you a bit of large deviation theory and Chernoff's upper bound. The lower bound was quoted without proof (except that in Example 24.1 we did the lower bound for the special case of $B(1,p)$). If some time mid-summer you have a moment and are curious to see the lower bound proved (you have learned enough to understand the proof), and to see more about large deviations, see these slides from a talk I once gave, pages 5–9. This note on the Cramér-Chernoff theorem is also good. Interestingly, Chernoff says in an interview with John Bather that the Chernoff bound should really be named after someone else!
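
As a reminder of the shape of these results, here is a sketch in notation that may differ from the notes (the symbols $S_n$, $a$, $M$ and $I$ are my choice for this summary). Let $X_1, X_2, \dots$ be i.i.d. with moment generating function $M(\theta) = E\,e^{\theta X_1}$, and let $S_n = X_1 + \cdots + X_n$. Applying Markov's inequality to $e^{\theta S_n}$ and then optimizing over $\theta \ge 0$ gives

$$P(S_n \ge na) \le e^{-\theta n a} M(\theta)^n \quad\Longrightarrow\quad P(S_n \ge na) \le e^{-n I(a)}, \qquad I(a) = \sup_{\theta \ge 0}\,\bigl[\theta a - \log M(\theta)\bigr].$$

For $B(1,p)$ we have $M(\theta) = 1 - p + p e^{\theta}$, and for $p < a < 1$ the supremum is attained at $e^{\theta} = \dfrac{a(1-p)}{p(1-a)}$, so that

$$I(a) = a \log\frac{a}{p} + (1-a)\log\frac{1-a}{1-p}.$$

The lower bound says that, in this case, the exponent cannot be improved: $\frac{1}{n}\log P(S_n \ge na) \to -I(a)$ as $n \to \infty$.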

I have used large deviation theory in my research to analyse buffer overflows in queues. But I know almost nothing about the subject of random matrices. I prepared the content for Section 24.2 because I was intrigued by the results, wanted to learn, and thought it would be something entertaining with which to conclude. I did some of my reading in Chapter 2 of An Introduction to Random Matrices, by Anderson, Guionnet and Zeitouni, and then simplified their treatment to make it suitable for IA.

I thought it was nice that in showing you these two advanced topics, I could bring into play so many of the ideas we have had in our course: Markov and Chebyshev inequalities, moment generating functions, sums of Bernoulli r.vs, Stirling’s formula, the normal distribution, gambler’s ruin, Dyck words, generating functions, and the Central limit theorem.

So I hope you will consider the applicable courses in your next year of study. In Appendix H I have written some things about applicable mathematics courses in IB.

## Monday, 9 March 2015

### Lecture 23

Lévy's continuity theorem is the same thing as the continuity theorem given in today's lecture, but for characteristic functions.

Here is a little more history about the Central Limit Theorem.

Henk Tijms writes in his book, Understanding Probability: Chance Rules in Everyday Life, Cambridge: Cambridge University Press, 2004,

"The Central Limit Theorem for Bernoulli trials was first proved by Abraham de Moivre and appeared in his book, The Doctrine of Chances, first published in 1718. He used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. This finding was far ahead of its time, and was nearly forgotten until the famous French mathematician Pierre-Simon Laplace rescued it from obscurity in his monumental work Théorie Analytique des Probabilités, which was published in 1812.

De Moivre spent his years from age 18 to 21 in prison in France because of his Protestant background. When he was released he left France for England, where he worked as a tutor to the sons of noblemen. Newton had presented a copy of his Principia Mathematica to the Earl of Devonshire. The story goes that, while de Moivre was tutoring at the Earl's house, he came upon Newton's work and found that it was beyond him. It is said that he then bought a copy of his own and tore it into separate pages, learning it page by page as he walked around London to his tutoring jobs. De Moivre frequented the coffeehouses in London, where he started his probability work by calculating odds for gamblers. He also met Newton at such a coffeehouse and they became fast friends. De Moivre dedicated his book to Newton."

The Wikipedia article on the Central limit theorem mentions two things that would be suitable for the television programme QI.

1. There is a quite interesting explanation of why the term "Central" is used.

The actual term "central limit theorem" (in German: "zentraler Grenzwertsatz") was first used by George Pólya in 1920 in the title of a paper. Pólya referred to the theorem as "central" due to its importance in probability theory. According to Le Cam, the French school of probability interprets the word central in the sense that "it describes the behaviour of the centre of the distribution as opposed to its tails".

Personally, I had always thought it was the second of these, but the first is also plausible.

2. There is a quite interesting Cambridge connection.

A proof of a result similar to the 1922 Lindeberg CLT was the subject of Alan Turing's 1934 Fellowship Dissertation for King's College at the University of Cambridge. Only after submitting the work did Turing learn it had already been proved. Consequently, Turing's dissertation was never published.

## Friday, 6 March 2015

### Lecture 22

You might like to experiment with the bivariate normal. Here is the Mathematica code that I used to plot the joint density function. You can copy and paste this code into Mathematica and then play with viewing from different angles or changing the value of the correlation, r.

    S = ParallelTable[
      Plot3D[
        PDF[MultinormalDistribution[{1, 1}, {{1, r}, {r, 1}}], {x, y}],
        {x, -3, 4}, {y, -3, 4}, PlotRange -> All,
        MeshFunctions -> {#3 &}, PlotPoints -> 50],
      {r, {0, 0.6}}]


You can also get the code here, and for Kelly betting here.
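
For reference, the surface being plotted is the bivariate normal density with both means equal to 1, both variances equal to 1, and correlation $r$ (these values are read off from the arguments of `MultinormalDistribution` in the code). Written out as a sketch using the standard formula, for $|r| < 1$,

$$f(x,y) = \frac{1}{2\pi\sqrt{1-r^2}} \exp\left(-\frac{(x-1)^2 - 2r(x-1)(y-1) + (y-1)^2}{2(1-r^2)}\right).$$

With $r = 0$ this factorizes into the product of two $N(1,1)$ densities and the contours are circles centred at $(1,1)$. With $r = 0.6$ the contours are ellipses elongated along the line $y = x$.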