Saturday, July 21, 2012

Density of Math language

I'm sure this is a common problem among mathematicians reading mathematical text. If we read a bit too fast we glaze over. This also happens when reading novels and other text but with math, it happens much quicker. Of course, this has to be because math is denser than the English language. Unravel the meaning of any mathematical expression and it becomes clear. The solution to the problem is quite simple - we need to spend more time reading the mathematical expressions themselves. This post could easily end here but it’s more interesting to explore further.

There are a few ways to read mathematical formulas. The first and most obvious is to read them the way we would say them if we were dictating to someone or the way we would input them into a computer program, or $\LaTeX$. Then we might read them the way we would communicate them to a fellow mathematician, presumably short-handed somewhat. Finally, we might combine math words into more complicated objects which would themselves be treated as single words in our brain. Let’s try an example – the Fundamental Theorem of Calculus. We have

$$ \int_a^b f'(x) dx = f(b) - f(a). $$
Here are different ways we can read this:

- The integral of the derivative of f with respect to x from a to b equals f of b minus f of a (23 words)
- Integral, a to b, of f prime of ex dee ex equals f of b minus f of a (19 words)
- Integral, a to b of f prime equals f b minus f a (13 words)
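The three word counts can be checked mechanically. Here is a minimal Python sketch, assuming a "word" is any whitespace-separated token once commas are dropped; the helper name `count_words` is made up for illustration.

```python
def count_words(reading: str) -> int:
    """Count whitespace-separated words after dropping commas."""
    return len(reading.replace(",", " ").split())


# The three spoken readings of the Fundamental Theorem of Calculus.
readings = [
    "The integral of the derivative of f with respect to x "
    "from a to b equals f of b minus f of a",
    "Integral, a to b, of f prime of ex dee ex equals f of b minus f of a",
    "Integral, a to b of f prime equals f b minus f a",
]

counts = [count_words(r) for r in readings]
print(counts)  # [23, 19, 13]
```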

We can become very picky with the way we assign a number of words to a mathematical expression. One could simply look at how the $\LaTeX$ source of a formula is written and assign one word to each symbol or operation. For example, the expression above is written \int_a^b f'(x) dx = f(b) - f(a) in $\LaTeX$. Counting \int, _, a, ^, b, f, ', (), x, dx, =, - as one word each, we arrive at 19 words, which is the same as the second reading. I will use a mix of this approach and the one with which we would communicate to others.

As an example, we will take the proof of a theorem from an analysis textbook and count the number of math words in it. It is the proof of theorem 17.1, page 148 of “Analysis with Introduction to Proof” by Steven R. Lay, $3^{rd}$ edition.

To get an idea for how my counting works, here are a few examples. 

"$ \lim (s_n + t_n) = s + t $" is read as "limit of s sub n plus t sub n equals s plus t", which is 13 words. 

"$|s_n t_n - s t| = |(s_n t_n - s_n t) + (s_n t - s t)|$" is read as “absolute value of s sub n times t sub n minus s times t, equals, absolute value of bracket s sub n times t sub n minus s sub n times t plus bracket s sub n times t minus s times t”, which has 43 words. 

"$|t_n - t| < \epsilon/2M$" is read as “absolute value of t sub n minus t, is less than epsilon over two em”, which has 15 words. 

The process is tedious: one has to go line by line, tallying up all the numbers. The results are striking. There are only 160 English words in the entire proof, which goes on for more than a page. There are about 350 in-line math words hidden inside mathematical symbols and about 450 such words off-line, for a total of 800. Coincidentally, this is exactly 5 times the number of English words in the proof. It's becoming clear why we can easily glaze over when we read a proof like this.

We can take this idea further. Each page seems to have about 45 lines of text. I scanned about 20 different lines of text to count the number of words in each line and estimated that the average number of words in this math textbook is about 13.92 per line. This comes to about 626.4 words per page of block text, on average. The proof takes about 1.2 pages of text. If we accumulate the English words into a block of text, we would have about 0.26 pages of text. If we do the same for the math words, we would get 1.27 pages. The total for the proof is about 1.53 pages of solid text, which is only somewhat longer than the actual proof, and the difference is due to formatting. Off-line formulas take up a lot of space but the actual symbols are dense. I propose the following rule of thumb: "Treat the empty space around math expressions as text. Pay that much more attention to these areas."
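The page estimate is simple arithmetic and can be reproduced in a few lines. This is a rough Python sketch using only the figures quoted above; the variable names are my own.

```python
LINES_PER_PAGE = 45      # approximate lines of text per page
WORDS_PER_LINE = 13.92   # average estimated from about 20 sampled lines
ENGLISH_WORDS = 160      # English words in the proof
MATH_WORDS = 800         # words hidden in the mathematical expressions

words_per_page = LINES_PER_PAGE * WORDS_PER_LINE
english_pages = ENGLISH_WORDS / words_per_page
math_pages = MATH_WORDS / words_per_page
total_pages = english_pages + math_pages

print(f"words per page: {words_per_page:.1f}")
print(f"English block:  {english_pages:.2f} pages")
print(f"math block:     {math_pages:.2f} pages")
print(f"total:          {total_pages:.2f} pages")
```

Rounded to two decimals the math block comes out nearer 1.28 than 1.27, but the total of about 1.53 pages is unchanged.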

Just for fun, define the mathematical word index of a proof to be the number $r$

$$ r = \frac{m}{m+w} $$

where $m$ is the total number of words hidden in mathematical expressions over the work we are looking at, and $w$ is the total number of actual English words in the same section. For this proof, $r \approx 0.83$. This proof is 83 percent math. Well, actually, the proof is 100 percent mathematics but this is beside the point.
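As a small sketch, the index can be written as a Python function. The name `math_word_index` is hypothetical, and the inputs are the counts for the Theorem 17.1 proof quoted earlier.

```python
def math_word_index(m: int, w: int) -> float:
    """Return r = m / (m + w) for m math words and w English words."""
    if m + w == 0:
        raise ValueError("need at least one word to compute the index")
    return m / (m + w)


# Counts for the proof discussed above: 800 math words, 160 English words.
r = math_word_index(800, 160)
print(round(r, 2))  # 0.83
```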

Let’s look at another proof just for fun. Here is a proof of the density of the rationals inside the real numbers, using the Archimedean property of the natural numbers.

12.12 Theorem (Density of $\mathbb{Q}$ in $\mathbb{R}$) If $x$ and $y$ are real numbers with $x < y$, then there exists a rational number $r$ such that $x<r<y$.

Proof: We begin by supposing that $x > 0$. Using the Archimedean property, there exists an $n \in \mathbb{N}$ such that $n > 1/(y-x)$. That is, $nx+1 < ny$. Since $nx > 0$, it is not difficult to show (Exercise 12.9) that there exists $m \in \mathbb{N}$ such that $m - 1 \leq nx < m$. But then $m \leq nx + 1 < ny$, so that $nx < m < ny$. It follows that the rational number $r = m/n$ satisfies $x < r < y$.

Finally, if $x \leq 0$, choose an integer $k$ such that $k > |x|$. Then apply the argument above to the positive numbers $x+k$ and $y+k$. If $q$ is a rational satisfying $x+k < q < y+k$, then the rational $r = q - k$ satisfies $x < r < y$. QED


Alright. Going line by line (of the blog entry), the number of English words is

$$ w = 10 + 9 +12 + 6 +5 + 11 + 11 + 4 = 68. $$

The number of mathematical words is

$$\begin{aligned} m &= 4 + (3 + 9 + 10 + 6) + 3 + (12 + 15 + 9) + (5 + 7) \\ &\quad\quad+ (6 + 6) + 6  + (11 + 5  +7) \\ &=  124\end{aligned}$$

where in the brackets I've counted the words in each expression on the line to make fewer mistakes. This gives us a ratio of $r = 124 / (124 + 68) \approx 0.6458$, or roughly 65 percent.
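Tallies like these are easy to get wrong by hand, so here is a quick Python check of the sums and the resulting ratio. The per-line numbers are copied from the two displays above; the list names are my own.

```python
# English words per line of the proof, as tallied above.
english_per_line = [10, 9, 12, 6, 5, 11, 11, 4]

# Math words per line; each inner list holds the expressions on one line.
math_per_line = [
    [4], [3, 9, 10, 6], [3], [12, 15, 9],
    [5, 7], [6, 6], [6], [11, 5, 7],
]

w = sum(english_per_line)
m = sum(sum(group) for group in math_per_line)
r = m / (m + w)

print(w, m, round(r, 4))  # 68 124 0.6458
```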
