Prime Numbers and Benford's Law
I have to admit, I'm a little confused by all of the coverage of a recent paper.
- The title of the paper: "The first digit frequencies of primes and Riemann zeta zeros tend to uniformity following a size-dependent generalized Benford’s law"
- The authors: Bartolo Luque and Lucas Lacasa
- The part of the paper that everyone is talking about: "Benford’s law... describes with astonishing precision the statistical distribution of leading digits in the prime numbers sequence."
See this post for more info on everything (and a nice plot at the end).
The following is why I'm surprised that some people seem to think that this is significant, or that it will change the face of public-key crypto (which relies heavily on the properties of primes).
- We know that the number of primes up to x is estimated as x/ln(x).
- Benford's law describes a logarithmic distribution (note that the prime counting estimate above uses a logarithm).
- So why would it shock people that a logarithmic distribution would follow a logarithmic distribution?
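For reference, the two logarithmic forms being compared can be written out as below. The symbols are notation introduced here rather than taken from the post: \pi(x) is the number of primes up to x, a and b are the endpoints of a range, and d is a leading digit. The Benford formula shown is the classic version; the paper's title refers to a generalized one.

```latex
\pi(x) \approx \frac{x}{\ln x},
\qquad
\#\{\text{primes in } (a, b]\} \approx \frac{b}{\ln b} - \frac{a}{\ln a},
\qquad
P_{\text{Benford}}(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \quad d = 1, \dots, 9
```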
For instance, let's limit ourselves to numbers that are 6 digits long. Here are the estimates for the primes in the following ranges, each taken as the difference of x/ln(x) at the two endpoints:
- 100,000 - 200,000 : 7,699 primes that start with "1"
- 200,000 - 300,000 : 7,402 primes that start with "2"
- 300,000 - 400,000 : 7,221 primes that start with "3"
- 400,000 - 500,000 : 7,093 primes that start with "4"
- 500,000 - 600,000 : 6,994 primes that start with "5"
- 600,000 - 700,000 : 6,913 primes that start with "6"
- 700,000 - 800,000 : 6,846 primes that start with "7"
- 800,000 - 900,000 : 6,788 primes that start with "8"
- 900,000 - 1,000,000 : 6,737 primes that start with "9"
Notice how the number of primes keeps decreasing? This is a well understood mathematical property of primes that has been known since the 18th century.
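If you sum up all of those prime counts you get 63,696 (using the unrounded estimates; the counts listed above are truncated to whole numbers and add up to 63,693), and if you work out the frequencies for leading digits 1 through 9 you get: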
- 0.120876251875
- 0.116214422945
- 0.113379591254
- 0.111360316253
- 0.109801974838
- 0.108538831213
- 0.107480273443
- 0.106571485379
- 0.1057768528
These are very close to the numbers in http://pyevolve.sourceforge.net/wordpress/wp-content/uploads/2009/05/prime_plot.png.
I wrote a script to calculate this for numbers of arbitrary length (e.g. all 4 digit numbers, or 20 digit numbers, or whatever you want). I ran it for all numbers up to 9 digits in length, just like the above plot, and the numbers I got were:
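The six-digit arithmetic above can be reproduced with a few lines of Python. This is only a sketch under the assumption stated earlier, namely that the number of primes up to x is estimated as x/ln(x); the names `prime_count_estimate`, `lows`, `counts`, and `frequencies` are made up for this illustration.

```python
import math


def prime_count_estimate(x):
    """Estimate the number of primes <= x as x / ln(x)."""
    return x / math.log(x)


# Lower edges of the nine six-digit ranges: 100,000, 200,000, ..., 900,000.
lows = list(range(100_000, 1_000_000, 100_000))

# Estimated primes in each range = estimate at upper edge - estimate at lower edge.
counts = [prime_count_estimate(lo + 100_000) - prime_count_estimate(lo) for lo in lows]

total = sum(counts)
frequencies = [c / total for c in counts]

for lo, c, f in zip(lows, counts, frequencies):
    leading_digit = lo // 100_000
    # int() truncates, which matches the whole-number counts in the list above.
    print(f"{leading_digit}: {int(c):>5} primes, frequency {f:.6f}")
```

The frequencies are computed from the unrounded estimates, which is why they line up with a total of about 63,696 rather than with the sum of the truncated counts.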
- 0.118536354209
- 0.115044221864
- 0.112888101169
- 0.111337239649
- 0.110131769184
- 0.109149098378
- 0.10832172915
- 0.107608595766
- 0.106982890632
These are almost exactly the same as the plot from PyEvolve's post.
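The original script is not shown here, so the following is not that script. It is a hedged sketch of the same idea for a single, arbitrary digit length, again assuming the x/ln(x) estimate. The function name `leading_digit_frequencies` is hypothetical. The post does not say how results for different lengths were combined into "up to 9 digits", so this sketch does not attempt to reproduce the numbers listed above; it only shows how the frequencies behave as the length changes.

```python
import math


def prime_count_estimate(x):
    """Estimate the number of primes <= x as x / ln(x)."""
    return x / math.log(x)


def leading_digit_frequencies(num_digits):
    """Estimated leading-digit frequencies (digits 1..9) for primes with
    exactly num_digits digits, based on the x / ln(x) estimate."""
    if num_digits < 2:
        # ln(1) == 0, so the estimate is undefined at the start of the 1-digit range.
        raise ValueError("num_digits must be at least 2")
    base = 10 ** (num_digits - 1)
    counts = [
        prime_count_estimate((d + 1) * base) - prime_count_estimate(d * base)
        for d in range(1, 10)
    ]
    total = sum(counts)
    return [c / total for c in counts]


for n in (4, 6, 9, 20):
    freqs = leading_digit_frequencies(n)
    print(n, [round(f, 4) for f in freqs])
```

In this sketch the frequencies fall from digit 1 to digit 9 for every length, and the spread narrows as the length grows, which is consistent with the "tend to uniformity" wording in the paper's title.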
Here is the plot for the numbers my script calculates:
This plot was generated using the prime counting estimate discovered in the 18th century. What am I missing about this paper, released in 2009, that deserves all the attention? (That's not a rhetorical question; if you know, I'd like to know too.)
