No. 120: Nov-Dec 1998
Back in 1881, Simon Newcomb, the renowned Canadian-American scientist, published a provocative conjecture that was promptly forgotten by everyone. Newcomb had noticed that the books of logarithms in libraries were always much dirtier at the beginning. Hmmm! Were his fellow scientists looking up the logarithms of numbers beginning with 1 more frequently than those of numbers beginning with 2, 3, etc.? It certainly seemed like it. He formalized his suspicions in a conjecture:
p = log10(1 + 1/d)
where p is the probability that the first significant digit is d (d = 1, 2, ..., 9) and log10 is the base-10 logarithm.
This (unproven) equation states that about 30% of the numbers in a table or group will begin with 1. Only about 4.6% will begin with 9. This result certainly clashes with our expectation that the nine digits should occur with equal probability.
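The percentages quoted above follow directly from the formula. As a minimal sketch (the function name first_digit_probability is ours, chosen only for illustration), the following Python tabulates the conjectured probability for each leading digit:

```python
import math


def first_digit_probability(d):
    """Conjectured probability that the first significant digit is d (1-9)."""
    if not 1 <= d <= 9:
        raise ValueError("d must be a digit from 1 to 9")
    return math.log10(1 + 1 / d)


# Probability for every possible leading digit.
probabilities = {d: first_digit_probability(d) for d in range(1, 10)}

if __name__ == "__main__":
    for d, p in probabilities.items():
        print(f"{d}: {p:.1%}")
```

The nine values sum to 1, because the product of (d + 1)/d for d = 1 to 9 is 10, and log10(10) = 1.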
Fifty-seven years later, F. Benford, a GE physicist, unaware of Newcomb's paper, observed the same dirty early pages in the logarithm tables. He came up with exactly the same conjecture. Benford didn't stop there. He spent several years collecting diverse data -- 20,229 observations, to be exact. He included baseball statistics, atomic weights, river areas, the numbers appearing in Reader's Digest articles, etc. He concluded that his (and Newcomb's) conjecture fit his data very well. There were notable exceptions, though. Telephone directories and square-root tables didn't support the conjecture.
Interestingly, the second digits in numbers are more equitably distributed; the third, even more so.
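The item gives no formula for the second digit. A commonly stated extension of the conjecture, assumed here rather than taken from the item, sums over every possible first digit k: P(second digit = d) = sum over k = 1..9 of log10(1 + 1/(10k + d)). A rough Python sketch under that assumption (the function name is hypothetical) shows how much flatter the second-digit distribution is:

```python
import math


def second_digit_probability(d):
    """Probability that the second significant digit is d (0-9).

    Assumes the standard extension of the first-digit formula:
    sum over first digits k of log10(1 + 1/(10k + d)).
    """
    if not 0 <= d <= 9:
        raise ValueError("d must be a digit from 0 to 9")
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


second_digit = {d: second_digit_probability(d) for d in range(10)}

if __name__ == "__main__":
    for d, p in second_digit.items():
        print(f"{d}: {p:.1%}")
```

Under this assumption the second-digit probabilities run from roughly 12% for 0 down to about 8.5% for 9, a far narrower spread than the 30% to 4.6% range for first digits.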
Mathematicians have never been able to prove the Newcomb-Benford conjecture. How could they if it doesn't apply to all tables? Nevertheless, it works for most data sets, and that's still hard enough to explain.
(Hill, T.P.; "The First Digit Phenomenon," American Scientist, 86:358, 1998.)