Twitter Infinite

Web comic XKCD is great for lots of laughs, and a little learning too. If you want a little less laughing, but lots more learning, there’s always the weekly XKCD What-If. This week it answers the eternal question, “How long would it take to read every possible English language tweet?” First you need to figure out how many valid English tweets there are. The answer is surprisingly complicated.

If you just look at the raw numbers, you have 140 characters per tweet. There are 26 English letters, plus the space. So with 27 characters, you come up with about 10^200 possible strings of letters and spaces. But then there is Unicode to deal with, which brings the total number up to about 10^800. That is a 1 followed by 800 zeros. Sound insane? Not to fear, most of these strings are meaningless nonsense. To get to the truly valid English tweets, we have to apply some information theory.
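As a quick sanity check on the letters-and-spaces figure, here is a small Python sketch. The constant and variable names are made up for illustration, and it only covers the 27-character case, not Unicode.

```python
# Count the strings of 140 characters drawn from 26 letters plus the space.
TWEET_LENGTH = 140    # characters per tweet, as stated in the article
ALPHABET_SIZE = 27    # 26 English letters plus the space

# Python integers have arbitrary precision, so the exact count is available.
possible_strings = ALPHABET_SIZE ** TWEET_LENGTH

# The power of ten is one less than the number of decimal digits.
exponent = len(str(possible_strings)) - 1

print(f"27^140 is roughly 10^{exponent}")
```

Because the count is computed exactly, the exponent comes from counting digits rather than from a floating-point logarithm.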

To estimate how many valid English tweets there are, we can use the amount of information each letter carries, averaged over large samples of text. In the mid-20th century, Claude Shannon pioneered several key concepts in information theory. Among his contributions was the estimate that, on average, each letter of written English carries about 1.0 to 1.2 bits of information. This was based on having test subjects guess blanked-out letters in a sentence. It sounds bizarre, but the compression ratio he predicted holds up when tested.

So where does that leave the math, according to XKCD? In a piece of text with n bits of information, there are 2^n different messages possible. So 2^(140 × 1.1) comes to about 2 × 10^46 different English tweets. This calculation is based on unicity distance, a principle in cryptography for predicting variations. Still a big number, but much less than the 10^800 figure from before.
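The arithmetic behind that estimate can be reproduced in a few lines of Python. This is only a sketch: it assumes 1.1 bits per letter (the midpoint of Shannon's range), and the variable names are illustrative.

```python
import math

TWEET_LENGTH = 140       # characters per tweet
BITS_PER_LETTER = 1.1    # assumed midpoint of the 1.0 to 1.2 range

# Total information in one tweet, in bits (about 154).
total_bits = TWEET_LENGTH * BITS_PER_LETTER

# 2^total_bits expressed as mantissa x 10^exponent, working in log space.
log10_count = total_bits * math.log10(2)
exponent = math.floor(log10_count)
mantissa = 10 ** (log10_count - exponent)

print(f"about {mantissa:.1f} x 10^{exponent} English tweets")
```

Working with logarithms keeps the calculation readable and makes it easy to try other values of bits per letter, since small changes there move the result by several orders of magnitude.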

Back to the original question: how long would it take to read all of them? Just 10^47 seconds, which is 3 × 10^39 years. But that’s just if you read straight through! Let’s say you work a 16 hour day reading tweets out loud before retiring for a night of fitful sleep. At that rate, the heat death of the universe would have happened several duodecillion years before you finished the task.
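The seconds-to-years conversion is easy to check. The sketch below assumes a 365.25-day year and nonstop reading; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumes a 365.25-day year


def seconds_to_years(seconds: float) -> float:
    """Convert a duration in seconds to years of continuous reading."""
    return seconds / SECONDS_PER_YEAR


reading_seconds = 1e47   # total reading time quoted in the article
years = seconds_to_years(reading_seconds)

print(f"{years:.1e} years")
```

Reading only 16 hours a day would stretch this by a factor of 24/16, or 1.5.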

XKCD creates a handy new unit of time to make this easier to grasp. Imagine each day was 10^32 years long (again, that’s a 1 followed by 32 zeros). It would take ten thousand years made up of these super-long days to finish reading all those tweets. In short: lots of tweets and lots of years. It’s math!
