Recording the World

They could listen to everything you say and keep it forever.

I speak not as a rabble-rouser but as a guy who uses speech technology in his research.  We have the technical capability to record every phone call in the world and store it on disk for later access.  Once it is on disk, we have the capability to sift through it to find conversations containing interesting words or phrases.

After that, it becomes a matter of politics, patriotism and queasiness.  One hopes (and it is largely true in civilized countries) that nothing much happens.  Someone might listen to it and decide that you’re not really a terrorist and that’s the end.  Or, maybe your name might persist in some database somewhere and you might lose your chance at some job that requires a security clearance.

But even in civilized countries, information occasionally gets passed under the table and the police or the FBI get told “Take a close look at X and see if there is anything he’s doing that’s illegal,” with the implication that someone would be pleased if it were true.  And, of course, in less civilized places (like Iran, these days), paramilitary thugs could get the word, and you would be dragged off.  Given that leading Iranian clerics have called for the repression of and even the execution of protestors, the thug option should not surprise anyone.

But I don’t want to focus on the politics.  I want to focus on the technology and the numbers.

Recording your life.

All phone calls have been digitized since 1980 or thereabouts.  Digital transmission gives much better sound quality and uses the network more efficiently.  So, capturing your telephone speech is a solved problem.

But what about storing it?  Suppose you are on a fairly typical cellular/mobile phone calling plan with 200 minutes per month of talk time.  That means that, over your whole life (say 75 years of calling), you’d talk for about 3000 hours.  Just to make the problem more challenging, we’ll also toss in a wired phone with 500 minutes per month, which adds another 7500 hours.  Overall, that means you produce about 10,000 hours of digitized speech over your life.  In terms of data, that’s 288 billion bytes, or 268 gigabytes if you count in binary units.  (Standard telephone-quality speech is 8000 samples per second at 1 byte per sample; you can check my math.)  I probably don’t need to point out that on your computer at home you have about that much storage, if not more.
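For anyone who does want to check the math, here is a minimal sketch of the calculation.  The 75-year span of calling is my assumption (it is what makes 200 minutes per month come out to 3000 hours), and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the lifetime storage figure.
# Assumption (not a measured value): about 75 years of phone use.
YEARS_OF_CALLING = 75
MOBILE_MIN_PER_MONTH = 200
LANDLINE_MIN_PER_MONTH = 500
SAMPLES_PER_SECOND = 8000   # standard telephone-quality speech
BYTES_PER_SAMPLE = 1


def lifetime_hours(minutes_per_month, years=YEARS_OF_CALLING):
    """Total hours of talk time over a lifetime of calling."""
    return minutes_per_month * 12 * years / 60


def speech_bytes(hours):
    """Uncompressed size of that many hours of telephone speech."""
    return hours * 3600 * SAMPLES_PER_SECOND * BYTES_PER_SAMPLE


mobile_hours = lifetime_hours(MOBILE_MIN_PER_MONTH)
landline_hours = lifetime_hours(LANDLINE_MIN_PER_MONTH)
rounded_total_bytes = speech_bytes(10_000)  # the round 10,000-hour figure

print(f"mobile:   {mobile_hours:.0f} hours")
print(f"landline: {landline_hours:.0f} hours")
print(f"10,000 hours = {rounded_total_bytes / 1e9:.0f} GB (decimal)")
print(f"10,000 hours = {rounded_total_bytes / 2**30:.0f} GB (binary)")
```

The two printed sizes are the same quantity in different units: 288 if a gigabyte is a billion bytes, 268 if it is 2 to the power 30 bytes.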

As of this month, you can buy 1 terabyte (1000 gigabyte) disks for about £50 (or $80), which means you can store all of your own telephone speech on about $25 worth of disk.  Two-terabyte disks are getting cheaper now, so next year you’ll be able to do it for about half that.
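The cost estimate is simple proportionality, sketched below.  It assumes that price scales linearly with capacity and that next year's price per terabyte is half of this year's; both are simplifications, and the names are illustrative only.

```python
# Rough disk cost for a given amount of recorded speech.
# Assumption: price scales linearly with capacity, using the
# $80-per-terabyte figure quoted above (1 TB taken as 10**12 bytes).
USD_PER_TERABYTE = 80.0


def disk_cost_usd(data_bytes, usd_per_terabyte=USD_PER_TERABYTE):
    """Cost of enough disk to hold data_bytes at the given price."""
    return data_bytes / 1e12 * usd_per_terabyte


lifetime_bytes = 10_000 * 3600 * 8000   # 10,000 hours at 8000 bytes/second
cost_now = disk_cost_usd(lifetime_bytes)
cost_if_price_halves = disk_cost_usd(lifetime_bytes, USD_PER_TERABYTE / 2)

print(f"at $80/TB: ${cost_now:.2f}")
print(f"at $40/TB: ${cost_if_price_halves:.2f}")
```

At $80 per terabyte the uncompressed lifetime of speech comes to a little over $20, and to a little over $10 if the price halves.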

In fact, it’s even cheaper than that.  Speech is compressible because the same speech sounds appear again and again.  One can squeeze out that redundancy and store the data more compactly.  For instance, vowel sounds are made by the vocal folds in the larynx vibrating periodically, opening and closing about 100 times per second.  Each of these openings produces a similar puff of air and a similar bit of sound.  So, you can do a fairly good job of describing a vowel by simply describing one of these puffs in detail, and then simply reporting the number of similar puffs.
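The "describe one puff, then count the repeats" idea is essentially run-length encoding.  Here is a toy sketch of it, with the caveat that real speech coders are considerably more sophisticated: successive puffs are similar rather than identical, so a real coder has to model the small differences as well.  The frame labels below are placeholders, not real audio.

```python
from itertools import groupby


def run_length_encode(frames):
    """Collapse each run of identical items into an (item, count) pair."""
    return [(item, sum(1 for _ in run)) for item, run in groupby(frames)]


def run_length_decode(pairs):
    """Expand (item, count) pairs back into the original sequence."""
    return [item for item, count in pairs for _ in range(count)]


# Placeholder "sounds": a consonant, a vowel made of 24 similar puffs,
# and another consonant.
frames = ["s"] * 3 + ["puff"] * 24 + ["t"] * 2
encoded = run_length_encode(frames)

print(len(frames), "frames stored as", len(encoded), "pairs")
print(encoded)
```

Decoding the pairs with run_length_decode gives back the original sequence, so nothing is lost in this idealized case.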

Recording “this sound repeated 24 times” is a lot more compact than recording “this sound, this sound, this sound, this sound, this sound, this sound, …, this sound.”  So, in fact, it’s not 268 gigabytes, but several times smaller than that.  Probably 80 GB would suffice, which is what you get on an old, cheap laptop these days.  And, if bought in quantity, 80 GB of disk storage will only cost about $6 this year and $3 next year.  Of course, there are other costs: electricity, a place to put the disk, et cetera, but none of them is huge.  Electricity, for a year, is under a dollar.

Now, how much do you pay in taxes?   Rather more than $6?

Actually recording everyone in a nation would pose some technical challenges, but nothing that a competent engineer couldn’t solve.  The biggest problem might just be providing air conditioning for the racks and racks of disks.  (These days, the best way of finding a large data center is looking for the large air conditioning units on the top.)  Separating telephone speech from internet data streams is not hard — each data stream is labelled.  Handling  10 million phone calls at once isn’t hard either — you just split them into groups of 1000 and send each group to its own disk.
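To put numbers on that last step, here is a small sketch of the split.  It assumes uncompressed telephone speech at 8000 bytes per second per call and assigns calls to disks by simple integer division; the function names are hypothetical, and a real system would also need buffering so that each disk is not seeking between 1000 interleaved streams.

```python
CALLS_PER_DISK = 1000
BYTES_PER_SECOND_PER_CALL = 8000   # 8000 samples/second, 1 byte each


def disk_for_call(call_index, calls_per_disk=CALLS_PER_DISK):
    """Which disk (numbered from 0) a given concurrent call is sent to."""
    return call_index // calls_per_disk


def disks_needed(concurrent_calls, calls_per_disk=CALLS_PER_DISK):
    """Number of disks required, rounding up for a partial last group."""
    return -(-concurrent_calls // calls_per_disk)


def write_rate_per_disk(calls_per_disk=CALLS_PER_DISK):
    """Bytes per second each disk must absorb when its group is full."""
    return calls_per_disk * BYTES_PER_SECOND_PER_CALL


print("disks for 10 million calls:", disks_needed(10_000_000))
print("write rate per disk:", write_rate_per_disk() / 1e6, "MB/s")
```

Each disk only has to absorb about 8 megabytes per second, which is a modest sequential write rate for a commodity hard disk.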

So, there it is.  Everyone’s voice could be recorded and stored in perpetuity.  What gets done with it then becomes a political question.