Of the building blocks of computer systems, we have so far covered processors and memory. We have seen that processors have become massively faster and that memory has become massively cheaper. Today we will learn about storage and find that it has become massively bigger. Throughout this discussion we will use a somewhat eclectic reference point: the length of that great American novel “Moby Dick”, which clocks in at 1,203,686 bytes (counted off the text version on DailyLit, where you can read it via email in 260 installments) or about 1.15 MB (where 1 MB is 1 Mega Byte or 1,048,576 = 2^20 Bytes).
The goal of storage, as compared to memory, has always been to store many more bits, trading slower access for higher capacity and much lower cost. In the early days of computers that meant punch cards, which actually go back as far as textile looms in the 18th century. Here the trade-off is quite extreme. The cards were very cheap, essentially the cost of paper (with some markup). But they were also really slow, with read speeds of a few hundred cards per minute. Since cards contained about 64 bytes each, that’s considerably less than 1 KB/second (where 1 KB is 1 Kilo Byte or 1,024 Bytes). To store all of Moby Dick would have required 18,808 cards, and reading it in would take about 1 hour!
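For anyone who wants to check the punch card numbers, here is a minimal Python sketch of the arithmetic. The 64 bytes per card and the file size come from the text above; the 300 cards per minute is my own assumed stand-in for "a few hundred", and the function names are just for illustration. Rounding up for the partly filled last card gives 18,808 cards.

```python
MOBY_DICK_BYTES = 1_203_686  # size of the text version, from the post
BYTES_PER_CARD = 64          # the post's rough figure per card
CARDS_PER_MINUTE = 300       # assumption: one value for "a few hundred"


def cards_needed(size_bytes, bytes_per_card=BYTES_PER_CARD):
    """Whole cards needed; a partly filled last card still counts."""
    return -(-size_bytes // bytes_per_card)  # integer ceiling division


def read_minutes(cards, cards_per_minute=CARDS_PER_MINUTE):
    """Minutes to feed the given number of cards through the reader."""
    return cards / cards_per_minute


cards = cards_needed(MOBY_DICK_BYTES)
minutes = read_minutes(cards)
bytes_per_second = BYTES_PER_CARD * CARDS_PER_MINUTE / 60

print(f"{cards:,} cards, about {minutes:.0f} minutes, "
      f"{bytes_per_second:.0f} bytes/second")
```

At the assumed reader speed this works out to roughly an hour, and to a few hundred bytes per second, which is where the "considerably less than 1 KB/second" comes from.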
The first big advance was magnetic tape. While much could be written about magnetic tape storage, I am highly partial to the great hack that came along with the Apple II: storage on cassette tape. Cassette tapes were reasonably cheap at something like a few bucks per tape. Due to a very inefficient way of “encoding” the data, a 30-minute tape held only about 300 KB. Still, we are down to only 4 tapes for holding Moby Dick, but we have actually slowed down by a factor of 2: at 30 minutes per tape it would take 2 hours to read Moby Dick in from tape! Commercial tape storage solutions had much more capacity and were much faster, achieving transfer rates of about 10 KB/second, so that it would take only 2 minutes to read in all of Moby Dick. The biggest problem with tape, though, was not its speed or capacity but that access was essentially sequential. If you wanted to read data in, say, the middle of the tape, you had to fast-forward the tape to that position first before being able to read the data.
The real breakthrough came with magnetic disk storage, which to this day holds the bulk of all data in computer systems (although so-called Solid State Disks or SSDs are making meaningful inroads). Magnetic disks used to come in two versions: floppy and hard. Today we no longer use the floppy kind, but I will never forget when my parents got me the Apple II 5.25-inch floppy drive for Christmas (thanks Mom and Dad!). The drive was by today’s standards outrageously expensive: almost $600, and way more than that in Germany. Each floppy initially held only 115 KB, and later 140 KB. So now we are back to actually needing 9 floppy disks (at 140 KB each) for holding Moby Dick, but we can read it in at much higher speed. Unfortunately, despite a fair bit of poking around, I haven’t been able to find just how fast the Disk II was, but suffice it to say it was much faster than the tape!
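The tape and floppy counts can be checked the same way. This is a small sketch using only the approximate capacities quoted above; the dictionary keys and the helper name are mine. Note that at the earlier 115 KB capacity you would need 11 floppies rather than 9.

```python
MOBY_DICK_BYTES = 1_203_686  # size of the text version, from the post
KB = 1024                    # 1 KB = 1,024 bytes, as defined earlier


def units_needed(size_bytes, unit_capacity_bytes):
    """Whole tapes or disks needed; a partly filled last one still counts."""
    return -(-size_bytes // unit_capacity_bytes)  # integer ceiling division


# Capacities are the approximate figures quoted in the post.
media_capacity_bytes = {
    "cassette tape": 300 * KB,
    "floppy (early, 115 KB)": 115 * KB,
    "floppy (later, 140 KB)": 140 * KB,
}

units = {name: units_needed(MOBY_DICK_BYTES, capacity)
         for name, capacity in media_capacity_bytes.items()}

# Reading time on commercial tape at roughly 10 KB/second.
commercial_tape_seconds = MOBY_DICK_BYTES / (10 * KB)

for name, count in units.items():
    print(f"{name}: {count}")
print(f"commercial tape: about {commercial_tape_seconds / 60:.1f} minutes")
```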
Today most data is stored not on floppy disks but on hard drives. In a hard drive the magnetic disk is permanently mounted in place and as a result can rotate much faster and be made to hold more information. Over the last couple of decades the growth in capacity of these hard drives has been nothing short of astounding. Today you can buy a 1 TB hard drive at CDW for just $59. Now to put that in perspective, 1 TB = 1 Tera Byte or about 1,000 GB, or about 1 Million MB. Since Moby Dick is a bit over 1 MB, more than 800,000 copies of it fit onto that drive! And the speed is blazing too. A computer can read data from this disk at a rate on the order of 100 MB/second or more. At that speed, you can read in on the order of a hundred copies of Moby Dick in just one second.
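How many copies fit depends a little on what you mean by 1 TB: drive makers usually count 10^12 bytes, while the binary convention used for KB and MB above would give 2^40 bytes. Here is a quick sketch of both, plus the copies-per-second figure. The 100 MB/second read rate is an assumed round number for illustration, not a measurement of any particular drive.

```python
MOBY_DICK_BYTES = 1_203_686  # size of the text version, from the post
MB = 2 ** 20                 # 1 MB as defined earlier in the post
TB_DECIMAL = 10 ** 12        # how drive makers usually count a terabyte
TB_BINARY = 2 ** 40          # the binary convention


def copies_that_fit(capacity_bytes, size_bytes=MOBY_DICK_BYTES):
    """Number of complete copies that fit in the given capacity."""
    return capacity_bytes // size_bytes


def copies_per_second(rate_mb_per_s, size_bytes=MOBY_DICK_BYTES):
    """Copies read per second at a sustained rate given in MB/second."""
    return rate_mb_per_s * MB / size_bytes


decimal_copies = copies_that_fit(TB_DECIMAL)
binary_copies = copies_that_fit(TB_BINARY)
assumed_rate = 100  # MB/second, an illustrative assumption

print(f"decimal TB: {decimal_copies:,} copies")
print(f"binary TB:  {binary_copies:,} copies")
print(f"at {assumed_rate} MB/s: about "
      f"{copies_per_second(assumed_rate):.0f} copies per second")
```

Either way the answer lands somewhat under a million copies, because each copy is a bit more than 1 MB.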
But our need for storage has been growing just as fast, if not faster. For instance, YouTube is adding an amazing 50 hours of video every minute! Now 1 hour of video can take up as much as 80 GB of storage. Let’s assume that because of lower average quality on YouTube every hour takes up only 20 GB. Then YouTube is still adding 1 TB of video every minute, and that is without any redundancy (meaning storing only a single copy). Using $50 as an approximation based on the CDW price above, that would be 525,600 minutes/year * 1 TB/minute * $50/TB = about $26 million per year for additional hard drives alone. Just for fun, how many punch cards would this require in a year? Somebody may want to check my math, but I get something like 8 quadrillion cards. That’s an 8 followed by 15 zeros!
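Since I asked for somebody to check my math, here is a short Python sketch that does so. Every input is one of the rough assumptions from the paragraph above (50 hours per minute, 20 GB per hour, $50 per TB, 64 bytes per card), and I am using decimal units (1 TB = 1,000 GB = 10^12 bytes) to keep things simple.

```python
HOURS_UPLOADED_PER_MINUTE = 50     # from the post
GB_PER_HOUR_OF_VIDEO = 20          # the post's assumed average
DOLLARS_PER_TB = 50                # rounded from the quoted drive price
BYTES_PER_CARD = 64                # the post's rough punch card capacity
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

# Decimal units throughout: 1 TB = 1,000 GB, 1 GB = 10**9 bytes.
gb_per_minute = HOURS_UPLOADED_PER_MINUTE * GB_PER_HOUR_OF_VIDEO
gb_per_year = gb_per_minute * MINUTES_PER_YEAR
tb_per_year = gb_per_year // 1000

drive_cost_per_year = tb_per_year * DOLLARS_PER_TB
cards_per_year = gb_per_year * 10 ** 9 // BYTES_PER_CARD

print(f"{gb_per_minute / 1000:.0f} TB per minute")
print(f"{tb_per_year:,} TB per year")
print(f"${drive_cost_per_year:,} per year in drives")
print(f"{cards_per_year:.2e} punch cards per year")
```

With these inputs the card count comes out a little above 8 quadrillion, so the estimate holds.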
In upcoming Tech Tuesdays, we will learn more about how hard disks work and what that implies for computer systems. But for now the key message to take away is that we have entered the age of nearly limitless storage.