Tech Tuesday: Of Bits and Bytes (Binary Number System)

Before we can go on and explore the building blocks in more detail, we need to learn a little bit about the fundamental underlying language used in computers: the binary number system. Based on my kids’ school, this appears to be a 6th grade math topic that’s apparently taught without any context. I am hoping I can do better here.

The numbers that we use day in, day out are based on the decimal system. In the decimal system we use ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. It is *not* a coincidence that the word digit can also mean a finger or a toe. At least one reason we use the decimal system is that we happen to have 10 fingers (and 10 toes). Using each finger to correspond to 1 unit of something is very intuitive. If you go into a bar and raise one hand with all fingers extended, you are ordering 5 beers or asking for 5 seats.

Computers don’t have hands but they have switches instead (imagine tons of tiny light switches), each of which can either be on or off. So instead of ten digits, computers only have two digits: 0 and 1. You can think of 0 as corresponding to a light switch in the off position and 1 to the light switch being in the on position. The name binary comes from the fact that the system has only two digits. A single switch is what we refer to as a “bit.”

A bit is the smallest unit of information. Instead of a light switch you could also think of a bit as a Yes/No answer. Are you understanding this? Yes = 1, and No = 0. You might want to give a more differentiated answer, and we will get to that in a second, but if you want to provide any information at all, you have to be able to at least distinguish between two different possibilities: understand / don’t understand. That’s what one bit allows you to do.

As it turns out, we can use sequences of bits, i.e. sequences of 1s and 0s, to represent all different kinds of information, including numbers, text, images and sound. How does that work? It’s easy for numbers. Each bit represents some power of 2 (1, 2, 4, 8, 16, …), just like in the decimal system each position of the number represents some power of 10 (ones, tens, hundreds, thousands, …). So for instance the number 42 (decimal) would be 101010 in binary, which stands for 1*32 + 0*16 + 1*8 + 0*4 + 1*2 + 0*1 = 42.
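If you want to try this yourself, here is a minimal Python sketch of the conversion in both directions. The function names `to_binary` and `from_binary` are my own, made up for illustration; Python's built-in `bin()` and `int(..., 2)` do the same job.

```python
def to_binary(n: int) -> str:
    """Build the binary digits of a non-negative integer by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, lowest position first
        n //= 2
    return "".join(reversed(digits))


def from_binary(bits: str) -> int:
    """Add up the power of 2 for every position that holds a 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(to_binary(42))          # 101010
print(from_binary("101010"))  # 42
```

The second function is just the sum from the paragraph above written as a loop: positions 1, 3 and 5 (counting from 0 on the right) contribute 2 + 8 + 32.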

As a fun aside, using the binary system you can use your two hands to count all the way to 1023. Just have each finger represent one bit. When your finger is extended, the bit is 1; when it is curled, the bit is 0. All 10 fingers extended is 1111111111, which gets you to 1*512 + 1*256 + 1*128 + 1*64 + 1*32 + 1*16 + 1*8 + 1*4 + 1*2 + 1*1 = 1023. Now take a second and figure out what 132 would look like. You will know if you got it right, as will everyone around you (note: not entirely safe for work).
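Here is a small sketch of finger counting in Python. The notation is an assumption of mine: a string of ten characters, one per finger from left to right, where `X` means extended and `.` means curled, with the leftmost finger worth 512.

```python
def fingers_to_number(fingers: str) -> int:
    """Read ten fingers as ten bits: 'X' = extended (1), '.' = curled (0)."""
    if len(fingers) != 10:
        raise ValueError("expected exactly ten fingers")
    total = 0
    for finger in fingers:
        # Shift everything so far one position left, then add the new bit.
        total = total * 2 + (1 if finger == "X" else 0)
    return total


print(fingers_to_number("XXXXXXXXXX"))  # 1023
print(fingers_to_number(".........X"))  # 1
print(fingers_to_number(".........."))  # 0
```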

On computers it has become the norm to group bits into groups of 8, which together are known as a “byte.” One byte can hold the numbers from 00000000, which is of course also 0 in decimal, to 11111111, which is 1*128 + 1*64 + 1*32 + 1*16 + 1*8 + 1*4 + 1*2 + 1*1 = 255. That doesn’t seem like a lot, but if you combine multiple bytes the values start to increase quite quickly. For instance, I am writing this on a MacBook which has a processor that handles 64 bits at a time, which is 8 bytes. If you take 64 bits and they are all set to 1, that is the gargantuan number 2^64 - 1, which you can check over at Wolfram Alpha is 18,446,744,073,709,551,615. Yup – that’s 18 quintillion or 18 billion billions (you know what’s cool? a quintillion!).
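You can also check these numbers in a few lines of Python, which handles integers of any size. This is only a sketch; `max_unsigned` is a helper name I made up, and it assumes we are counting from 0 with no negative numbers.

```python
def max_unsigned(bits: int) -> int:
    """Largest number you get when all `bits` bits are set to 1."""
    return 2 ** bits - 1


print(max_unsigned(8))           # 255
print(int("11111111", 2))        # 255, the same byte read as a binary string
print(f"{max_unsigned(64):,}")   # 18,446,744,073,709,551,615
```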

Now that we can represent numbers it’s also easy to represent text. All we need is to have a number for each letter of the alphabet. Of course it would help a lot if every computer used the same set of numbers for the letters, to make moving text from one computer to another easier. One of the early and long-lasting mappings from text to numbers is known as ASCII, which stands for American Standard Code for Information Interchange. In ASCII, for instance, the number 65 represents the letter A and the number 66 represents the letter B. You can find the full set here.

The text string “Hello World” is represented by the following ASCII number sequence:

72 101 108 108 111 32 87 111 114 108 100

which is in turn represented by the following series of bits (arranged as bytes):

01001000 01100101 01101100 01101100 01101111 00100000

01010111 01101111 01110010 01101100 01100100
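To reproduce both sequences yourself, here is a short Python sketch. It relies on the built-in `ord()`, which returns the number for a character (for plain English letters this matches ASCII), and on `format(..., "08b")`, which writes a number as 8 binary digits. The variable names are mine.

```python
text = "Hello World"

# One number per character.
codes = [ord(character) for character in text]

# Each number written as one byte, i.e. 8 bits with leading zeros.
bytes_as_bits = [format(code, "08b") for code in codes]

print(codes)
# [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100]
print(" ".join(bytes_as_bits))
# 01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100
```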

These days on the web we don’t really use ASCII much any more other than in ASCII art because we need to be able to represent letters/symbols from international alphabets. That has given rise to newer standards for encoding text as numbers, such as UTF-8 (more on that some future time).

Finally, a simple example of how bits can be used to represent an image. Think of a really old fashioned CRT display (as you would find in an old movie) where each dot on the screen is either phosphor green or dark. For computer screens we call these dots “pixels.” Representing an image back then was easy. We would use 1 bit for each pixel. If the bit was 1 the pixel would be lit up and if it was 0 it would stay dark. Now of course we have displays with millions of colors for each pixel, and so we need many bits to represent each pixel, but the basic principle remains the same (if you have ever fiddled with your display settings you may have run across the term bit depth or color depth).
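Here is a toy version of such a 1-bit display in Python. The 8x8 picture `SMILEY` is made up for this example, and `render` is a hypothetical helper that prints `#` for a lit pixel and a space for a dark one.

```python
# A made-up 8x8 image: one row per string, one bit per pixel.
SMILEY = [
    "00111100",
    "01000010",
    "10100101",
    "10000001",
    "10100101",
    "10011001",
    "01000010",
    "00111100",
]


def render(rows: list[str]) -> str:
    """Turn rows of bits into text: 1 becomes a lit pixel '#', 0 stays dark ' '."""
    return "\n".join(
        "".join("#" if bit == "1" else " " for bit in row) for row in rows
    )


print(render(SMILEY))

# 8 rows of 8 pixels at 1 bit per pixel: 64 bits, or 8 bytes, for the whole image.
total_bits = sum(len(row) for row in SMILEY)
print(total_bits, "bits =", total_bits // 8, "bytes")
```

With more colors the only thing that changes is how many bits each pixel gets.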

Hopefully by now you have gotten the idea that having just 0s and 1s available at the deepest level of computers does not meaningfully limit what types of information computers can process, because we can use a bunch of bits to represent all sorts of different things.

Also, if you paid attention throughout all of this, you can now understand this geek humor t-shirt and figure out what the 4-dot logo of our portfolio company 10gen stands for.

For extra credit: read about hexadecimal numbers which will make an appearance in a later Tech Tuesday.