Tech Tuesday: Computational Complexity (Introduction)

So far in the Tech Tuesday series on the theory of computer science we have covered questions of computability. We have looked at ways of formalizing computation through abstract machines such as the Finite State Machine and the Turing Machine to explore limits on what can be computed at all. We saw with the Halting Problem that the power to compute lots of things is gained at the cost of not being able to say much about the computation (e.g. will it stop?).

Now we will turn our attention to a different question known as computational complexity. We will look at algorithms known to solve a problem and ask: how long will this program run? And how much space will it consume? These questions might have seemed more relevant at a time when computers had 48 KB of main memory and a CPU speed of about 0.5 MIPS (the specs of the Apple II on which I learned how to program). Today I am writing this blog post on a MacBook Air with 8 GB of memory (a 160,000x increase) and an Intel Core i7 with over 100,000 MIPS (a 200,000x increase). So who cares about the time or space required to execute a program?

To understand why we still care, consider a seemingly simple problem: guessing a string of length 10 drawn from the letters A, G, C, T. Since there are 4 letters that amounts to 4 ^ 10 different possible combinations, which is the same as 2 ^ 20 or about 1 million combinations. Let’s generously assume it takes only 1 instruction to check a combination; then we are looking at 1 million instructions, which my MacBook Air could do in roughly 1/100,000th of a second (even if you make the number of instructions per check more realistic you will still be in the sub-millisecond range).
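To make the brute force concrete, here is a minimal Python sketch (the alphabet constant, function name, and example secret are my own illustrations, not from any real library) that simply enumerates candidates until it hits the target:

```python
from itertools import product

ALPHABET = "AGCT"

def brute_force_guess(secret: str) -> int:
    """Enumerate every string over ALPHABET of the same length as
    `secret` and return how many candidates were checked."""
    checks = 0
    for candidate in product(ALPHABET, repeat=len(secret)):
        checks += 1
        if "".join(candidate) == secret:
            return checks
    raise ValueError("secret contains letters outside the alphabet")

# Length 10 means at most 4 ** 10 (about a million) checks,
# which finishes in well under a second on a modern laptop.
print(brute_force_guess("GATTACAGCT"))
```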

Now consider the very same problem except that you are trying to guess a string of length 40. You are looking at 4 ^ 40 or 2 ^ 80 combinations. Since 2 ^ 10 is about 10 ^ 3, this is roughly 10 ^ 24 checks. Again assuming only 1 instruction per check we are now looking at a running time of 10 ^ 24 / 10 ^ 11 = 10 ^ 13 seconds, which is about 400,000 years (or, as Wolfram Alpha helpfully points out, about 20 times the time elapsed since the last glacial maximum). Guessing a string just a tad longer than that would take more time than we believe the universe has existed.
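As a sanity check on that arithmetic, here is a quick back-of-envelope sketch; the 10 ^ 11 instructions-per-second figure is just the ballpark 100,000 MIPS number from above:

```python
# Rough brute-force time estimates, generously assuming 1 instruction
# per check on a machine doing ~10**11 instructions per second.
INSTRUCTIONS_PER_SECOND = 1e11
SECONDS_PER_YEAR = 3.15e7

for length in (10, 40, 48):
    checks = 4 ** length
    seconds = checks / INSTRUCTIONS_PER_SECOND
    print(f"length {length}: {checks:.1e} checks, "
          f"{seconds:.1e} seconds ({seconds / SECONDS_PER_YEAR:.1e} years)")

# length 40 comes out to roughly 4 * 10**5 years; length 48 already
# exceeds the ~1.4 * 10**10 years the universe is believed to have existed.
```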

Now if you noticed my choice of letters you will realize that I am talking about DNA. A string of length 40 is tiny in the DNA world, where even small organisms such as bacteria have north of 10 ^ 6 base pairs. There are many other naturally occurring phenomena where any brute force approach would result in exponentially growing running time. So despite our tremendous advances in hardware we still care a great deal about the running time and space requirements of an algorithm.

In the upcoming posts we will learn a lot more about the time and space characteristics of different algorithms for the same problem and also different types of problems. Along the way we will discover some great counter-intuitive surprises and find out more about the limits of our knowledge. 

#tech tuesday #theory #complexity