Tuesday, February 21, 2012
Tech Tuesday: Web Servers
If you have been following along, we are now on Step 5 of the web cycle, where we find ourselves at the server that is answering an HTTP request. Because the HTTP protocol is well defined, anyone can in theory implement a web server. In practice these days most people run one of fewer than a handful of servers, with Apache, IIS (Microsoft) and more recently nginx accounting for the bulk of all web sites. The reason for this degree of concentration is that, much like database software, the web server is a mission critical piece of the stack and a lot of work has gone into making sure these servers work well for a wide variety of uses.
Let’s first consider the simplest possible situation: a GET request for a URL where the resource on the server is a file. In this case all the web server needs to do is read the file from disk and send it back packaged up as an HTTP response. That means sending an HTTP status code of 200 (assuming the file was found and properly read), followed by a bunch of headers indicating the type of the response (indicating for instance whether the file contained text or HTML or an image), followed by the actual contents of the file. If the file is not found, the server would return a response code of 404. Or if the server finds the file but cannot read it for some reason it might return a response code of 500.
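To make the file case concrete, here is a minimal sketch in Python of the decision described above. The function name serve_file, the small CONTENT_TYPES table and the demo file are my own assumptions for illustration; a production server such as Apache or nginx does far more (and guards against requests that try to escape the document root).

```python
import tempfile
from pathlib import Path

# Hypothetical, tiny mapping from file extension to Content-Type.
# Real servers ship a much longer table.
CONTENT_TYPES = {".html": "text/html", ".txt": "text/plain", ".jpg": "image/jpeg"}


def serve_file(doc_root, url_path):
    """Return (status, headers, body) for a GET request for url_path.

    Simplification: no protection against paths containing "..".
    """
    file_path = Path(doc_root) / url_path.lstrip("/")
    if not file_path.is_file():
        return 404, {"Content-Type": "text/plain"}, b"Not Found"
    try:
        body = file_path.read_bytes()
    except OSError:
        # The file exists but could not be read.
        return 500, {"Content-Type": "text/plain"}, b"Internal Server Error"
    content_type = CONTENT_TYPES.get(file_path.suffix, "application/octet-stream")
    headers = {"Content-Type": content_type, "Content-Length": str(len(body))}
    return 200, headers, body


# Demo against a throwaway directory holding a single file.
with tempfile.TemporaryDirectory() as doc_root:
    Path(doc_root, "index.html").write_bytes(b"<h1>Hello</h1>")
    found = serve_file(doc_root, "/index.html")
    missing = serve_file(doc_root, "/nope.html")

print(found[0], found[1]["Content-Type"])
print(missing[0])
```

The status code comes first, then the headers, then the bytes of the file, which mirrors the order in which the pieces travel in the actual response.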
Even this relatively simple task of answering a GET request for a file is in reality a bit more complicated because the HTTP protocol has a bunch of important optimizations. Imagine a situation where a great many browsers all request the same file over and over. It would be very inefficient to actually send an unchanged file back again and again. So instead the HTTP protocol allows for a couple of different mechanisms, such as the Cache-Control and ETag headers, that let the browser and the server determine whether a resource (here the file) has changed and needs to be served anew. If based on this the web server determines that it does not need to resend the file, it will send a 304 Not Modified HTTP status code instead.
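The ETag variant of this exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names make_etag and conditional_get are made up, deriving the tag from a content hash is one possible scheme among many, and the max-age value of 60 seconds is arbitrary.

```python
import hashlib


def make_etag(body):
    # Assumption: the tag is a truncated hash of the content.
    # Servers are free to pick other schemes (e.g. size plus modification time).
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body, request_headers):
    """Return (status, headers, body), answering 304 if the client's copy is current."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The browser already holds this exact version: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Cache-Control": "max-age=60"}, body


# First request: the browser has nothing cached yet.
first = conditional_get(b"hello", {})
# Second request: the browser sends back the tag it received.
second = conditional_get(b"hello", {"If-None-Match": first[1]["ETag"]})
# Third request: the file changed on the server in the meantime.
third = conditional_get(b"hello, changed", {"If-None-Match": first[1]["ETag"]})

print(first[0], second[0], third[0])
```

The browser stores the ETag from the first response and presents it in an If-None-Match header on later requests, so an unchanged file costs only a short 304 response.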
Now things get a fair bit more complicated when the web server has to deal with the submission of data via a POST request. In general, the web server needs to do a bunch of work to figure out how to respond. The response will generally depend on the data that was submitted with the form. Web servers therefore provide mechanisms for invoking a program and passing the submitted data to that program (which might be written in a language such as PHP, or Python, or Ruby, or pretty much any other language for that matter). The program can then determine what to do based on the data and dynamically assemble the response. The web server then passes that response back to the browser.
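Here is a rough Python sketch of that hand-off, assuming the form data arrives URL-encoded in the request body. The path /register, the HANDLERS table and the register_user function are all hypothetical; real web servers connect to programs through interfaces such as CGI or FastCGI rather than a dictionary lookup.

```python
from urllib.parse import parse_qs


def register_user(form):
    """Hypothetical program that builds a response from submitted form data."""
    name = form.get("name", [""])[0]
    email = form.get("email", [""])[0]
    if not name or not email:
        return 400, "Missing name or email"
    return 200, f"Welcome, {name}! We will write to {email}."


# Hypothetical table mapping request paths to the program that handles them.
HANDLERS = {"/register": register_user}


def handle_post(path, body):
    """Find the program for this path, pass it the parsed data, return its response."""
    handler = HANDLERS.get(path)
    if handler is None:
        return 404, "Not Found"
    form = parse_qs(body)  # "a=1&b=2" becomes {"a": ["1"], "b": ["2"]}
    return handler(form)


ok = handle_post("/register", "name=Ada&email=ada%40example.com")
incomplete = handle_post("/register", "name=Ada")
unknown = handle_post("/nowhere", "name=Ada")
print(ok)
print(incomplete)
print(unknown)
```

The web server's job here is mostly plumbing: locate the right program, give it the data, and relay whatever it returns.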
Web servers do a great many other things, such as imposing load limits and doing URL rewriting, but this should give you the basic outline.
Tags: tech_tuesday web_server web
Tuesday, February 7, 2012
Tech Tuesday: Routing and TCP/IP
We are continuing along with the web request cycle. Last week we took a look at the HTTP protocol. There I already mentioned that HTTP requests and responses travel over a TCP/IP connection. Today we will dive a bit deeper into TCP/IP. This is technically not really necessary for understanding the request cycle because these lower levels of the network are completely abstracted away when you develop for the web (which is a fancy way of saying you get to use it without worrying about how it works). Yet, peeling the onion a bit further will turn out to be very useful to the overall understanding of how things work on the web.
In the Tech Tuesday on networking, I introduced the idea that the Internet is a packet switched network. As a refresher this means that data gets cut up into packets. The IP layer is responsible for how these packets move across the network. What follows is quite a bit of a simplification but good enough for our purposes here. Each packet (sometimes also referred to as a datagram) has its own header which contains among other things the source and destination IP addresses. These packets travel between machines along flexible paths known as routes. There is a tool called traceroute for examining what these routes are and it is worth trying this out.
On a Mac, use Spotlight to find and start the “Terminal” application. You will get a new window with a prompt which lets you type commands (this is known as the command line and we will learn a lot more about it in a future Tech Tuesday). Type “traceroute google.com” and you will see output that looks something like the following:
1 192.168.1.1 (192.168.1.1) 1.987 ms 0.864 ms 0.794 ms
2 10.32.128.1 (10.32.128.1) 9.576 ms 8.510 ms 7.638 ms
3 gig-0-3-0-7-nycmnya-rtr2.nyc.rr.com (24.29.97.130) 7.983 ms 8.371 ms 8.123 ms
4 tenge-0-5-0-0-nycmnytg-rtr001.nyc.rr.com (24.29.150.90) 12.007 ms 12.481 ms nycmnytg-10g-0-0-0.nyc.rr.com (24.29.148.29) 14.716 ms
5 bun6-nycmnytg-rtr002.nyc.rr.com (24.29.148.250) 18.132 ms 11.899 ms 12.706 ms
6 ae-4-0.cr0.nyc30.tbone.rr.com (66.109.6.78) 7.120 ms 8.395 ms 8.113 ms
7 ae-4-0.cr0.dca20.tbone.rr.com (66.109.6.28) 13.161 ms 66.109.9.30 (66.109.9.30) 14.679 ms ae-4-0.cr0.dca20.tbone.rr.com (66.109.6.28) 13.992 ms
8 107.14.19.135 (107.14.19.135) 14.153 ms 12.694 ms ae-1-0.pr0.dca10.tbone.rr.com (66.109.6.165) 14.154 ms
9 66.109.9.66 (66.109.9.66) 15.230 ms 74.125.49.181 (74.125.49.181) 13.553 ms 66.109.9.66 (66.109.9.66) 13.315 ms
10 209.85.252.46 (209.85.252.46) 17.017 ms 14.467 ms 209.85.252.80 (209.85.252.80) 15.536 ms
11 209.85.243.114 (209.85.243.114) 26.926 ms 209.85.241.222 (209.85.241.222) 25.348 ms 25.406 ms
12 216.239.48.103 (216.239.48.103) 25.799 ms 64.233.174.87 (64.233.174.87) 25.046 ms 216.239.48.103 (216.239.48.103) 32.101 ms
13 * 209.85.242.177 (209.85.242.177) 40.436 ms *
14 vx-in-f103.1e100.net (74.125.115.103) 25.568 ms 26.283 ms 26.659 ms
Each one of these lines represents a so-called “hop”, meaning packets traveling between two internet devices. The first hop is from my computer to my home switch. The second hop is from there to my home VPN device which is connected to a cable modem from Time Warner. From there the packets travel over a whole bunch more intermediate switches and routers until they get to a server operated by Google. You can try this with other servers as well, such as “traceroute www.dailylit.com”. If the output gets stuck with lines containing just “* * *” instead of information on hops, you can terminate the process by pressing Ctrl-C. For those of you on Windows, here is how to run a traceroute.
Now the really important part to keep in mind about the IP level of the protocol is that it is strictly best efforts. This means that packets can travel different routes, can get dropped and can arrive out of order at the destination. So how in the world do we get an HTTP request and response across such a fundamentally unreliable network? Well that’s where the TCP portion comes in. TCP, the Transmission Control Protocol, sits on top of IP and provides for guaranteed in-order delivery of packets. How does it do that? Well, the details are complicated, but for our purposes it is sufficient to understand that it starts with a fair bit of initial “handshaking” (back and forth) where the two endpoints (sender and receiver) agree on what they will do. Once that “connection” has been established it becomes possible to keep track of which packets have been received and which have not and to cause packets that might have been dropped to be resent.
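The bookkeeping idea can be illustrated with a toy Python sketch. It is emphatically not how TCP is implemented (real TCP numbers bytes rather than whole segments, and uses acknowledgements, timers and windows); the function deliver_in_order and the sample segments are assumptions made purely to show reordering and the detection of a dropped piece.

```python
def deliver_in_order(received_segments, total_segments):
    """Return (data, missing): the reassembled data, or None plus the gaps.

    Each segment is a (sequence_number, payload) pair and may arrive in any order.
    """
    buffer = {}
    for seq, payload in received_segments:
        buffer[seq] = payload  # a duplicate simply overwrites the earlier copy
    missing = [seq for seq in range(total_segments) if seq not in buffer]
    if missing:
        return None, missing
    return "".join(buffer[seq] for seq in range(total_segments)), []


# Segments arrive out of order and segment 1 was dropped along the way.
arrived = [(2, "lo, "), (0, "He"), (3, "web")]
data, missing = deliver_in_order(arrived, 4)
print(data, missing)

# The receiver reports the gap, the sender resends segment 1.
arrived.append((1, "l"))
data_after_resend, still_missing = deliver_in_order(arrived, 4)
print(data_after_resend, still_missing)
```

Because each piece carries a number, the receiver can both restore the original order and tell exactly which piece needs to be sent again.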
What are some of the takeaways here? First, having fewer hops will make things faster. If you try different servers with traceroute, you will see that a lot of servers are more hops away than Google’s. Google has invested heavily in shortening the paths to their servers. This is also what so-called CDNs or Content Delivery Networks do. They bring content (e.g., images) closer to the “edge” of the network so that requests have fewer hops. Second, setting up a TCP connection involves a fair bit of overhead. In the first version of HTTP each request required a new connection which was very inefficient. With HTTP 1.1 a single connection is kept alive for a sequence of requests and responses (a session). But there is still a separate connection required for each different server and so a web page that connects to many different resources incurs more overhead. Third, if you really want a lot of speed it helps to reduce the number of packets that need to be sent. In the early days, the entire home page of Google was optimized to fit into a single packet.
Tags: tech_tuesday web networking
Tuesday, January 31, 2012
Tech Tuesday: HTTP
Today we are continuing on with the web request cycle. After the browser has parsed the URL and obtained the IP address of the server via DNS, the browser now has to communicate with the server. That is done using the so-called Hypertext Transfer Protocol or HTTP for short. The beginnings of HTTP go back to the early 1990s when Tim Berners-Lee first devised it drawing inspiration from Ted Nelson, who had coined the term Hypertext in 1963. For an even earlier description of a similar idea it is worth reading Vannevar Bush’s amazing “As We May Think” from 1945!
HTTP builds on top of the lower level Internet protocol TCP which permits establishing a connection between two machines (see my introduction to networking). A so-called HTTP session consists of a series of requests from the browser followed by responses from the server. Each request consists of a request method, a resource (URL), a set of headers and optionally a request body.
The most common HTTP request methods are verbs such as GET, POST, PUT and DELETE (I am capitalizing them because that’s how they appear in the protocol). What’s great about these is that they are wonderfully descriptive of what you expect the request to do. GET is supposed to, well, get information from the resource. I say resource rather than server because that is the right level to think about with regard to HTTP: it is about manipulating abstract resources rather than particular servers. PUT on the other hand puts information at the resource (without regard to what’s already there). DELETE, you get the idea, deletes the information at the resource. This relative obviousness and some associated expectations around how these methods behave provide a powerful foundation for the transfer of information (more on that in a future post on so-called RESTful APIs).
The headers contain additional information about the request. For instance, the “Date” header field contains the date and time when the request was sent. Or the “Referer” header (misspelled in the protocol and in most implementations!) contains the URL of the page on which the currently requested resource was found. It is worth looking at the list of possible HTTP headers, which also shows the headers for a response (see below). It should be pointed out that the HTTP protocol allows for the creation of additional headers which can carry custom information (not always what you would want as in the recent case of O2 sending users’ phone numbers!).
The request body is used for POST and PUT requests to carry the data. For instance when you encounter a registration form on the web that asks for your name and email address, the information you type into the form fields is (generally) carried in the body of the resulting HTTP request.
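To see the four parts of a request (method, resource, headers, body) side by side, here is a small Python sketch that takes apart a raw HTTP/1.1 request. The function parse_request, the host example.com and the form values are assumptions for illustration, and the parser skips many details a real server has to handle.

```python
def parse_request(raw):
    """Split a raw HTTP/1.1 request into (method, resource, headers, body)."""
    # A blank line separates the request line and headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, resource, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, resource, headers, body


# A hypothetical registration form submission.
form_data = "name=Ada&email=ada%40example.com"
raw_request = (
    "POST /register HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(form_data)}\r\n"
    "\r\n"
    + form_data
)

method, resource, headers, body = parse_request(raw_request)
print(method, resource)
print(headers["Content-Type"])
print(body)
```

Note how the name and email address typed into the form end up in the body, while the headers describe how that body is encoded and how long it is.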
Once the server has received and processed the request it will send an HTTP response. The response has a structure that’s quite similar to the request. Instead of the method, the server returns a status code, then some headers, and finally a response body.
The status code indicates what happened at the server and hence what to expect in the body of the response. The standard code is “200 OK” which means the server processed the request and everything went well. There are more precise responses in that vein, such as “201 Created” which means the server created a new resource (e.g. in response to a PUT request). There are a series of codes to deal with resources that have moved, such as “301 Moved Permanently” which provides a new URL that should be used for all future requests for this resource. And there are a bunch of codes to indicate various error situations such as the famous “404 Not Found” for which some web sites return very funny contents in the body of the response. Again, it’s worth browsing the complete list of response codes.
The response headers contain a lot of additional information about the response. For instance the “Content-Type” header field describes what kind of content the response body contains. Examples of different values for this header field are “text/html; charset=utf-8” for a web page in HTML using the UTF-8 character set or “image/jpeg” for an image that is compressed using JPEG. Without knowing this the browser would have to infer the content type from inspecting the body of the response which would be very cumbersome. There are a ton more headers in a response that are similarly critical to the proper functioning of the HTTP protocol, such as how long a recipient can “cache” (locally store) the body (in order to help speed up a subsequent access and also relieve the server and network).
Finally there is the body of the response which contains the actual information. The body is a bunch of bytes. What they represent can vary wildly as explained above. It could be an HTML web page or an image or something altogether different. One of the great powers of the HTTP protocol is that it is really content agnostic.
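Putting the three parts of a response together looks roughly like the following Python sketch. The helper build_response and the sample page are assumptions; the point is only to show the status line, the headers and the body as they would appear on the wire.

```python
def build_response(status_code, reason, headers, body):
    """Assemble status line, headers and body into the bytes of an HTTP/1.1 response."""
    # Content-Length tells the browser how many body bytes to expect.
    all_headers = {**headers, "Content-Length": str(len(body))}
    head = [f"HTTP/1.1 {status_code} {reason}"]
    head += [f"{name}: {value}" for name, value in all_headers.items()]
    # A blank line separates the headers from the body.
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + body


page = "<h1>Hello</h1>".encode("utf-8")
response = build_response(200, "OK", {"Content-Type": "text/html; charset=utf-8"}, page)
print(response.decode("utf-8"))
```

The body is handed over as plain bytes. The same function would work unchanged for an image, as long as the Content-Type header says “image/jpeg” instead.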
Because there is a lot going on with the HTTP protocol under the hood, and much of it matters, it is a bit of a shame that many people, including active developers, don’t really understand it. As a result they either create things that don’t work as expected (e.g. making resources change in response to a GET request) or re-invent features on top of HTTP that HTTP already contains (e.g., content caching). If you do any work on the web it is well worth digging deeper than this post!
Tags: tech_tuesday web http
Tuesday, January 24, 2012
Tech Tuesday: DNS
Today we are continuing with the web cycle that I outlined two weeks ago. After a URL has been parsed in Step 1, the browser needs to determine the IP address for the domain as Step 2. Reprising the previous example, let’s consider the domain name dailylit.com. How does the browser determine that in order to retrieve information from this domain it should access a server at IP address 72.32.133.224? This is accomplished via a system called DNS, which stands for Domain Name System, and provides essentially the equivalent of a telephone book which provides IP addresses (telephone numbers) for domain names (people names).
In ARPANET, the predecessor to the Internet, there were so few domain names that this telephone book was simply a file called HOSTS.TXT that was retrieved from a computer at SRI and stored locally. There were only a few domains (mostly universities) and the file was relatively short. Today on the Internet there are over 200 million domain names of the type dailylit.com, which are further subdivided through subdomains such as blog.dailylit.com. So the idea of having every computer maintain a complete and up-to-date copy of the telephone book locally doesn’t make sense any more.
Thankfully in the early 1980s, which depending on your perspective is either ancient pre-history or not that long ago, DNS was born as a service that would allow the registration of domain names and maintain a mapping between the names and IP addresses in a robust fashion. In fact, without DNS it would be hard to imagine the Internet having grown as dramatically and we probably wouldn’t have nearly as many domains to begin with.
There are many ingenious ideas in the design of DNS and I won’t be able to cover them all here. Instead, I will focus on some key concepts. The first and central one is that there is a hierarchy of authority which allows for the delegation of both registration of domain names and the lookup of IP addresses. The hierarchy starts with the 13 root servers which together make up the so-called root zone from which all authority flows. It is here that the so-called Top Level Domains or TLDs get resolved. Going back to blog.dailylit.com, the TLD is the “.com” part. You can think of a domain name like nested Russian dolls, where the outermost doll, the TLD, is the rightmost part of the name.
The most common TLDs are .com and .net which together account for about half of all domain names. There is of course also .org, .gov, .edu and an ever increasing number of other TLDs such as most recently .xxx. And then there are TLDs for countries which all consist of two letters, such as .uk for the UK (duh) or .ly for Libya, popularized by bit.ly, and .us for the US, which made the domain del.icio.us possible. Each TLD has one or more registrars associated with it who are in charge of letting people and companies reserve names in that domain.
The root servers point to name servers for each of these TLDs. Since blog.dailylit.com is in the .com domain the next place to look is the .com name servers. The .com name servers in turn point to the name servers for dailylit.com itself. Currently those name servers are at Rackspace. Since Susan and I registered and control dailylit.com, we are the ones who get to decide which nameservers should be queried to find the IP address for dailylit.com and its subdomains, such as blog.dailylit.com. The way this generally happens is by logging into a system run by a registrar and setting which nameservers are to be the authoritative sources of IP addresses for the dailylit.com domain. That then gets recorded in the nameserver for the corresponding TLD.
The lookup process that started with the root and went to the .com TLD is now at the dailylit.com nameservers at Rackspace. They in turn contain information on dailylit.com itself and its subdomains, such as blog.dailylit.com. The whole process of starting at the root and working towards the subdomain (right to left) in a series of separate lookups across different servers is called a “recursive lookup.” If this sounds complicated to you, that’s because it is. It is so complicated and resource intensive that we don’t want the web browser to have to do this each time it encounters a domain name. It would not only be slow, but it would also swamp the root servers, the TLD servers and possibly even the name servers for dailylit itself.
So instead of doing a recursive lookup every time, the results of these lookups are stored on so-called DNS cache servers. For instance, most ISPs through which you access the Internet will operate their own cache servers. After they have looked up blog.dailylit.com once, these servers will “cache” (meaning temporarily store) the result of the lookup, thus providing a much faster lookup the next time. In fact, your own computer will often cache the results of lookups locally for super fast access. This is important because even a single web page generally involves multiple requests (e.g. for images) to the same server. The duration for which the results of a recursive lookup can be cached locally is known as the Time To Live or TTL and is controlled by the owner of the domain (and generally honored by the cache servers).
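The combination of a walk down the hierarchy and a cache with a TTL can be modeled in a short Python sketch. Everything here is a stand-in: the dictionaries ROOT, TLD_SERVERS and AUTHORITATIVE, the server labels, the 300 second TTL and the CachingResolver class are assumptions for illustration (only the IP address for dailylit.com comes from the post), and no real DNS queries are made.

```python
import time

# Hypothetical, heavily simplified zone data standing in for real servers.
ROOT = {"com": "com-tld-servers"}
TLD_SERVERS = {"com-tld-servers": {"dailylit.com": "dailylit-nameservers"}}
# Authoritative answers as (ip_address, ttl_in_seconds); the TTL is an assumed value.
AUTHORITATIVE = {"dailylit-nameservers": {"dailylit.com": ("72.32.133.224", 300)}}


def resolve_from_root(name):
    """Walk the hierarchy right to left: root, then TLD, then the domain's nameservers."""
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]
    registered_domain = ".".join(labels[-2:])
    nameserver = TLD_SERVERS[tld_server][registered_domain]
    return AUTHORITATIVE[nameserver][name]


class CachingResolver:
    """Stand-in for a DNS cache server that honors the TTL."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.cache = {}  # name -> (ip_address, expiry_time)
        self.full_lookups = 0

    def lookup(self, name):
        now = self.clock()
        entry = self.cache.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: answer from the cache
        ip_address, ttl = resolve_from_root(name)
        self.full_lookups += 1
        self.cache[name] = (ip_address, now + ttl)
        return ip_address


# Use a fake clock so the expiry can be shown without waiting.
fake_now = [0.0]
resolver = CachingResolver(clock=lambda: fake_now[0])
ip_first = resolver.lookup("dailylit.com")
ip_second = resolver.lookup("dailylit.com")
lookups_before_expiry = resolver.full_lookups
fake_now[0] = 301.0  # the TTL of 300 seconds has passed
ip_third = resolver.lookup("dailylit.com")
lookups_after_expiry = resolver.full_lookups
print(ip_first, lookups_before_expiry, lookups_after_expiry)
```

Two lookups in quick succession cost only one walk from the root, and the full walk happens again only once the TTL has run out.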
The existence of cache servers (sometimes also referred to as non-authoritative servers — although technically not exactly the same) provides a critical security vulnerability for DNS. Let’s say you have gone to your favorite coffee shop and logged on to the WIFI network there. Where do your domain lookups go? Well to the cache server of whatever ISP the coffee shop uses or possibly even cache servers on the coffee shop’s own network. An attacker with access to those local cache servers could insert falsified records that could have the effect of say pointing chase.com to some rogue server that wants to steal your bank username and password. This would allow for a so-called man-in-the-middle attack (more on this in a future post). Fortunately, some security additions to DNS known as DNSSEC will in the future prevent these kinds of attacks. As more and more of our access to the Internet is over wireless networks this becomes particularly important.
If you made it this far, I hope you have a (newfound) appreciation for the complexity of a system that is used billions of times per day behind the scenes of nearly every access to the Internet. In addition to the technical issues there are also important political issues surrounding DNS. Most recently the proposed SOPA and PIPA legislation would have mandated nameserver operators to make changes that would have interfered with the implementation of DNSSEC. Then there is also the question as to who really controls the root zone which turns out to be the US Department of Commerce. Yes, for the *entire* Internet, which is all the more reason why we should make DNS better not worse.
Tags: tech_tuesday web dns
Tuesday, January 17, 2012
Tech Tuesday: Anatomy of a URL
Last week’s overview of “How the Web Works” introduced the URL (Uniform Resource Locator) as the fundamental way things are addressed on the web. Before we pick apart some actual URLs, it is worth looking at the name itself. The promise behind “Uniform” is that this addressing scheme can be used across all kinds of resources and that explains why URLs are so powerful - they can be used to address content such as a blog but also services such as the Twilio telephony API. On the web a blog entry and an incoming phone call are both simply “resources”. That means a resource is a highly abstracted concept and as you will learn if you stick with Tech Tuesday, abstraction is amazingly powerful. And on the web the URL is the most powerful abstraction of them all!
So here is a URL to pick apart:
http://blog.dailylit.com/2012/01/16/in-honor-of-dr-martin-luther-king-jr/
The very first part, the “http:”, indicates which protocol to use to access this resource. What other protocols might we find there? The obvious one is “https:”, the secure (meaning encrypted) version of “http:”. Here are some other protocols that you may have encountered around the web: “mailto:”, which indicates that the resource that follows is an email address and the protocol to speak to it is SMTP, and “ftp:” for resources that are accessible via File Transfer Protocol (FTP). Another protocol supported by many browsers is “file:” which means that the resource that follows is a file on the machine on which the browser is running.
Following the “http:” are two forward slashes “//” — these indicate that this URL starts with a domain name, which in this case is “blog.dailylit.com” — we will dissect domain names in more detail in the Tech Tuesday on DNS. There we will investigate the relationship between domains and actual servers but for now it is worth pointing out that grouping resources by domain serves an important trust purpose. Your expectations about accessing content at chase.com are meaningfully different from wepretendtobechase.com. Of course it’s not always that obvious and people go to great lengths to pretend to be someone else. There is a good test of your knowledge of which domains to trust.
Following the domain name is the location of the resource within that domain. This is the “/2012/01/16/in-honor-of-dr-martin-luther-king-jr/” part in the URL above. There are several things going on here that are worth noting. First, this location is structured in an easily human readable and comprehensible form. Just by looking at the URL you can infer that this is a post about Martin Luther King on MLK day. We call this kind of location a “pretty URL.” Having pretty URLs is a good idea not just because it helps humans figure out what they are likely to get when they access the resource but also because search engines, especially Google, make pages with pretty URLs rank higher in search results (assuming that the page content actually appears to be a match for the URL).
But there is even more to a pretty URL like “/2012/01/16/in-honor-of-dr-martin-luther-king-jr/”: the slashes “/” in the URL indicate some notion of hierarchy or of a path to the resource. It also suggests that the following shorter URL should point to something useful: http://blog.dailylit.com/2012 . In fact this retrieves all the blog posts from 2012. There is no requirement that the domain fulfilling the request understand this shorter URL, but the fact that it does both corresponds with intuition and allows for additional degrees of automation and discovery. For instance, without any further knowledge you should be able to construct the URL for finding all the blog posts from November 2011. Here it is: http://blog.dailylit.com/2011/11/ . Again, there is no requirement on the server to respond to this with a list of posts and it could instead respond with say a 404 Page Not Found. The http protocol does not speak to this, which is one of its many strengths as it lets the person or organization controlling the resource decide how to respond.
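The dissection above can be reproduced with Python’s standard urllib.parse module. This is just a sketch: the helper name truncated_url is my own, and whether the shorter URLs actually return anything is entirely up to the server.

```python
from urllib.parse import urlsplit

url = "http://blog.dailylit.com/2012/01/16/in-honor-of-dr-martin-luther-king-jr/"
parts = urlsplit(url)

print(parts.scheme)  # the protocol
print(parts.netloc)  # the domain name
print(parts.path)    # the location of the resource within the domain

# The path reads as a hierarchy: year, month, day, title.
segments = [segment for segment in parts.path.split("/") if segment]


def truncated_url(parts, segments, depth):
    """Build a shorter URL that keeps only the first `depth` path segments."""
    return f"{parts.scheme}://{parts.netloc}/" + "/".join(segments[:depth]) + "/"


year_url = truncated_url(parts, segments, 1)
month_url = truncated_url(parts, segments, 2)
print(year_url)
print(month_url)
```

Cutting the path back one segment at a time is exactly the kind of guess a human makes when editing a pretty URL by hand.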
Now not every URL starts with a “//” — there are also URLs that don’t contain a domain but instead just a path to a resource. Consider for instance the following http:/ — where the resource pointed to by this URL is located depends on the context in which it is encountered. This is an example of a relative URL. It points to a resource within the context of another resource. If you are reading this in the context of the Tumblr dashboard, the link will take you to your dashboard. If you are reading this on my blog, which is at the domain “continuations.com” it will take you to the home page of my blog. Relative URLs allow for more compact expression of the location of a resource but they can also introduce interesting errors. For instance, think about what resource that relative URL will point to if you simply copy it and send it to someone via email and they open it in a web mail client!
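How a relative URL gets resolved against its context can be tried out with urljoin from Python’s standard library. The base URLs below are hypothetical pages chosen for illustration, and I am assuming a relative URL consisting of just a slash, plus one relative path for comparison.

```python
from urllib.parse import urljoin

# Hypothetical pages on which the same relative link might be encountered.
blog_page = "http://continuations.com/some-post"
webmail_page = "https://mail.example.com/inbox/message/42"

# A relative URL of just "/" points at the root of whatever domain it is read on.
on_blog = urljoin(blog_page, "/")
in_webmail = urljoin(webmail_page, "/")

# A relative path without a leading slash replaces only the last path segment.
sibling = urljoin("http://continuations.com/a/b", "c")

print(on_blog)
print(in_webmail)
print(sibling)
```

The same relative link lands in two completely different places depending on the page it is resolved against, which is the source of the copy-and-paste errors mentioned above.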
This post is getting quite long and I haven’t yet covered fragment identifiers or query strings. Instead of going on, I will cover fragment identifiers in the context of HTML and query strings when describing how URLs can be used to transmit additional information that can be used by the server in deciding how to respond to the request to the resource, so keep following Tech Tuesday!
Tags: tech_tuesday url web
Tuesday, January 10, 2012
Tech Tuesday: How The Web Works (Overview)
As promised at the end of last year’s Tech Tuesday, we are starting this year with a cycle on how the web works. Just as a reminder, Tech Tuesday’s aim is to require no previous knowledge other than what has been covered before. So this overview may be trivial for some readers but I wanted to make sure to bring everyone along.
Let’s assume you have fired up your favorite web browser. Now you type the address “dailylit.com” into the address bar (if you always go to web sites by typing their name into a search engine, I urge you to discover the address bar and type in the address).
What happens now? How does the browser go from a web address for a site to that site’s content on your screen? That turns out to be an amazingly complex series of steps:
Step 1: The address “dailylit.com” is part of what is known as a URL. The full URL is “http://dailylit.com” and your browser automatically pre-pends the “http://” to save you the typing. The HTTP bit indicates to the browser which protocol to use to speak to the server (more on that in Step 3 below). Typing a URL into the address bar starts the same sequence of steps as if you had clicked on a link (e.g. among a set of search results) pointing to the same location as in DailyLit. In the very first step the browser “parses” the URL (meaning it takes apart the URL into its various parts) in order to determine where it is supposed to look for content.
Step 2: The “dailylit.com” portion of the URL is the domain name (you can get your own from a domain registrar). Think of this much like the name of a person. If you want to call a person on the phone you need to look up their phone number based on their name in some phone book (e.g. the contact list on your cell phone or in the dark ages some paper book made from dead trees). Similarly in order for your browser to retrieve the content from DailyLit, it needs to first lookup the IP address of the server on which the content lives. This is done by consulting a “phone book” known as DNS which stands for Domain Name System and is a near miraculous invention.
Step 3: Now that the browser has an IP address, in the case of DailyLit currently 72.32.133.224, it makes a request to “GET” content from 72.32.133.224. GET is capitalized and in quotes here because it is one of several defined requests supported by the so-called Hypertext Transfer Protocol or HTTP, which is what the beginning part of the URL indicated. In essence this request simply says GET me the content that resides at 72.32.133.224. This is the protocol that got started with Tim Berners-Lee’s work in the very late 80s and early 90s and is to this day the underpinning of the interaction between web browsers and web servers.
Step 4: Through the magic of Internet networking, that GET request is routed via a whole bunch of intermediary devices (routers, switches, firewalls, load balancers oh my!) to the machine with the IP address. In fact, you can look up for yourself how many intermediate hops exist between you and the server and that’s something we will do in an upcoming Tech Tuesday.
Step 5: We are now on the Server. The server receives the incoming GET request. On the server machine the work is co-ordinated by a program known as a web server (something like Apache or NGINX). The web server retrieves the contents for the page and starts sending them back to the browser again over the Internet. Important side note: because the Internet is packet switched, the content is cut up into smaller parts (packets) that may travel different routes to get back to the browser. All of that cutting up and re-assembling is handled by lower levels of the network and is transparent to both the web server and the web browser. That too is one of the many awesome features of the Internet that we easily take for granted.
Step 6: Back at the browser. The browser is receiving the content in the form of an HTTP Response. That response contains a bunch of different stuff. For instance, it contains a so-called HTTP Response Status Code to indicate to the browser whether the server thinks it has some useful information [Response Code 200 OK]. If the server had a problem, e.g. it didn’t find any content for this URL, it will send a different code, such as the famous 404 Page not Found. The browser needs to start parsing the response to figure out what to do next. That will in all likelihood include many additional requests by the browser to the same and possibly other servers to retrieve content that was referenced in the initial response, such as CSS and Javascript files. Every one of these additional requests involves all the steps from 1-6 AGAIN!
Step 7: Even while it is still waiting for the responses from these additional requests (and possibly even more pieces of the original request) to arrive the browser will start to figure out how to render the content that it has received on the screen. That means figuring out what to show where, which is made incredibly complex by the interaction between the HTML (roughly: the content itself), the CSS (roughly: the styling or look and feel of the content) and the Javascript (roughly: the dynamic behavior of the content). This work involves a so-called rendering engine and also a full fledged computer language interpreter (for Javascript).
Step 8: The browser continues to execute the Javascript code (which might, for example, animate an object to move across the page) while at the same time waiting for input from you. For instance, when you hover with the mouse above a link that might change the look and feel of that link. In the early days of the web, the most that would happen now is that a click on a link will start the whole process over at Step 1 for the next page. Today, however, many additional requests to the web server may occur without the page ever refreshing as new content is dynamically fetched and added to the existing page and other content written back to the server.
In the upcoming Tech Tuesdays we will look at each of these steps in some detail, starting with the anatomy of a URL next Tuesday. In the meantime, I hope I have managed to convey some of the amazing complexity that is involved in something that we now take for granted and people interact with billions of times every day around the world. And all along you should keep in mind that I haven’t even mentioned any of the complexity behind the scenes, such as the browser interacting with the computer’s operating system to make all of these steps happen.
Tags: tech_tuesday web overview
Tuesday, December 20, 2011
Tech Tuesday: Survey Says!
Thanks to everyone who participated in the Tech Tuesday survey. I learned a lot. First off, not surprisingly, my audience overall is fairly tech savvy, with two thirds knowing at least how to code up HTML and over half having done some programming.

Second, I was happy to find that the level of difficulty was about right and that if anything I should maybe add some harder “bonus” material.

Third, post length seems spot on with nearly 80% liking it as it is and roughly 10% each asking for shorter and for longer, so I will stick with roughly the existing length.

Fourth, it appears that all three proposed topics are of essentially equal interest (more on that claim below). The only thing that is clear is that all three of these are in fact good topics for my audience as only 10% wanted to hear about something different altogether.

Now I am going to restart Tech Tuesday next year with a sequence on Web Technologies. It does seem like a timely topic that would be useful for everyone including Congress to understand. Not that I expect to have members of Congress in the audience, but once the series is up, I will include a link to it whenever I write to one of my representatives.
So I claimed above that there really isn’t a meaningful difference between the votes for each of the topics. How do I figure that? Let’s leave out the “other” category and focus on the three remaining ones. There were 78 respondents for those categories, which conveniently happens to be a multiple of 3. A perfectly even distribution would be 26 for each of the three possibilities which would clearly provide no signal about preference.
So how much of a preference signal am I getting from the actual voting outcome of 24, 26, 28? One way to think about this is as follows. If people had no preference and were simply throwing darts, how often would the outcome show a difference of 4 or more between two topics? Now there is a mathematical way of figuring that out, but that’s a bit involved. Instead, I wrote a short piece of code to run a Monte Carlo simulation (you can look at the HTML for this post and it includes the Javascript code).
I could probably use a better random function than what’s built into Javascript but this should work as a first cut. When you run the code, you will see that around 87% of the time this size difference occurs “naturally” on a purely random selection among 3 topics. So not a lot of signal here!
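For readers who would rather not dig through the page source, here is a rough sketch of the same idea in Python. It is not the Javascript from this post; the function names, the number of trials and the seed are my own choices for illustration. The second function is a brute-force cross-check that simply adds up the probability of every possible outcome with a gap of 4 or more, which sidesteps the more involved math.

```python
import random
from math import factorial

RESPONDENTS = 78   # votes cast for the three topics
TOPICS = 3
OBSERVED_GAP = 4   # 28 - 24 in the actual outcome

def simulate(trials=20000, seed=2011):
    """Estimate how often random voting yields a gap of at least OBSERVED_GAP."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        counts = [0] * TOPICS
        for _ in range(RESPONDENTS):
            counts[rng.randrange(TOPICS)] += 1   # one dart thrown at random
        if max(counts) - min(counts) >= OBSERVED_GAP:
            hits += 1
    return hits / trials

def exact():
    """Cross-check: sum the probability of every outcome with a large enough gap."""
    ways = 0
    for a in range(RESPONDENTS + 1):
        for b in range(RESPONDENTS - a + 1):
            c = RESPONDENTS - a - b
            if max(a, b, c) - min(a, b, c) >= OBSERVED_GAP:
                # Number of ways 78 voters can split into groups of a, b and c.
                ways += factorial(RESPONDENTS) // (
                    factorial(a) * factorial(b) * factorial(c)
                )
    return ways / TOPICS ** RESPONDENTS

print(f"simulated: {simulate():.3f}")
print(f"exact:     {exact():.3f}")
```

Both numbers should land close to the figure quoted above, and close to each other. If they did not, that would be a hint that either the random number generator or the code is off.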
Tags: tech_tuesday survey results
Tuesday, December 13, 2011
Tech Tuesday: Asking for Direction
So I got up earlier than usual this morning to work on a post about more details on how CPUs work and some actual assembly language. In particular I was planning to introduce the notion of registers and maybe the stack (all of this using a neat web-based 6502 emulator).
But then it occurred to me that I really don’t have a good sense of whether this is what folks want to read about. So I figured I would provide a quick recap instead and then lay out some possible directions.
So far Tech Tuesday has covered:
1. An Overview of Building Blocks
2. Of Bits and Bytes (Binary Number System)
3. A First Look at the Central Processing Unit (CPU)
4. Main Memory (Dumb, Lazy and Slow)
5. Storage (Oh My, How It Has Grown)
6. No Computer is an Island (Networking)
7. Input/Output (Interrupts and Queues)
8. Operating Systems (Making It All Work)
9. Programming (A Start)
From here there are a great many places to go. So let me describe three different possibilities for what’s next and see if there are any strong preferences.
Programming Basics: This would take a language such as Javascript and introduce basic programming concepts such as variables and control structures (e.g. branching, loops). Would probably leverage Codecademy.
Lower Level Programming: This is sort of where I was headed before deciding to ask for direction. The sequence would start by taking a look inside the CPU and its registers. We would then examine some assembly code and work our way up towards programming in C.
Web Technologies: This would be a series of posts covering what HTML is, how the HTTP request cycle works, and how domain names are resolved to IP addresses. I might also throw in some CSS and Javascript.
I am planning to cover all of these topics and many more eventually, so this is not a question about whether or not any of these should be part of Tech Tuesdays, but rather about what to cover next. To get help with this I am trying to get a better sense of the audience currently reading Tech Tuesdays and what you are all interested in. There are only four questions. The more people answer, the better I can make the series.
Tags: tech_tuesday reader_survey
Tuesday, December 6, 2011
Tech Tuesday: Programming (A Start)
Maybe I should have started the whole Tech Tuesday series with a post on programming since that’s why computers were created in the first place! In fact, thinking about programming in many ways precedes the availability of actual computers to carry out those programs. At the time that Babbage was dreaming up his Analytical Engine, Lady Ada started to formulate how a general purpose machine would be programmed. That was almost 100 years before the first truly programmable machines were actually built! Much closer to that date but still before he had access to a computer, Alan Turing in 1936 described an abstract machine (the Turing machine) that can compute anything a computer can compute, no matter how fast or complex a CPU, how much memory, etc. it has (aside: a quantum computer, if we ever figure out how to make one work, might be able to do some things dramatically faster).
So what does it mean to program a computer? Somewhat flippantly: programming is telling the computer what to do. But given the pieces that we have put in place we can define programming more precisely as: creating a set of instructions that the CPU can execute to achieve a desired outcome. That outcome might be the computation of a number, the animation of an object on the screen, the manipulation of a text or — and this is the beauty of programming — pretty much anything else one can dream up. In the process of executing the program, the various parts of the computer work together as specified by the program. Data will move around memory and maybe to and from storage. If necessary, I/O devices will be activated. Possibly data will be sent or received via a network.
How does a programmer go about creating the set of instructions? In the early days of computers this literally involved hand picking instructions from the CPU’s instruction set and manually encoding these so that they could be fed to the CPU. But because of the work of theorists and the desire of visionaries we rapidly wound up with programming languages that were more easily accessible to humans and could then be translated by the computer itself into the instructions for the CPU. One such vision had always been to program a computer using simply spoken language and with Siri and Android voice actions we now have that as a reality — people are quite literally telling their phone what to do.
Whenever you program a computer in anything other than the actual machine code (the bytes that represent the instructions and addresses) you are using some kind of programming language. So-called assembly language is barely above machine code. It is mostly a set of mnemonics (short names) for the instructions with some ability to refer to program and memory locations by a name as opposed to an actual address. A program called an assembler is used to translate assembly language into machine code. Because writing assembly really means picking instructions by hand, it takes a long time to write programs this way. In return it affords the ultimate control over what code is actually executed, which can be important in some cases, such as parts of a device driver.
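Here is a toy sketch, in Python, of what an assembler does. The five opcodes are real 6502 instruction encodings, but everything else (the assemble function, the tiny opcode table, the symbol named result and its address) is invented for illustration. A real assembler handles far more instructions, addressing modes and error cases.

```python
# Opcodes for a few real 6502 instructions; the table is deliberately tiny.
OPCODES = {
    ("CLC", "implied"): 0x18,
    ("LDA", "immediate"): 0xA9,
    ("ADC", "immediate"): 0x69,
    ("STA", "absolute"): 0x8D,
    ("RTS", "implied"): 0x60,
}

def assemble(source, symbols):
    """Translate toy assembly text into a list of machine code bytes."""
    code = []
    for line in source.strip().splitlines():
        parts = line.split(";")[0].split()   # drop the comment, split the rest
        if not parts:
            continue
        mnemonic = parts[0].upper()
        if len(parts) == 1:                  # instruction without an operand
            code.append(OPCODES[(mnemonic, "implied")])
        elif parts[1].startswith("#"):       # literal value, e.g. #2
            code.append(OPCODES[(mnemonic, "immediate")])
            code.append(int(parts[1][1:]) & 0xFF)
        else:                                # named memory location
            address = symbols[parts[1]]
            code.append(OPCODES[(mnemonic, "absolute")])
            code.extend([address & 0xFF, address >> 8])  # low byte first
    return code

program = """
    CLC          ; clear the carry flag
    LDA #2       ; load the number 2
    ADC #3       ; add 3
    STA result   ; store the sum at the named location
    RTS          ; return
"""
machine_code = assemble(program, {"result": 0x0200})
print(" ".join(f"{byte:02X}" for byte in machine_code))
```

Each mnemonic turns into one byte, followed by whatever operand bytes that instruction needs. The name result is looked up and replaced by an actual address, which is exactly the small convenience that assembly offers over raw machine code.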
Anything that’s more expressive than assembly is generally referred to as a higher level language. Among higher level languages there is still a huge range, though, from a language such as C at the end closest to the machine to a language such as Prolog at the other (Prolog deals with logical expressions). Higher level languages require some form of translation into machine code. This is handled by programs known as interpreters and compilers. As a first cut you can think of the difference between an interpreter and a compiler as the difference between having a simultaneous translator and a translated book. Essentially an interpreter reads the higher level language program as it comes along and figures out what to do one statement at a time, whereas a compiler takes one or more passes over the entire program and translates it before any of it runs.
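The difference can be sketched with a made-up mini language whose statements are things like "add 2" or "mul 10". Everything here (the language, the function names) is an assumption for the sake of illustration. Note also that this "compiler" produces a Python function rather than machine code, so it only mirrors the idea of translating the whole program ahead of time.

```python
import operator

# The three statements our made-up language understands.
OPS = {"add": operator.add, "mul": operator.mul, "sub": operator.sub}

def interpret(lines, value=0):
    """Read one statement at a time and act on it immediately."""
    for line in lines:
        name, operand = line.split()
        value = OPS[name](value, int(operand))
    return value

def compile_program(lines):
    """Translate the whole program up front into a reusable function."""
    steps = []
    for line in lines:
        name, operand = line.split()
        steps.append((OPS[name], int(operand)))   # parsing happens once, here

    def compiled(value=0):
        # Running the result no longer involves looking at the source text.
        for op, operand in steps:
            value = op(value, operand)
        return value

    return compiled

program = ["add 2", "mul 10", "sub 1"]
print(interpret(program))        # translate and execute in one go
run = compile_program(program)   # translate first ...
print(run(), run(5))             # ... then execute as often as needed
```

The interpreter has to look at the source text every time the program runs. The compiled version pays the translation cost once and can then be run over and over.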
In order for an assembler, interpreter or compiler to be able to do its work, the expressions in assembly or in the higher level language have to follow specific patterns which are known as syntax. That is of course true even when programming a computer in natural language, as in the Siri example above. If you say something completely ungrammatical, Siri will not know what to do.
I have found programming to be a deeply satisfying activity and will write lots more about it in upcoming Tech Tuesdays. When programming I can spend many hours without noticing the passage of time at all. Part of the satisfaction for me comes from how programming is a craft that combines writing and analysis/math in a wonderful way. But part of it also comes from the amazing amount of control I can exercise over machines which contrasts sharply with the many limits on control in the rest of our lives!
Tags: tech_tuesday programming
Tuesday, November 29, 2011
Tech Tuesday: Operating Systems (Making It All Work)
I ended last week’s Tech Tuesday on Input/Output saying I might write next about programming in assembly language, but it has since occurred to me that I should cover at least one other high level topic first: operating systems. That seems more logical since the operating system (OS) is what makes all the parts of a computer system work together. In fact, I alluded to that last week in the context of the device drivers that are used by the OS to get data to and from peripherals such as a keyboard or a monitor.
Without an OS any computer system is really just a bunch of parts that are fairly useless by themselves. The OS handles such critical tasks as scheduling which program to run, reading from and writing to files (in storage), communicating over networks, and accessing peripherals. Today most people have at least a couple of different operating systems in their lives because in addition to their personal computer they also have a smart phone, which is really just a portable computer. You might have a Windows laptop and a Blackberry OS phone. Or an OS X Macintosh and an iOS iPhone or iPad. Or a Linux laptop and an Android phone. Or some other combination of these or some other OS altogether. And of course when you are accessing servers you are talking to another set of OSes such as Windows Server or Linux or some UNIX variant. Along the way some of the networking gear, which is really made up of custom computers, has its own OS, such as Cisco’s IOS (yup, just off by a capitalization) or Juniper’s JUNOS.
Now if you look at the list of things that the OS is responsible for at the beginning of the previous paragraph, you may wonder: if the OS is responsible for scheduling which program to run and if the OS deals with reading files, then how in the world does the OS itself get into the computer and start running when I turn the computer on? That turns out to be an important question. In the early days there was a simple answer: the OS was stored in so-called read only memory (ROM) and when the computer started the program counter of the CPU pointed to the beginning of the code for the OS. ROM is a type of memory that retains its contents even without power but cannot ever be changed (written), hence the name.
But a modern OS is huge and also changing relatively frequently with new versions being released, so storing it all in ROM doesn’t work. Instead, what happens is that a tiny bit of code is loaded with the sole purpose of then loading more code which then loads the operating system from disk. That process is known as booting, which is derived from boot strapping which carries with it the wonderful image of pulling oneself over some obstacle by one’s own boot straps (or in the case of the Baron von Muenchhausen pulling oneself out of a swamp by one’s pigtail which in turn reminds me to write a post about Terry Gilliam some day). That tiny bit of code that gets loaded at the very beginning is known as the boot loader and was referred to in the comments on the previous Tech Tuesday!
If you are still skeptical (as you should be based on the Muenchhausen reference) you might say, but how does the boot loader get data from the disk? Didn’t you say last week that disk IO is handled by the OS? Ah, I failed to mention that thing known as the Basic Input Output System (BIOS). It is a bunch of code that does in fact sit in ROM or these days often Flash memory. It allows for primitive input output usually with a monitor (just characters, no graphics), a keyboard, a disk, and more recently also the network. Without that “hard wired” code the whole boot process could not work.
Every modern operating system consists of two fundamentally different parts: the so-called kernel and everything else. As the name suggests, the kernel is central and is where all the really hard stuff happens, such as switching between different programs or writing data to an output device. By everything else I mean programs such as utilities for searching through files (e.g. grep). I drew this distinction somewhat vaguely on purpose, because if you look through the Wikipedia page on kernels you find that a lot of OS design and development work has gone into figuring out what to put into the kernel. No matter where the line is drawn, the programs that end users run on top of the operating system are only allowed to access memory in what is known as user space or sometimes user land, which is kept separate from kernel space. Only code executing inside the kernel can access kernel memory.
Kernel programming is some of the most difficult programming there is. At Harvard there was an amazing class called CS 161 in which students essentially wrote an OS kernel from scratch. CS 161 even made it into the Social Network. For anyone truly interested in learning how an OS works nothing compares to going through that process, and I would highly recommend getting Andrew Tanenbaum’s Operating Systems: Design and Implementation and starting to code away (and/or getting a copy of Minix to play with). In looking for a link to CS 161, I am somewhat dismayed to see that it seems CS 161 was last taught in 2009. Can this really be true? Maybe it is because the class was so horrifically time intensive.
Tags: tech_tuesday operating_system