Posts tagged web

Tech Tuesday: Web Browser (Part 2)

Today is the wrap up of the initial web cycle.  Last Tech Tuesday, I introduced Javascript as determining the behavior of a web page and interacting with HTML (the content of the page) and CSS (the look and feel).  Now it makes sense to revisit the Web Browser, which needs to put all those pieces together to actually display the web page and let the user interact with it. By now it should come as no surprise that this is in fact quite a difficult job and that the web browser is a rather elaborate piece of software.

Most of us spend a big chunk of our day using a web browser and take it pretty much for granted that we can go to any URL and have the contents appear quickly.  But as we have seen in a first step the web browser has to assemble all the pieces for the page, which generally involves many HTTP requests to retrieve different files containing various bits of HTML, CSS, Javascript and images.  Once all the pieces are in place, the really tough part starts, which is figuring out what should actually be on the screen.

Why is this tough?  Because everything interacts with everything else. Styles can be found mixed in with the HTML and in separate files and there can be multiple declarations that all need to be applied to the same element.  Furthermore, the Javascript can dynamically change not just which styles should apply but can even add new HTML elements on-the-fly.  Yes, the HTML that was retrieved from the web server can be changed by the Javascript.  And of course once a new HTML element is introduced that needs to be styled too.  And a styled HTML element can wind up pretty much anywhere on the page where it can (partially or completely) overlap other elements.

Once you start to look under the hood like that you get to a point where you realize that it’s a small miracle that any of this works at all.  The combination of technologies is incredibly powerful but also incredibly messy behind the scenes.  And given that complexity it should not come as a shock that pages often look slightly different in different browsers.  Yes there are standards but those have been evolving and browser developers have to make choices when they implement these.  It is infinitely easier to critique the particular choices made than to build a well working web browser.

Fortunately, to help deal with these cross-browser differences, people have come up with so-called libraries and frameworks that mostly hide these issues from the designers and programmers building web pages.  Recently some developers from Twitter have taken different frameworks and pulled them together into something called Bootstrap, which brings together standardized pieces of HTML, CSS and Javascript (including the popular jQuery library) to make it easy to create beautiful and functional pages across not just different browsers but also different devices (computer, tablet, phone).

That’s it for our initial pass over the web cycle.  Now I need to decide whether to tackle some more involved web topics, such as the Document Object Model or to return to more general computing topics.  As always, suggestions are welcome!

Posted: 27th March 2012
Tags:  tech tuesday web browser

Tech Tuesday: Javascript

The topic of last week’s Tech Tuesday was CSS, which determines the look and feel of the content of a web page (where the content itself is described in HTML).  Today we will learn about Javascript, which is a programming language that lets us control the behavior of the web page (both by itself and in reaction to what the user does).  Together HTML, CSS and Javascript determine what a web browser does in Steps 7 and 8 of the web cycle.

What is a programming language?  It is a language that lets us tell a computer what to do, which is really all that programming is.  Javascript was originally developed specifically for telling a web browser what to do, but it is a completely general programming language and has more recently been used to program servers.

Let’s jump into a simple example that will allow us to illustrate the basic idea.  Say we have a web page with a list of headlines.  For each headline we also have a summary of the story but we only want to show that when someone clicks on the headline so that we can show more headlines and make the page less cluttered.  Now we could make each headline a link and set off a web request cycle to go to a new page for that headline but that would be slow and clumsy.  As an alternative we could put the headlines at the top of the page and the summaries below and use URLs with fragment identifiers to jump to the summary but that too would make the page scroll around like crazy.

With Javascript we have another alternative.  We can keep each summary below its headline but have the summary be hidden and show it only when someone clicks on the headline.  How would that work?  Let’s first take a look at how we might structure the HTML, i.e. the content itself:

<h1>First Headline Goes Here</h1>
<p class="summary" id="first">First summary goes here</p>
<h1>Second Headline Goes Here</h1>
<p class="summary" id="second">Second summary goes here</p>

This looks very similar to the HTML we have seen with some extra attributes thrown in which we will use in a second.  In order to keep the summaries hidden when the page first loads, we will use some CSS as follows

.summary { display: none }

The .summary is a CSS selector that picks out all the elements on the page that have the class="summary" attribute set.  By styling these elements with “display: none” we get the web browser not to display them.  But they are still part of the HTML and so they are sitting right there on the page, just not visible.

Now we need to sprinkle on a tiny bit of Javascript to let us make a summary visible when the headline above it is clicked.  Here is what that might look like:

<h1 onClick="document.getElementById('first').style.display='inline';">First Headline Goes Here</h1>

We have added a so-called Javascript event handler to the h1 element. I will explain how this works step by step.  onClick is an attribute that lets us set the Javascript code to be executed when the text of the h1 element is clicked by the user.  The code itself consists of a single expression which modifies the value of the CSS display property of the summary by setting it to the new value “inline”.  As soon as that happens, the web browser takes the previously hidden summary and displays it.

We can unpack the Javascript expression a bit more.  Document refers to the HTML  (technically it refers to something called the Document Object Model or DOM but we will get to that in a future Tech Tuesday).  The getElementById(‘first’) then gives us access to the element with the specific id attribute of first, which is the <p> containing our first summary.  And finally .style.display refers to the display style of that element.

Now this is not what you would actually do on a production web site, but you get the basic idea.  In reality you would use a library such as jQuery to deal with issues of browser compatibility and to write more elegant code that doesn’t have to be repeated for each summary.  Finally, much like with CSS you would not throw the Javascript right into the HTML but separate it out into its own file which the browser requests separately from the web server. 
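
Just to give a flavor, here is a minimal sketch of what that might look like with jQuery, assuming the same HTML structure as above (illustrative rather than production code):

$(function () {
    // when any headline is clicked, show the summary that immediately follows it
    $('h1').click(function () {
        $(this).next('.summary').show();
    });
});

A single handler like this covers every headline on the page, so the code does not have to be repeated for each summary.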

Now if you want to learn Javascript, I highly recommend you head on over to Codecademy (a USV portfolio company) where you can jump right in using just your web browser. It will take you through a series of simple lessons that have you writing code from the very first moment.  Next Tuesday, I am planning to wrap up the web cycle by revisiting the web browser.

Posted: 20th March 2012
Tags:  tech tuesday web javascript

Tech Tuesday: CSS

In last week’s Tech Tuesday we learned about HTML which is used to describe the content of a web page. Today, we will inspect something called Cascading Style Sheets (or CSS for short) which determines what that content looks like. All of this belongs to Step 7 of the web cycle where the web browser takes the information retrieved from a web server to display the page to the end user.

For our purposes today we will look at a three item list.  You may recall that the list is defined by the <ul> and </ul> tags and each item is enclosed with <li> and </li> tags.  Without providing any additional information, the list will display something like this:

  • First list item
  • Second list item
  • Third list item

Now that’s a pretty boring looking list.  What if we wanted a list without bullets and instead have each item in a blue box with white text?  How would we tell the web browser to display our list that way? In order to do that we need to apply so-called “styles” to the HTML elements.

In the early days of web design and web browser technology (pre-CSS), there was already some support for styling elements but it was done by adding “style” attributes to the HTML elements themselves.  So for instance, I can get white text on a blue background by adding a style=”color: white; background-color: blue” attribute to each of the <li> elements in the list. I would love to show what this looks like, but as it  turns out, I can’t insert a style attribute in the post (Tumblr’s HTML editor throws these attributes out).  So a description of how it works will have to do.

Now there are several things wrong with the approach of adding style attributes directly to the HTML elements (known as an “inline” style).  First, all those style attributes would really clutter what your HTML looks like and make it much harder to read.  Second, what if you later want to change what the list looks like?  You would have to go back and change the style attribute for each list element.  Now imagine a web site with hundreds of lists on it.

So that’s where CSS comes in.  CSS allows style information to be separated out from the HTML using a very simple yet powerful syntax: one or more selectors are followed by a list of properties and their values.  Huh? I will explain each of these in turn.  A selector determines which HTML elements the style should apply to.  So for instance, the “li” selector applies to all <li> elements.  A property is something like “color” or “background-color” and the value is what you want that property to be, e.g. “white.” The list itself is enclosed in curly braces and properties are separated by semicolons.

So here is the CSS expression that would make every list item have white text on blue background:

li {color: white; background-color: blue}

But where does that expression go? There are two possibilities.  We could embed it in the HTML by enclosing it with <style> tags as in <style>li {color: white; background-color: blue}</style>.  That would still mean that for a site with many pages we would have to copy this style section and add it to every page.  Thankfully we can instead separate all the styles out into their own file and only include a reference to that file as follows

<link rel="stylesheet" href="http://example.com/css/style.css" type="text/css" />

Generally these references to external style sheets go into the <head> section of an HTML page.  The web browser then makes a separate HTTP request to retrieve the style sheet.  Now we can gather up all of our styles in one place and just add this reference to every page.

Now despite its deceptively simple syntax, CSS is incredibly powerful and can dramatically change the look and feel of a site.  For instance, all Tumblr themes work via CSS.  For a dramatic illustration you can head over to the CSS Zen Garden and explore a single site whose look and feel changes completely based on nothing but CSS. This is possible because of the many different types and combinations of element selectors supported by CSS.
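
For instance, a minimal style sheet for the bullet-free list with white text in blue boxes described above might look something like this (the spacing values are just placeholders):

ul { list-style-type: none; padding: 0 }
li { color: white; background-color: blue; margin: 5px; padding: 5px }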

If you want to learn more about how to actually style a list there is a great introduction that ends with a list with white text on blue background (as I had promised above).  Next Tech Tuesday we will look at how to add behaviors to a web page using Javascript, which is a programming language.

P.S. Curious why it is called “Cascading”? Because you can have multiple CSS selectors referring to the same HTML element resulting in a “cascade” of styles for that element.

Posted: 13th March 2012
Tags:  tech tuesday web css

Tech Tuesday: HTML

Over the next three Tech Tuesdays we will cover the three essential technologies that together make up the bulk of most web sites: HTML, CSS and Javascript.  This is Step 7 of the web cycle where the web browser uses these three (which were all retrieved from one or more web servers) to construct a web page.  The easiest way to think of how these three make up a page is as follows: HTML is the content of the page, CSS is the look-and-feel of the page and Javascript is the behavior of the page (i.e. how the page acts on its own and in response to events such as mouse clicks).

HTML stands for HyperText Markup Language and it was part of Tim Berners-Lee's original design for the World Wide Web.  As a markup language, HTML consists of a series of tags which “mark up” the content, i.e., the tags indicate what the different parts of a page are.  For instance, here is a list in HTML:

<ul>
    <li>First list item</li>
    <li>Second list item</li>
    <li>Third list item</li>
</ul>

The tags are the parts enclosed in the so-called “angle brackets” - the “<” and the “>”.  The “<ul>” tag starts the list (the “opening tag”) and the “</ul>” tag ends it (the “closing tag”).  Every list item starts with “<li>” and ends with “</li>”.  You can see here that generally speaking HTML consists of opening and matching closing tags.

There are tags to describe various content elements such as headings, paragraphs and lists.  There are also tags that relate directly to the HTTP protocol.  Most important of these is the hyperlink.  Here is an example:

<a href="http://www.dailylit.com">Visit Dailylit!</a>

Here you see that the opening tag contains a so-called attribute with the name href.  The value of the href attribute is the URL that the link points to.  The text which will be displayed for the link is “Visit Dailylit!”  Here you can see the whole thing in action: Visit Dailylit!

So far I have shown what are called HTML fragments — pieces of HTML that together form the contents of a page.  The overall structure of an HTML page looks roughly as follows:

<html>
    <head>
        <title>Hello World!</title>
    </head>
    <body>
        <h1>Hello World!</h1>
        <p>This is a simple HTML page.</p>
    </body>
</html>

This is a very simple page that illustrates the basic structure, including the head and body sections.  The title in the head section is what shows up as the title of the window or tab when the page is displayed in a web browser.  The body section has the actual contents of the page itself.

There are quite a few different tags to learn about, such as tags to describe forms with labeled input fields, tags for showing images, and more (a small example of both follows below).  You can learn more about the available tags at W3Schools where you can try them out right in your browser.  Next Tech Tuesday we will look at how CSS can be used to change the look and feel of the content described by the HTML.
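
For instance, an image tag and a tiny form might look something like this (the URL, form action and field name are made up for illustration):

<img src="http://example.com/images/photo.jpg" alt="A photo" />

<form action="/signup" method="post">
    <label>Email: <input type="text" name="email" /></label>
    <input type="submit" value="Sign up" />
</form>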


Posted: 6th March 2012
Tags:  tech tuesday web html

Bringing Time Back to the Web (Or: The Struggle for Depth)

Evan Williams apparently recently said that there is an issue with all of us being stuck in a kind of “continuous present” on the web (ironically, I can’t find that quote right now).  I am certainly stuck in that all powerful present many days.  There is so much new output hitting the web every day that one can barely scratch the surface of it, let alone delve into the past. Google has only aggravated this problem by tilting their search algorithm more heavily towards recency.  Techmeme — one of my daily go-to sites — only aggregates the day’s output.

The power of the present is another example of a type of “filter bubble.”  And just like I have called for an “opposing views reader”, what we need to do is surface time explicitly.  I am not a fan of Facebook by any means, but timeline may turn out to be an important contribution to the future of the web.  Similarly there is something quite magical about Timehop as a way of bringing our own past back to us.  Just the other day my Timehop email reminded me that a year earlier we had picked up a dog from a shelter.

Now imagine a version of Techmeme that links today’s topics to their historical precedents using a kind of timeline view.  Or think of a search engine that adds a time dimension to the results navigation — so that instead of having to explicitly ask for older content you can just “scroll” into the past.  Thinking about this has given me a whole new appreciation for the importance of what Brewster Kahle and the team at the Internet Archive are working on.

P.S. The thoughts here were inspired by an interesting conversation I had yesterday with Nick Hasty from Rhizome, which has another interesting archive in Artbase (thanks, Nick).


Posted: 2nd March 2012
Tags:  time web search present past

Tech Tuesday: Web Browser (Part 1)

We have now reached step 6 of the web cycle and are back at the web browser.  The web browser is receiving an initial HTTP response from the web server. The first part of this response tells the web browser whether or not the web server is delivering some useful information.  That happens via the response status, which will be 200 OK if the HTTP request is being properly fulfilled.  The second part consists of additional so-called response headers which provide information about the content that is contained in the response.  And the third part is the actual content.

For today, we will examine the case where the content consists of a web page in HyperText Markup Language or HTML. We will have an entire Tech Tuesday on just HTML in the future but for today we will focus on just one aspect of HTML which is critically important to understanding the web cycle: the (frequent) need for additional requests to be made to the same and/or other web servers. Leaving aside the “how” for a moment, HTML describes what is on a web page.  On my blog here the page consists mostly of text, but in the sidebar titled Shapeways, you are seeing several images of 3D printed objects.

When your web browser made its first request to this page, the HTML that came back did not also contain those images!  Instead, it contained information that told your web browser to go to a different URL from the URL of the page and request those images there.  In fact, every image that you see on a page (generally) requires a separate HTTP request to fetch the data for that image.  Each of those requests invokes the entire web cycle that we are just going through.  Now I just counted and an initial request to the front page of Continuations spawned an additional 53 (!) requests to get all the pieces needed for that page.

How did I determine that?  I use Google Chrome as one of my web browsers and it has a set of developer tools built in that let you inspect what happens when a page is loaded.  To see these go to the “View” menu and select “Developer Tools” from the “Developer” sub menu (these are the Mac OS X instructions but I am assuming Chrome for Windows has a similar menu). This will give you an additional window.  Select the “Network” button, go back to your primary browser window and reload the page.  You should now see something similar to this:

This is a list of all the requests the browser is making to get the pieces necessary for this page.  For each request you can see what was requested from where and how long that request took to complete. Apple’s Safari web browser has a similar developer menu built in as do the latest versions of Firefox.
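
Where do all of those requests come from? Each of the following kinds of tags in the HTML (the URLs here are made up for illustration) tells the browser to go and fetch yet another resource:

<img src="http://example.com/images/printed-object.jpg" />
<link rel="stylesheet" href="http://example.com/css/style.css" type="text/css" />
<script src="http://example.com/js/code.js"></script>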

What is the takeaway from today?  Even before it gets around to figuring out how to display all the content that it has received, a web browser does an awful lot of work just sending out requests to receive all the various pieces of a web page.  These requests are sent more or less in parallel (as the network panel shows) because if they were sent one after the other, the page would take a very long time to load. In turn each of these requests hits a web server somewhere that needs to respond.  All of that runs over the network.  We take the fact that a web page loads quickly almost for granted these days and yet fundamentally the fact that it loads at all given everything that goes on is quite remarkable.


Posted: 28th February 2012
Tags:  tech tuesday web browser

Tech Tuesday: Web Servers

If you have been following along, we are now on Step 5 of the web cycle where we find ourselves at the server that is answering an HTTP request.  Because the HTTP protocol is well defined, in theory anyone can implement a web server.  In practice these days most people run one of a small handful of servers, with Apache, IIS (Microsoft) and more recently nginx accounting for the bulk of all web sites.  The reason for this degree of concentration is that much like database software, the web server is a mission critical piece of the stack and a lot of work has gone into making sure these servers work well for a wide variety of uses.

Let’s first consider the simplest possible situation: a GET request for a URL where the resource on the server is a file.  In this case all the web server needs to do is read the file from disk and send it back packaged up as an HTTP response.  That means sending an HTTP status code of 200 (assuming the file was found and properly read), followed by a bunch of headers indicating the type of the response (indicating for instance whether the file contained text or HTML or an image), followed by the actual contents of the file.  If the file is not found, the server would return a response code of 404.  Or if the server finds the file but cannot read it for some reason it might return a response code of 500.
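
As a rough sketch (headers simplified, date and content made up), such a response might look like this on the wire:

HTTP/1.1 200 OK
Date: Tue, 21 Feb 2012 15:00:00 GMT
Content-Type: text/html
Content-Length: 53

<html><body><h1>Hello from a file!</h1></body></html>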

Even this relatively simple task of answering a GET request for a file is in reality a bit more complicated because the HTTP protocol has a bunch of important optimizations. Imagine a situation where a great many browsers all request the same file over and over.  It would be very inefficient to actually send an unchanged file back again and again.  So instead the HTTP protocol provides a couple of different mechanisms, such as the Cache-Control header or the ETag, that let the browser and the server determine whether a resource (here the file) has changed and needs to be served anew.  If based on this the web server determines that it does not need to resend the file, it will send a 304 Not Modified HTTP status code instead.
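
Concretely, the browser includes the etag it received earlier in a header of its next request (the value here is made up):

GET /logo.png HTTP/1.1
Host: example.com
If-None-Match: "abc123"

and if nothing has changed the server can reply with just a status line and headers, no body:

HTTP/1.1 304 Not Modified
ETag: "abc123"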

Now things get a fair bit more complicated when the web server has to deal with the submission of data via a POST request. In general, the web server needs to do a bunch of work to figure out how to respond. The response will generally depend on the data that was submitted with the form. Web servers therefore provide mechanisms for invoking a program and passing the submitted data to that program (which might be written in a language such as PHP, or Python, or Ruby, or pretty much any other language for that matter). The program can then determine what to do based on the data and dynamically assemble the response.  The web server then passes that response back to the browser.
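
As the Javascript post mentioned, Javascript is now also used on servers, so here is a minimal, purely illustrative sketch in Node.js of a program that receives submitted data and assembles a dynamic response (in a real setup Apache or nginx would typically sit in front of, or hand off to, something like this; the reply text is made up):

var http = require('http');

http.createServer(function (request, response) {
  if (request.method === 'POST') {
    var body = '';
    // collect the submitted data as it arrives
    request.on('data', function (chunk) { body += chunk; });
    request.on('end', function () {
      // decide what to do based on the data and assemble the response dynamically
      response.writeHead(200, { 'Content-Type': 'text/html' });
      response.end('<p>You submitted: ' + body + '</p>');
    });
  } else {
    response.writeHead(405, { 'Content-Type': 'text/plain' });
    response.end('Method not allowed');
  }
}).listen(8080);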

Web servers do a great many other things, such as imposing load limits and doing URL rewriting, but this should give you the basic outline.  

Posted: 21st February 2012
Tags:  tech tuesday web server web

Tech Tuesday: Routing and TCP/IP

We are continuing along with the web request cycle.  Last week we took a look at the HTTP protocol.  There I already mentioned that HTTP requests and responses travel over a TCP/IP connection.  Today we will dive a bit deeper into TCP/IP.  This is technically not really necessary for understanding the request cycle because these lower levels of the network are completely abstracted away when you develop for the web (which is a fancy way of saying you get to use it without worrying about how it works). Yet, peeling the onion a bit further will turn out to be very useful to the overall understanding of how things work on the web.

In the Tech Tuesday on networking, I introduced the idea that the Internet is a packet switched network.  As a refresher this means that data gets cut up into packets.  The IP layer is responsible for how these packets move across the network.  What follows is quite a bit of a simplification but good enough for our purposes here.  Each packet (sometimes also referred to as a datagram) has its own header which contains among other things the source and destination IP addresses.  These packets travel between machines along flexible paths known as routes.  There is a tool called traceroute for examining what these routes are and it is worth trying this out.

On a Mac, use Spotlight to find and start the “Terminal” application.  You will get a new window with a prompt which lets you type commands (this is known as the command line and we will learn a lot more about it in a future Tech Tuesday).  Type “traceroute google.com” and you will see output that looks something like the following:

 1  192.168.1.1 (192.168.1.1)  1.987 ms  0.864 ms  0.794 ms
 2  10.32.128.1 (10.32.128.1)  9.576 ms  8.510 ms  7.638 ms
 3  gig-0-3-0-7-nycmnya-rtr2.nyc.rr.com (24.29.97.130)  7.983 ms  8.371 ms  8.123 ms
 4  tenge-0-5-0-0-nycmnytg-rtr001.nyc.rr.com (24.29.150.90)  12.007 ms  12.481 ms nycmnytg-10g-0-0-0.nyc.rr.com (24.29.148.29)  14.716 ms
 5  bun6-nycmnytg-rtr002.nyc.rr.com (24.29.148.250)  18.132 ms  11.899 ms  12.706 ms
 6  ae-4-0.cr0.nyc30.tbone.rr.com (66.109.6.78)  7.120 ms  8.395 ms  8.113 ms
 7  ae-4-0.cr0.dca20.tbone.rr.com (66.109.6.28)  13.161 ms 66.109.9.30 (66.109.9.30)  14.679 ms ae-4-0.cr0.dca20.tbone.rr.com (66.109.6.28)  13.992 ms
 8  107.14.19.135 (107.14.19.135)  14.153 ms  12.694 ms ae-1-0.pr0.dca10.tbone.rr.com (66.109.6.165)  14.154 ms
 9  66.109.9.66 (66.109.9.66)  15.230 ms 74.125.49.181 (74.125.49.181)  13.553 ms 66.109.9.66 (66.109.9.66)  13.315 ms
10  209.85.252.46 (209.85.252.46)  17.017 ms  14.467 ms 209.85.252.80 (209.85.252.80)  15.536 ms
11  209.85.243.114 (209.85.243.114)  26.926 ms 209.85.241.222 (209.85.241.222)  25.348 ms  25.406 ms
12  216.239.48.103 (216.239.48.103)  25.799 ms 64.233.174.87 (64.233.174.87)  25.046 ms 216.239.48.103 (216.239.48.103)  32.101 ms
13  * 209.85.242.177 (209.85.242.177)  40.436 ms *
14  vx-in-f103.1e100.net (74.125.115.103)  25.568 ms  26.283 ms  26.659 ms

Each one of these lines represents a so-called “hop” — meaning packets traveling between two internet devices.  The first hop is from my computer to my home switch.  The second hop is from there to my home VPN device which is connected to a cable modem from Time Warner.  From there the packets travel over a whole bunch more intermediate switches and routers until they get to a server operated by Google.  You can try this with other servers as well, such as “traceroute www.dailylit.com” — if the output gets stuck with lines containing just “* * *” instead of information on hops, then you can terminate the process by pressing Ctrl-C.  For those of you on Windows, here is how to run a traceroute.

Now the really important part to keep in mind about the IP level of the protocol is that it is strictly best effort.  This means that packets can travel different routes, can get dropped and can arrive out of order at the destination.  So how in the world do we get an HTTP request and response across such a fundamentally unreliable network?  Well that’s where the TCP portion comes in.  TCP, the Transmission Control Protocol, sits on top of IP and provides for guaranteed in-order delivery of packets.  How does it do that?  Well, the details are complicated, but for our purposes it is sufficient to understand that it starts with a fair bit of initial “handshaking” (back and forth) where the two endpoints (sender and receiver) agree on what they will do.  Once that “connection” has been established it becomes possible to keep track of which packets have been received and which have not and to cause packets that might have been dropped to be resent.
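
For the curious, that initial handshake is the classic “three-way” exchange, which can be sketched roughly like this:

client -> server:  SYN      ("I would like to open a connection")
server -> client:  SYN-ACK  ("OK, here is my side of it")
client -> server:  ACK      ("Got it, let's start sending data")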

What are some of the takeaways here?  First, having fewer hops will make things faster.  If you try different servers with traceroute, you will see that a lot of servers are more hops away than Google’s — Google has invested heavily in shortening the paths to their servers.  This is also what so-called CDNs or Content Delivery Networks do.  They bring content (e.g., images) closer to the “edge” of the network so that requests have fewer hops.  Second, setting up a TCP connection involves a fair bit of overhead.  In the first version of HTTP each request required a new connection which was very inefficient.  With HTTP 1.1 a single connection is kept alive for a sequence of requests and responses (a session).  But there is still a separate connection required for each different server and so a web page that connects to many different resources incurs more overhead.  Third, if you really want a lot of speed it helps to reduce the number of packets that need to be sent. In the early days, the entire home page of Google was optimized to fit into a single packet.

Posted: 7th February 2012
Tags:  tech tuesday web networking

Tech Tuesday: HTTP

Today we are continuing on with the web request cycle.  After the browser has parsed the URL and obtained the IP address of the server via DNS, the browser now has to communicate with the server. That is done using the so-called Hypertext Transfer Protocol or HTTP for short. The beginnings of HTTP go back to the early 1990s when Tim Berners-Lee first devised it, drawing inspiration from Ted Nelson, who had coined the term Hypertext in 1963.  For an even earlier description of a similar idea it is worth reading Vannevar Bush's amazing “As We May Think” from 1945!

HTTP builds on top of the lower level Internet protocol TCP which permits establishing a connection between two machines (see my introduction to networking).  A so-called HTTP session consists of a series of requests from the browser followed by responses from the server.  Each request consists of a request method, a resource (URL), a set of headers and optionally a request body.

The most common HTTP request methods are verbs such as GET, POST, PUT and DELETE (I am capitalizing them because that’s how they appear in the protocol).  What’s great about these is that they are wonderfully descriptive of what you expect the request to do.  GET is supposed to, well, get information from the resource.  I say resource rather than server because that is the right level to think about with regard to HTTP — it is about manipulating abstract resources rather than talking to a particular machine.  PUT on the other hand puts information at the resource (without regard to what’s already there).  DELETE — you get the idea — deletes the information at the resource.  This relative obviousness and some associated expectations around how these methods behave provides a powerful foundation for the transfer of information (more on that in a future post on so-called RESTful APIs).

The headers contain additional information about the request.  For instance, the “Date” header field contains the date and time when the request was sent.  Or the “Referer” header (misspelled in the protocol and in most implementations!) contains the URL of the page on which the currently requested resource was found.  It is worth looking at the list of possible HTTP headers, which also shows the headers for a response (see below).  It should be pointed out that the HTTP protocol allows for the creation of additional headers which can carry custom information (not always what you would want as in the recent case of O2 sending users’ phone numbers!).

The request body is used for POST and PUT requests to carry the data.  For instance when you encounter a registration form on the web that asks for your name and email address, the information you type into the form fields is (generally) carried in the body of the resulting HTTP request.
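
Putting the pieces together, the request that results from submitting such a form might look roughly like this (names and values made up; real browsers send quite a few more headers):

POST /signup HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 34

name=Jane+Doe&email=jane%40foo.com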

Once the server has received and processed the request it will send an HTTP response.  The response has a structure that’s quite similar to the request.  Instead of the method, the server returns a status code, then some headers, and finally a response body.

The status code indicates what happened at the server and hence what to expect in the body of the response.  The standard code is “200 OK” which means the server processed the request and everything went well.  There are more precise responses in that vein, such as “201 Created” which means the server created a new resource (e.g. in response to a PUT request).  There are a series of codes to deal with resources that have moved, such as “301 Moved Permanently” which provides a new URL that should be used for all future requests for this resource.  And there are a bunch of codes to indicate various error situations such as the famous “404 Not Found” for which some web sites return very funny contents in the body of the response. Again, it’s worth browsing the complete list of response codes.

The response headers contain a lot of additional information about the response.  For instance the “Content-Type” header field describes what kind of content the response body contains.  Examples of different values for this header field are “text/html; charset=utf-8” for a web page in HTML using the UTF-8 character set or “image/jpeg” for an image that is compressed using JPEG.  Without knowing this the browser would have to infer the content type from inspecting the body of the response which would be very cumbersome.  There are a ton more headers in a response that are similarly critical to the proper functioning of the HTTP protocol, such as how long a recipient can “cache” (locally store) the body (in order to help speed up a subsequent access and also relieve the server and network).

Finally there is the body of the response which contains the actual information.  The body is a bunch of bytes.  What they represent can vary wildly as explained above.  It could be an HTML web page or an image or something altogether different.  One of the great powers of the HTTP protocol is that it is really content agnostic.

Because there is a lot going on with the HTTP protocol under the hood, and much of it matters, it is a bit of a shame that many people including active developers don’t really understand it and as a result either create things that don’t work as expected (e.g. making resources change in response to a GET request) or re-invent features on top of HTTP that HTTP already contains (e.g., content caching).  If you do any work on the web it is well worth digging deeper than this post!

Posted: 31st January 2012
Tags:  tech tuesday web HTTP

Tech Tuesday: DNS

Today we are continuing with the web cycle that I outlined two weeks ago.  After a URL has been parsed in Step 1, the browser needs to determine the IP address for the domain as Step 2.  Reprising the previous example, let’s consider the domain name dailylit.com.  How does the browser determine that in order to retrieve information from this domain it should access a server at IP address 72.32.133.224?  This is accomplished via a system called DNS, which stands for Domain Name System and provides essentially the equivalent of a telephone book: it supplies IP addresses (telephone numbers) for domain names (people’s names).

In ARPANET, the predecessor to the Internet, there were so few domain names that this telephone book was simply a file called HOSTS.TXT that was retrieved from a computer at SRI and stored locally.  There were only  a few domains (mostly universities) and the file was relatively short.  Today on the Internet there are over 200 million domain names of the type dailylit.com, which are further subdivided through subdomains such as blog.dailylit.com.  So the idea of having every computer maintain a complete and up-to-date copy of the telephone book locally doesn’t make sense any more.

Thankfully in the early 1980s, which depending on your perspective is either ancient pre-history or not that long ago, DNS was born as a service that would allow the registration of domain names and maintain a mapping between the names and IP addresses in a robust fashion.  In fact, without DNS it would be hard to imagine the Internet having grown as dramatically and we probably wouldn’t have nearly as many domains to begin with.

There are many ingenious ideas in the design of DNS and I won’t be able to cover them all here.  Instead, I will focus on some key concepts.  The first and central one is that there is a hierarchy of authority which allows for the delegation of both registration of domain names and the lookup of IP addresses.  The hierarchy starts with the 13 root servers which together make up the so called root zone from which all authority flows.  It is here that the so-called Top Level Domains or TLDs get resolved.  Going back to blog.dailylit.com, the TLD is the “.com” part.  You can think of a domain name like nested Russian dolls, where the outermost doll, the TLD, is the rightmost part of the name.

The most common TLDs are .com and .net which together account for about half of all domain names.  There is of course also .org, .gov, .edu and an ever increasing number of other TLDs such as most recently .xxx.  And then there are TLDs for countries which all consist of two letters, such as .uk for the UK (duh) or .ly for Libya, popularized by bit.ly, and .us for the US, which made the domain del.icio.us possible. Each TLD has one or more registrars associated with it who are in charge of letting people and companies reserve names in that domain.

The root servers point to name servers for each of these TLDs.  Since blog.dailylit.com is in the .com domain the next place to look is the .com name servers.  The .com name servers in turn point to the name servers for dailylit.com itself.  Currently those name servers are at Rackspace.  Since Susan and I registered and control dailylit.com, we are the ones who get to decide which nameservers should be queried to find the IP address for dailylit.com and its subdomains, such as blog.dailylit.com.  The way this generally happens is by logging into a system run by a registrar and setting which nameservers are to be the authoritative sources of IP addresses for the dailylit.com domain. That then gets recorded in the nameserver for the corresponding TLD.

The lookup process that started with the root, went to the .com TLD, is now at the dailylit.com nameservers at Rackspace.  They in turn contain information on dailylit.com itself and its subdomains, such as blog.dailylit.com.  The whole process of starting at the root and working towards the subdomain (right to left) in a series of separate lookups across different servers is called a “recursive lookup.” If this sounds complicated to you, that’s because it is.  It is so complicated and resource intensive that we don’t want the web browser to have to do this each time it encounters a domain name.  It would not only be slow, but it would also swamp the root servers, the TLD servers and possibly even the name servers for dailylit itself.
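
Schematically, the whole chain of lookups for blog.dailylit.com looks something like this (answers paraphrased):

root servers               ->  "for .com, ask the .com name servers"
.com name servers          ->  "for dailylit.com, ask its name servers (at Rackspace)"
dailylit.com name servers  ->  "here is the IP address for blog.dailylit.com"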

So instead of doing a recursive lookup every time, the results of these lookups are stored on so-called DNS cache servers.  For instance, most ISPs through which you access the Internet will operate their own cache servers.  After they have looked up blog.dailylit.com once, these servers will “cache” (meaning temporarily store) the result of the lookup, thus providing a much faster lookup the next time.  In fact, your own computer will often cache the results of lookups locally for super fast access.  This is important because even a single web page generally involves multiple requests (e.g. for images) to the same server.  The duration for which the results of a recursive lookup can be cached locally is known as the Time To Live or TTL and is controlled by the owner of the domain (and generally honored by the cache servers).
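
In practice your own code never walks this hierarchy itself; it simply asks a resolver, and all of the recursion and caching happens behind a single call. A minimal, purely illustrative sketch in Node.js (Javascript on the server):

var dns = require('dns');

// Ask the configured resolver (typically your ISP's cache server) for the
// IPv4 addresses of a domain; the recursion and caching described above
// all happen behind this one call.
dns.resolve4('dailylit.com', function (err, addresses) {
    if (err) throw err;
    console.log(addresses);  // e.g. [ '72.32.133.224' ]
});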

The existence of cache servers (sometimes also referred to as non-authoritative servers — although technically not exactly the same) presents a critical security vulnerability for DNS.  Let’s say you have gone to your favorite coffee shop and logged on to the WIFI network there.  Where do your domain lookups go?  Well, to the cache server of whatever ISP the coffee shop uses or possibly even cache servers on the coffee shop’s own network.  An attacker with access to those local cache servers could insert falsified records that could have the effect of, say, pointing chase.com to some rogue server that wants to steal your bank username and password. This would allow for a so-called man-in-the-middle attack (more on this in a future post).  Fortunately, some security additions to DNS known as DNSSEC will in the future prevent these kinds of attacks.  As more and more of our access to the Internet is over wireless networks this becomes particularly important.

If you made it this far, I hope you have a (newfound) appreciation for the complexity of a system that is used billions of times per day behind the scenes of nearly every access to the Internet. In addition to the technical issues there are also important political issues surrounding DNS.  Most recently the proposed SOPA and PIPA legislation would have mandated that nameserver operators make changes that would have interfered with the implementation of DNSSEC.  Then there is also the question as to who really controls the root zone, which turns out to be the US Department of Commerce.  Yes, for the *entire* Internet, which is all the more reason why we should make DNS better not worse.


Posted: 24th January 2012
Tags:  tech tuesday web dns
