Tech Tuesday: Data Structures (Objects)

This is the fifth post on Data Structures as part of my long Tech Tuesday series on programming. And today we finally make it to a structure known as an object. There is a whole literature on so-called object-oriented programming. It sometimes seems like a modern idea because of its prominence in the 1990s, but it really goes much further back than that, to programming languages such as Smalltalk in the 1970s, and basic object concepts go back even earlier. While another approach known as functional programming has been all the rage recently, yesterday brought a big announcement from Microsoft about a language called TypeScript that brings statically typed objects to the JavaScript environment.

So what is that mythical object? At its most basic level it is simply an extension of the kind of data structures that we have been looking at. An object contains data attributes but also bundles in programming code that can operate on those attributes. We have been using a point as our example (which funnily enough is also one of the examples used on the TypeScript homepage). So far we have seen several ways of keeping the x and y coordinates of a point together, including structs, associative arrays and JSON. Now here is an example of a point as an object in PHP, which you can see running here:
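The listing itself does not appear in this text, so what follows is only a stand-in: a minimal sketch of what such a class might look like. The property names x and y and the method name distanceTo are assumptions made for illustration, not necessarily the names used in the original example.

```php
<?php
// A minimal Point class: data attributes plus the code that operates on them.
// Names ($x, $y, distanceTo) are illustrative assumptions.
class Point
{
    public $x;
    public $y;

    // Stores the coordinates when a new Point is created.
    public function __construct($x, $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    // Euclidean distance from this point to another point.
    public function distanceTo(Point $other)
    {
        $dx = $this->x - $other->x;
        $dy = $this->y - $other->y;
        return sqrt($dx * $dx + $dy * $dy);
    }
}

$a = new Point(0, 0);
$b = new Point(3, 4);
echo $a->distanceTo($b), "\n"; // prints 5
```

Note that the distance code is written once, inside the class, and is called on a point object rather than being handed two separate structures.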

We start by declaring a class called Point. What’s a class, you ask? It is the data type for our points. So far this is similar to the structs, where we used typedef in C to give ourselves a data type. Every point object that we later create in the code belongs to the class Point (which is another way of saying that the objects have the data type Point). Unlike the struct though, the class also contains some code. In the example above that consists of the oddly named function __construct and a function for calculating the distance from this point to another point.

What is the advantage of putting the distance code into the Point class as opposed to having a separate free-floating distance function as we had for the structs? Imagine a future self (or some other engineer) making a change to how points are represented. With free-floating functions, you would need to hunt through the whole program for every function that refers to the internal structure of points and adapt each one accordingly. By putting these functions into the class, we know exactly where to find the code. That’s the purely pragmatic aspect.

But there is a deeper philosophical view that’s really important. By putting so-called methods into the class (that’s what we call the functions that are associated with an object data type) we are creating an abstraction. Both the internal representation of the object and the internal workings of the code are encapsulated inside the class and thus “abstracted away” from anybody simply using the object.
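To make the encapsulation idea a bit more concrete, here is a hedged sketch of a Point whose internals are hidden from the outside. Everything specific in it is an assumption made for illustration: the decision to store the point in polar form, the private properties r and theta, and the method names getX, getY and distanceTo.

```php
<?php
// Hypothetical variant: the point is stored as radius and angle internally,
// but code using the class still thinks in terms of x, y and distance.
class Point
{
    // Private: code outside the class cannot read or change these directly.
    private $r;     // distance from the origin
    private $theta; // angle in radians

    public function __construct($x, $y)
    {
        $this->r = sqrt($x * $x + $y * $y);
        $this->theta = atan2($y, $x);
    }

    public function getX()
    {
        return $this->r * cos($this->theta);
    }

    public function getY()
    {
        return $this->r * sin($this->theta);
    }

    // Euclidean distance, computed via the public accessors.
    public function distanceTo(Point $other)
    {
        $dx = $this->getX() - $other->getX();
        $dy = $this->getY() - $other->getY();
        return sqrt($dx * $dx + $dy * $dy);
    }
}

$a = new Point(0, 0);
$b = new Point(3, 4);
echo round($a->distanceTo($b), 6), "\n"; // approximately 5, up to floating-point rounding
```

Someone calling distanceTo cannot tell, and does not need to know, whether the class stores x and y or a radius and an angle. That is what "abstracted away" means in practice.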

That is one of the fundamental ways to think about programming: the gradual building up of higher levels of abstraction. With every level of abstraction we gain more power. We could use the point objects to come up with line objects. Then we could use line objects to create shape objects. And so on. The highest-level abstraction is the program itself. That’s why people care so much about objects in programming. We will spend a lot more time on them in the future, including learning what the somewhat mysterious __construct function does, but before that we will continue with some more data structures.
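The step from points to lines mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the Line class, its length method, and the distanceTo method on Point are hypothetical names chosen for this sketch.

```php
<?php
// A small Point class, repeated here so the sketch is self-contained.
class Point
{
    private $x;
    private $y;

    public function __construct($x, $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    public function distanceTo(Point $other)
    {
        $dx = $this->x - $other->x;
        $dy = $this->y - $other->y;
        return sqrt($dx * $dx + $dy * $dy);
    }
}

// A Line is built out of two Points and relies only on what Point offers.
class Line
{
    private $start;
    private $end;

    public function __construct(Point $start, Point $end)
    {
        $this->start = $start;
        $this->end = $end;
    }

    // The length of the line is the distance between its two end points.
    public function length()
    {
        return $this->start->distanceTo($this->end);
    }
}

$line = new Line(new Point(0, 0), new Point(3, 4));
echo $line->length(), "\n"; // prints 5
```

Line never looks inside Point; it only asks a point for its distance to another point. A shape class could in turn be built from lines in the same way.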