Monday, January 23, 2012

Posted at 8:00am

Why Software As A Service (A Personal Reminder)

I am in the process of migrating a bunch of stuff off an ancient server which has been up continuously for 5 years and is being sunset (the hardware is tired and Red Hat is ending support for RHEL4).  The migration process has been a potent reminder of the many hidden costs of installed software.  For instance, at the time that the old machine was set up I used Subversion for version control and Trac for documentation and ticketing.  Once these were working nicely for my purposes, I stopped upgrading them.  Now you could say that’s a pretty stupid thing to do, but when you have very limited time, would you rather fix a bug or upgrade your tools, especially when those are working?

So now of course I had a brand new machine running RHEL5 and much more up-to-date versions of Subversion and Trac.  I have been using Git and Github for a more recent project but didn’t want to make a switch here as there is a lot of info in Trac that is nicely sync’d to the source.  So how hard could it be?  Well, the Subversion upgrade was a cinch.  I dumped the repository on the old machine and loaded it on the new machine.  Including time to set up Subversion, configure Apache to serve up the repo, and my laptop to connect, this all took less than 1 hour. 

But then I got around to Trac where I was on version 0.10.4 and the current version is 0.12.2.  As it turns out in between they switched templating engines and went from SQLite2 to SQLite3.  I had to upgrade the old machine through several versions before I was able to export my data in a format that the new install could consume.  Even then the documentation on how to do this was spotty at best.  At one point I got so frustrated that I thought it might be easier to downgrade my new install to 0.10.4 but an attempt to install that old version on RHEL5 went nowhere.  In the end I got it all done but it took almost 3 hours. Granted that someone who does this more often than once every other year might have been much faster, but my Google searches suggest that I wasn’t the only one to find this a bit tricky.

I still have quite a few other things to migrate to the new machine which I am sure will produce some similar problems.  I could of course try to find someone to do all of this, but (a) that’s not easy given that this is a small one-off project and (b) I like to stay connected to how things actually work.  And this object lesson in the cost of installed software was a good reminder for when this topic comes up with startups.  For anything that’s not core to your success, consider using a Software As A Service offering over installing and running your own.


Friday, January 20, 2012

Posted at 9:13am

Some Quick Observations on MegaUpload

Yesterday an international police operation resulted in the shutdown of MegaUpload and the arrest of at least four MegaUpload employees in Auckland, New Zealand.  This action resulted in a large scale DDoS attack by the group known as Anonymous on web sites including the MPAA, RIAA, DoJ and even the White House.  While I don’t have time today for a full scale analysis, here are some salient points:

1. The fact that this shutdown and the arrests were possible shows quite clearly that existing laws already provide a meaningful ability to deal with large scale copyright infringement even when sites operate from abroad.  That’s all the more reason why we don’t need additional new legislation.

2. According to ArsTechnica, MegaUpload was brazenly flouting the DMCA by only disabling links to infringing content instead of actually removing it or blocking access to it entirely.  That is a violation of both the letter and the spirit of that law and should not be allowed to continue.

3. As with any digital locker site, there were also legitimate uses of MegaUpload.  Many people who had work or personal files on MegaUpload have taken to Twitter to complain about a lack of access to their files.  This operation and others before it (such as the server seizure that brought down Curbed, Pinboard and Instapaper) raise the question of how to minimize “collateral damage.”

4. The retaliation by Anonymous has the potential to meaningfully escalate the push for government intervention in the Internet for cybersecurity reasons.  This comes at a bad time as we are trying hard to keep the government out of controlling the Internet.

What a week this has been!  Apple too dropped another interesting copyright bomb yesterday by claiming sales rights to any books created with iBooks Author. It seems like we are at the beginning of what Cory Doctorow has characterized as the “Coming War on General Purpose Computing.”  We live in interesting times indeed.

Tags: megaupload copyright sopa

Thursday, January 19, 2012

Posted at 10:48am

The Day the Internet Stood Still

Yesterday (Wednesday, January 18) has a good chance of being remembered as the day that the Internet first truly showed its political clout in the US.  So far we have largely pointed at events abroad when discussing the Internet’s potential to shift power.  Web sites and services large and small (including Continuations) either forcefully alerted their users to the problems with SOPA/PIPA or blacked themselves out entirely.  At the 12th hour even Facebook’s Mark Zuckerberg took a (by now very safe) stand on the issue.

The results from a political perspective were impressive.  SOPA had already been stalled a bit but PIPA support was still strong.  Following yesterday though 18 Senators including 7 co-sponsors withdrew their support for PIPA.  Someone with a better knowledge of the history of American politics will probably know the correct statistics but this is a massive erosion in the support for a bill.  Together with the White House’s stance against the bills in their current versions I believe that there is now a good chance to stop both SOPA and PIPA.  

What’s next?  First, as Ron Wyden makes clear in his terrific letter to the Internet there is still one more vote coming up on PIPA on January 24th so it is too early to declare victory.  Second, it is worth reading the MPAA’s reaction to yesterday’s expressions to see just how cynical their view of what happened is.  Third, unless we want a new wave of slightly different versions of these bills following the next election we need to proactively outline an alternative that is not based on government intervention in the Internet.  Fourth, and maybe most importantly, we need to start the long work on using the Internet to shift political power back to the voters and away from special interests more generally and not just with respect to bills that directly affect the Internet.


Tags: politics internet

Tuesday, January 17, 2012

Posted at 7:30am

Tech Tuesday: Anatomy of a URL

Last week’s overview of “How the Web Works” introduced the URL (Uniform Resource Locator) as the fundamental way things are addressed on the web.  Before we pick apart some actual URLs, it is worth looking at the name itself.  The promise behind “Uniform” is that this addressing scheme can be used across all kinds of resources and that explains why URLs are so powerful - they can be used to address content such as a blog but also services such as the Twilio telephony API.  On the web a blog entry and an incoming phone call are both simply “resources”.  That means a resource is a highly abstracted concept and as you will learn if you stick with Tech Tuesday, abstraction is amazingly powerful.  And on the web the URL is the most powerful abstraction of them all!

So here is a URL to pick apart:   

http://blog.dailylit.com/2012/01/16/in-honor-of-dr-martin-luther-king-jr/

The very first part, the “http:”, indicates which protocol to use to access this resource.  What other protocols might we find there?  The obvious one is “https:”, the secure (meaning encrypted) version of “http:”.  Here are some other protocols that you may have encountered around the web: “mailto:”, which indicates that the resource that follows is an email address and the protocol to speak to it is SMTP, and “ftp:”, for resources that are accessible via File Transfer Protocol (FTP).  Another protocol supported by many browsers is “file:”, which means that the resource that follows is a file on the machine on which the browser is running.
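To make the protocol part concrete, here is a minimal sketch in Python that pulls it out of a handful of URLs using the standard library's urllib.parse module.  Apart from the DailyLit URL from this post, the addresses in the list are made-up placeholders, and scheme_of is just a helper name chosen for this illustration.

```python
from urllib.parse import urlsplit

# Only the first URL comes from the post; the others are placeholders.
examples = [
    "http://blog.dailylit.com/2012/01/16/in-honor-of-dr-martin-luther-king-jr/",
    "https://example.com/",
    "mailto:someone@example.com",
    "ftp://ftp.example.com/pub/readme.txt",
    "file:///home/user/notes.txt",
]


def scheme_of(url: str) -> str:
    """Return the protocol part of a URL, without the trailing colon."""
    return urlsplit(url).scheme


for url in examples:
    # Left-align the protocol in a 7-character column, then show the URL.
    print(f"{scheme_of(url):7} {url}")
```

Note that the library calls this part the "scheme", which is the term the URL standards use for what this post calls the protocol.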

Following the “http:” are two forward slashes “//” — these indicate that this URL starts with a domain name, which in this case is “blog.dailylit.com” — we will dissect domain names in more detail in the Tech Tuesday on DNS.  There we will investigate the relationship between domains and actual servers but for now it is worth pointing out that grouping resources by domain serves an important trust purpose.  Your expectations about accessing content at chase.com are meaningfully different from wepretendtobechase.com.  Of course it’s not always that obvious and people go to great lengths to pretend to be someone else.  There is a good test of your knowledge of which domains to trust.

Following the domain name is the location of the resource within that domain.  This is the “/2012/01/16/in-honor-of-dr-martin-luther-king-jr/” part in the URL above.  There are several things going on here that are worth noting.  First, this location is structured in an easily human readable and comprehensible form.  Just by looking at the URL you can infer that this is a post about Martin Luther King on MLK day.  We call this kind of location a “pretty URL.” Having pretty URLs is a good idea not just because it helps humans figure out what they are likely to get when they access the resource but also because search engines, especially Google, make pages with pretty URLs rank higher in search results (assuming that the page content actually appears to be a match for the URL).

But there is even more to a pretty URL like “/2012/01/16/in-honor-of-dr-martin-luther-king-jr/”: the slashes “/” in the URL indicate some notion of hierarchy or of a path to the resource.  They also suggest that the following shorter URL should point to something useful: http://blog.dailylit.com/2012.  In fact this retrieves all the blog posts from 2012.  There is no requirement that the domain fulfilling the request understand this shorter URL, but the fact that it does both corresponds with intuition and allows for additional degrees of automation and discovery.  For instance, without any further knowledge you should be able to construct the URL for finding all the blog posts from November 2011.  Here it is: http://blog.dailylit.com/2011/11/ .  Again, there is no requirement on the server to respond to this with a list of posts and it could instead respond with say a 404 Page Not Found.  The http protocol does not speak to this, which is one of its many strengths as it lets the person or organization controlling the resource decide how to respond.
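The kind of automation that a hierarchical path allows can be sketched in a few lines of Python.  The two helper names below (ancestor_urls and month_archive_url) are invented for this illustration, and the sketch assumes the year/month/day layout seen in the DailyLit URL; whether a server actually answers each of the resulting URLs is entirely up to that server.

```python
from urllib.parse import urlsplit, urlunsplit


def ancestor_urls(url: str) -> list[str]:
    """List the shorter URLs obtained by dropping path segments from the right."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    ancestors = []
    # Drop one trailing segment at a time, ending at the site root.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"
        ancestors.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return ancestors


def month_archive_url(base: str, year: int, month: int) -> str:
    """Build a year/month archive URL, assuming the blog uses that layout."""
    return f"{base.rstrip('/')}/{year:04d}/{month:02d}/"


post = "http://blog.dailylit.com/2012/01/16/in-honor-of-dr-martin-luther-king-jr/"
for candidate in ancestor_urls(post):
    print(candidate)
print(month_archive_url("http://blog.dailylit.com", 2011, 11))
```

The sketch always appends a trailing slash, so the year archive comes out as http://blog.dailylit.com/2012/ rather than the slash-less form; servers commonly treat the two alike but are not obliged to.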

Now not every URL starts with a “//” — there are also URLs that don’t contain a domain but instead just a path to a resource.  Consider for instance the following http:/ — where the resource pointed to by this URL is located depends on the context in which it is encountered.  This is an example of a relative URL.  It points to a resource within the context of another resource.  If you are reading this in the context of the Tumblr dashboard, the link will take you to your dashboard.  If you are reading this on my blog, which is at the domain “continuations.com” it will take you to the home page of my blog.  Relative URLs allow for more compact expression of the location of a resource but they can also introduce interesting errors.  For instance, think about what resource that relative URL will point to if you simply copy it and send it to someone via email and they open it in a web mail client!
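How the same relative URL ends up in different places can be sketched with Python's urljoin.  This assumes the relative link in question is the bare root path "/", and the three context URLs below are made-up stand-ins for a blog post, the Tumblr dashboard and a web mail client.

```python
from urllib.parse import urljoin

RELATIVE = "/"  # assumption: the relative link is just the root path

# Hypothetical pages on which a reader might encounter the relative link.
contexts = {
    "blog": "http://continuations.com/post/12345",
    "dashboard": "https://www.tumblr.com/dashboard",
    "web mail": "https://mail.example.com/inbox/message/42",
}


def resolve(base: str, relative: str = RELATIVE) -> str:
    """Resolve a relative URL against the URL of the page it appears on."""
    return urljoin(base, relative)


for name, base in contexts.items():
    print(f"{name:10} {resolve(base)}")
```

In the web mail case the link resolves to the mail site itself, which is almost certainly not what the sender had in mind.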

This post is getting quite long and I haven’t yet covered fragment identifiers or query strings.  Instead of going on, I will cover fragment identifiers in the context of HTML and query strings when describing how URLs can be used to transmit additional information that can be used by the server in deciding how to respond to the request to the resource, so keep following Tech Tuesday!

Tags: tech_tuesday url web

Monday, January 16, 2012

Posted at 9:44am

Covestor: Getting the Message Out!

Our portfolio company Covestor today rolled out a tremendous overhaul of their web presence.  The goal was to dramatically simplify Covestor’s message to make it easier for that message to spread and to improve conversion.  The team has done an amazing job with a process that used extensive quantitative and qualitative research into consumer reactions to inform the redesign.  While it’s too soon to tell how well they will hit their numeric goals, my immediate reaction to the new site is: wow!  I encourage everyone who has wondered in the past what Covestor is about to go and check out the new Covestor site.

The overhaul touched every aspect of the site from updating the logo to changing pretty much all the copy.  Here is the old logo on the left and the new one on the right:

What is externally visible though is only the tip of the iceberg. Internally, Covestor has done amazing work over several years to be in a position to roll out this new site.  For instance, they have recruited hundreds of model managers to the platform.  They have also developed a sophisticated risk score to help match models to investor strategies. And they have honed the trade replication engine that powers covesting.

Congrats to the entire Covestor team!

P.S. If you want to know more about Covestor in person, you can meet the team and others interested in covesting at their “Take Stock Mixer” on January 31st from 6-8pm.


Tags: covestor investing

Friday, January 13, 2012

Posted at 8:06am

Moving Back to New York City and Homeschooling

In the middle of 2010 we started to seriously consider moving back to New York City.  At the time one of the considerations was that it would be possible to experiment with homeschooling the kids.  I am excited to report that we are doing both.  We have a place in Chelsea that is a short walk from the Union Square Ventures office and almost as importantly around the corner from Murray’s Bagels. We are also homeschooling our kids for at least the next six months.  Now the “we” here shouldn’t be read to imply that Susan and I are doing the teaching - we both work full time.  Instead, we have worked with Teri and Melissa from QED to recruit some amazing tutors.  Susan has all the details on that over at a special blog about our homeschooling experiment.

Tags: new_york_city personal homeschooling

Thursday, January 12, 2012

Posted at 7:07am

Presenting Option Grants to Boards

One of the nearly routine items at startup board meetings is the discussion and ratification of option grants for new employees and possibly refresh grants for existing employees.  Too often unfortunately this information is presented to the board in a way that requires way more time than should be necessary because critical pieces are missing.

Here is what you should always include as a bare minimum when presenting option grants to the board:

1. Employee name

2. Title/role at company

3. Absolute size of grant in number of underlying shares

4. Percentage size of grant fully diluted

5. Total size of option pool and remaining available pool (absolute numbers and percentages fully diluted)

Providing just #1 and #3 is not enough because that assumes board members will immediately recall the number of shares outstanding (which may or may not be the case) and be able to quickly do the percentage calculations.
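The percentage arithmetic behind items 3 to 5 is simple enough to do ahead of the meeting.  Here is a minimal sketch in Python; every name and number in it is hypothetical, and it assumes the fully diluted share count already includes the entire option pool, so a grant out of the pool does not change the denominator.

```python
from dataclasses import dataclass


@dataclass
class Grant:
    employee: str  # item 1
    role: str      # item 2
    shares: int    # item 3


def pct(shares: int, fully_diluted: int) -> float:
    """Express a share count as a percentage of fully diluted shares."""
    if fully_diluted <= 0:
        raise ValueError("fully diluted share count must be positive")
    return 100.0 * shares / fully_diluted


def board_line(grant: Grant, fully_diluted: int, pool_total: int, pool_remaining: int) -> str:
    """Format items 1 to 5 for a single grant as one line."""
    after = pool_remaining - grant.shares
    if after < 0:
        raise ValueError("grant exceeds the remaining pool")
    return (
        f"{grant.employee} ({grant.role}): {grant.shares:,} shares "
        f"= {pct(grant.shares, fully_diluted):.2f}% fully diluted; "
        f"pool {pool_total:,} ({pct(pool_total, fully_diluted):.2f}%), "
        f"remaining after grant {after:,} ({pct(after, fully_diluted):.2f}%)"
    )


# Hypothetical company: 10,000,000 fully diluted shares and a 1,500,000 share
# pool of which 400,000 shares are still available before this grant.
example = Grant("Jane Doe", "Senior Engineer", 50_000)
print(board_line(example, 10_000_000, 1_500_000, 400_000))
```

With these made-up numbers the grant works out to 0.50% fully diluted and leaves 3.50% in the pool, which is exactly the kind of figure a board member should not have to compute on the spot.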

In addition, here is further information you should provide for context:

6. Grant size bands by role (if you have established those already) — if not, include existing employees in similar roles for comparison (including their start dates)

7. Indicate if there are any special vesting considerations that differ from the plan

8. For refresh grants: how many options does the employee already have and how far are those vested?

Providing all of the above ahead of time (or even at the board meeting) will not only make the grant process quick and painless but also ensure that you can actually get meaningful input from your board on the size of grants (as opposed to spending time digging for information and calculating percentages).

Tags: startups boards directors options

Wednesday, January 11, 2012

Posted at 7:53am

Google Going All In

Last July I had predicted that Google would go all in by bundling Google+ aggressively with search and that is exactly what was just announced yesterday with Search, plus Your World.  The “plus Your World” part right now refers to “your world on Google” as only Google+ profiles, posts and shared images are included and not content from Twitter, Facebook or others.  John Battelle captures this well in his aptly titled “Search, Plus Your World, As Long As It’s Our World.”

Also worth reading are Danny Sullivan’s excellent overview of what Search+ offers and his detailed analysis of whether or not Google could already include some Twitter content without a commercial arrangement with Twitter.  Danny’s analysis has actual comments from an interview with Eric Schmidt.  Finally, the most scathing reaction has come from MG Siegler who flat out titles his piece “Antitrust+.”

While it’s too early to know how all of this will play itself out over time (there has already been some public back and forth between Google and Twitter), two things seem fairly clear.  First, in the near term this will be bad for end users.  Second, the root of the problem is Google’s economics for search.  The two points are intimately related.

On the first point, John Perry Barlow aptly tweeted: 

From an enduser perspective the best web is one of little pieces loosely joined.  That kind of web allows for lots of innovation and individuality.  Instead, we are currently headed for big chunks of experience provided by just a couple of players.  While a high degree of integration may look appealing to some under an “ease-of-use” type argument, all you have to do is look at the enterprise where a few large vendors have dominated for years (SAP, Oracle) to know how undesirable that is.

On the second point, the root cause of all of this is search economics.  Google keeps one hundred percent of the search revenue from searches on Google.  The explicit quid pro quo has always been that Google sends traffic to a site in return for getting to include the content among the search results.  No search revenue is shared with the sources.  During days when Google was just a search engine that seemed like a reasonable quid pro quo.  But two things have happened to make this balance not work.  First, Google has gradually entered many businesses that compete directly with providers of content, and second, we have seen the emergence and inclusion of many content “micro chunks” that will hardly ever generate traffic to the originating site, such as a restaurant rating from Yelp.  I have argued before that some kind of revenue sharing will be required to break through this.

When Larry Page became Google’s CEO I had hoped that he would maybe pursue a vision of the web of little pieces loosely joined with Google providing a lot of that glue.  It is by now amply clear that Google is going exactly in the opposite direction.  That’s a shame in the near term.  In the long run I agree with John Battelle that the web will find a way to route around all of this (assuming we don’t let the politicians screw it up in the meantime).

Tags: google search economics competition

Tuesday, January 10, 2012

Posted at 7:00am

Tech Tuesday: How The Web Works (Overview)

As promised at the end of last year’s Tech Tuesday, we are starting this year with a cycle on how the web works.  Just as a reminder, Tech Tuesday’s aim is to require no previous knowledge other than what has been covered before.  So this overview may be trivial for some readers but I wanted to make sure to bring everyone along.

Let’s assume you have fired up your favorite web browser.  Now you type the address “dailylit.com” into the address bar (if you always go to web sites by typing their name into a search engine, I urge you to discover the address bar and type in the address). 

What happens now?  How does the browser go from a web address for a site to that site’s content on your screen?  That turns out to be an amazingly complex series of steps:

Step 1:  The address “dailylit.com” is part of what is known as a URL.  The full URL is “http://dailylit.com” and your browser automatically prepends the “http://” to save you the typing.  The HTTP bit indicates to the browser which protocol to use to speak to the server (more on that in Step 3 below).  Typing a URL into the address bar starts the same sequence of steps as if you had clicked on a link (e.g. among a set of search results) pointing to the same location as in DailyLit.  In the very first step the browser “parses” the URL (meaning it takes the URL apart into its various parts) in order to determine where it is supposed to look for content.
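Step 1 can be sketched in a few lines of Python.  This is a deliberately simplified illustration and not how any actual browser does it (real address bars handle many more cases, such as search terms); the function names are invented for this example.

```python
from urllib.parse import urlsplit


def normalize_typed_address(typed: str) -> str:
    """Prepend "http://" when the typed address carries no protocol of its own."""
    typed = typed.strip()
    if "://" not in typed:
        typed = "http://" + typed
    return typed


def parse(url: str) -> dict:
    """Take a URL apart into the pieces the later steps need."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,     # which protocol to speak (Step 3)
        "host": parts.hostname,     # which name to look up (Step 2)
        "path": parts.path or "/",  # which resource to ask for
    }


full_url = normalize_typed_address("dailylit.com")
print(full_url)
print(parse(full_url))
```

An empty path is turned into "/" here because a request always has to name some resource, and the root is the natural default.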

Step 2: The “dailylit.com” portion of the URL is the domain name (you can get your own from a domain registrar).  Think of this much like the name of a person.  If you want to call a person on the phone you need to look up their phone number based on their name in some phone book (e.g. the contact list on your cell phone or in the dark ages some paper book made from dead trees).  Similarly in order for your browser to retrieve the content from DailyLit, it needs to first look up the IP address of the server on which the content lives.  This is done by consulting a “phone book” known as DNS which stands for Domain Name System and is a near miraculous invention.
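The phone book lookup in Step 2 is available to any program through the operating system's resolver.  Here is a minimal Python sketch; lookup_ipv4 is a name made up for this example, and it only asks for IPv4 addresses to match the style of address used in this post.

```python
import socket


def lookup_ipv4(hostname: str) -> list[str]:
    """Return the distinct IPv4 addresses the system resolver reports for a name."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is an (address, port) pair for IPv4
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# An address literal needs no lookup at all and simply comes back unchanged.
print(lookup_ipv4("127.0.0.1"))
```

Calling lookup_ipv4("dailylit.com") requires a network connection, and the answer may well differ from the address quoted in this post, since the mapping from names to addresses can change at any time.  That indirection is a large part of what makes DNS so useful.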

Step 3: Now that the browser has an IP address, in the case of DailyLit currently 72.32.133.224, it makes a request to “GET” content from 72.32.133.224.  GET is capitalized and in quotes here because it is one of several defined requests supported by the so-called Hypertext Transfer Protocol or HTTP, which is what the beginning part of the URL referred to.  In essence this request simply says GET me the content that resides at 72.32.133.224.  This is the protocol that got started with Tim Berners-Lee’s work in the very late 80s and early 90s and is to this day the underpinning of the interaction between web browsers and web servers.
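What such a GET request looks like on the wire can be sketched as follows.  This is a bare-bones illustration in Python that only builds the text of an HTTP/1.1 request without sending it; build_get_request is a made-up helper name, and a real browser sends many more header lines.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # the request itself: method, resource, version
        f"Host: {host}",         # the domain name the browser was asked for
        "Connection: close",     # ask the server to close the connection when done
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("dailylit.com").decode("ascii"))
```

One detail worth noticing: although the connection is made to the IP address, the request still carries the domain name in the Host line.  That is what allows a single machine with a single address to serve many different sites.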

Step 4: Through the magic of Internet networking, that GET request is routed via a whole bunch of intermediary devices (routers, switches, firewalls, load balancers oh my!) to the machine with the IP address.  In fact, you can look up for yourself how many intermediate hops exist between you and the server and that’s something we will do in an upcoming Tech Tuesday.

Step 5: We are now on the Server.  The server receives the incoming GET request.  On the server machine the work is co-ordinated by a program known as a web server (something like Apache or NGINX).  The web server retrieves the contents for the page and starts sending them back to the browser again over the Internet.  Important side note: because the Internet is packet switched, the content is cut up into smaller parts (packets) that may travel different routes to get back to the browser.  All of that cutting up and re-assembling is handled by lower levels of the network and is transparent to both the web server and the web browser.  That too is one of the many awesome features of the Internet that we easily take for granted.
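The cutting up and re-assembling mentioned in the side note can be illustrated with a toy model in Python.  It is only a sketch of the idea: the actual Internet protocols also deal with lost and duplicated pieces, among many other things, and the function names and the piece size of 8 bytes are arbitrary choices for this example.

```python
import random


def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Cut data into numbered pieces of at most `size` bytes each."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the pieces back in order using their sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _seq, chunk in ordered)


content = b"<html><body>Hello from the web server</body></html>"
packets = split_into_packets(content, 8)

# Simulate pieces arriving out of order after traveling different routes.
random.Random(0).shuffle(packets)

print(reassemble(packets) == content)
```

The sequence numbers are what make the out-of-order arrival harmless: the receiving side sorts the pieces before handing the content on, so neither the web server nor the browser ever notices.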

Step 6: Back at the browser.  The browser is receiving the content in the form of an HTTP Response.  That response contains a bunch of different stuff.  For instance, it contains a so-called HTTP Response Status Code to indicate to the browser whether the server thinks it has some useful information [Response Code 200 OK].  If the server had a problem, e.g. it didn’t find any content for this URL, it will send a different code, such as the famous 404 Page not Found.  The browser needs to start parsing the response to figure out what to do next.  That will in all likelihood include many additional requests by the browser to the same and possibly other servers to retrieve content that was referenced in the initial response, such as CSS and Javascript files.  Every one of these additional requests involves all the steps from 1-6 AGAIN!
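The status code sits in the very first line of the response, so reading it is the first bit of parsing the browser does.  Here is a minimal Python sketch of that one step; the helper names are invented for this example, and the sample lines are typical of what a server sends rather than captured from DailyLit.

```python
from http import HTTPStatus


def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a status line into protocol version, numeric code and reason text."""
    version, code, *reason = line.strip().split(" ", 2)
    return version, int(code), reason[0] if reason else ""


def is_success(code: int) -> bool:
    """Codes in the 200 range mean the server has something useful to send."""
    return 200 <= code < 300


for line in ["HTTP/1.1 200 OK", "HTTP/1.1 404 Not Found"]:
    version, code, reason = parse_status_line(line)
    # HTTPStatus knows the standard phrase that goes with each code.
    print(code, HTTPStatus(code).phrase, "success" if is_success(code) else "problem")
```

The official phrase for code 404 is simply "Not Found"; the familiar "Page Not Found" wording comes from the error pages that sites display.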

Step 7: Even while it is still waiting for the responses from these additional requests (and possibly even more pieces of the original request) to arrive the browser will start to figure out how to render the content that it has received on the screen.  That means figuring out what to show where, which is made incredibly complex by the interaction between the HTML (roughly: the content itself), the CSS (roughly: the styling or look and feel of the content) and the Javascript (roughly: the dynamic behavior of the content).  This work involves a so-called rendering engine and also a full fledged computer language interpreter (for Javascript).

Step 8: The browser continues to execute the Javascript code (which might, for example, animate an object to move across the page) while at the same time waiting for input from you.  For instance, when you hover with the mouse above a link that might change the look and feel of that link.  In the early days of the web, the most that would happen now is that a click on a link will start the whole process over at Step 1 for the next page.  Today, however, many additional requests to the web server may occur without the page ever refreshing as new content is dynamically fetched and added to the existing page and other content written back to the server.

In the upcoming Tech Tuesdays we will look at each of these steps in some detail, starting with the anatomy of a URL next Tuesday.  In the meantime, I hope I have managed to convey some of the amazing complexity that is involved in something that we now take for granted and people interact with billions of times every day around the world.  And all along you should keep in mind that I haven’t even mentioned any of the complexity behind the scenes, such as the browser interacting with the computer’s operating system to make all of these steps happen.


Tags: tech_tuesday web overview

Monday, January 9, 2012

Posted at 8:00am

The Internet Is a Human Right

At first, I was surprised to see a New York Times OpEd by Vint Cerf with the title “Internet Access Is Not a Human Right.”  But once I started to read I began to understand the point that Cerf was trying to make.  It comes out clearest in the sentence “technology is an enabler of rights, not a right itself” which in modified form is also part of our investment thesis at Union Square Ventures.  We don’t invest in technology per se, but what that technology allows startups to build.  That’s an important distinction.  We don’t go looking for “mobile startups” but rather startups that use “mobile” to do something that wasn’t possible before.

Yet I think Cerf is selling the Internet short, which is ironic given that he is one of its co-creators.  The Internet is not really a technology but rather a set of principles that have become embodied in a bunch of different technologies.  I am going to quote at some length from a document that Cerf also co-authored about the history of the Internet:

The Internet as we now know it embodies a key underlying technical idea, namely that of open architecture networking. In this approach, the choice of any individual network technology was not dictated by a particular network architecture but rather could be selected freely by a provider and made to interwork with the other networks through a meta-level “Internetworking Architecture”

and

Four ground rules were critical to Kahn’s early thinking:
[1] Each distinct network would have to stand on its own and no internal changes could be required to any such network to connect it to the Internet.
[2] Communications would be on a best effort basis. If a packet didn’t make it to the final destination, it would shortly be retransmitted from the source.
[3] Black boxes would be used to connect the networks; these would later be called gateways and routers. There would be no information retained by the gateways about the individual flows of packets passing through them, thereby keeping them simple and avoiding complicated adaptation and recovery from various failure modes.
[4] There would be no global control at the operations level.

Those turned out to be a powerful set of principles that enabled the massive innovation that the Internet has brought about (for some more context you can also read my Tech Tuesday post on networking).  These may seem like dry technical principles but  embedded in them are some profound social implications.  This is probably most obvious with the last of Kahn’s “ground rules” stating that “there would be no global control at the operations level.” While it has been pointed out that this may have been in part motivated by wanting to avoid a single point of failure/attack to create a network that might be sustainable even during or after a nuclear war, it also meant that decentralization was architected into the very heart of the network.

Black box routing too may seem like a solely technological concept.  Yet it embeds within it an important separation of labor between various parts of the network that has had a profound impact on how innovation can take place and who wields power.  In particular that ground rule is what has given us so far a fairly neutral network.  And as longtime readers of this blog know, I am an ardent supporter of preserving that network neutrality and making sure it extends to wireless networks as well.

If you paid close attention, you will have noticed that the headline for my post is not the exact inverse of Cerf’s, who wrote “Internet Access” - I simply talk about the “Internet” by which I mean a set of ideas that is grounded in these original principles behind the architecture of the Internet.  At their heart all human rights are ideas and highly abstract ideas at that, such as equality and freedom.  How we concretely instantiate these ideas through legislation and social norms has changed dramatically over time and much of that change has been driven by technology.

So when I claim “The Internet is a Human Right” I mean that the legislation and social norms that we use to operationalize abstract rights such as freedom of speech should be embracing not fighting the principles of the Internet. For example, freedom of speech will be a hollow right if movie studios can make entire web sites disappear off the Internet without due process, as is currently contemplated by the legislation known as SOPA. That is the exact opposite of the principle of decentralized control.  To be clear, I am pretty sure that Cerf shares this view as he has come out against SOPA.  That’s why I wish his OpEd had focused on the Internet as a set of ideas rather than a technology.
