Decentralizing Identity

If you ask someone who they are on the Internet, they will likely give you an email address or point to their profile on Twitter, Facebook, Google, or LinkedIn. Some people might instead reference a profile on a more industry-specific network, such as Behance for creatives or Doximity for doctors. Others use a personal home page provider like about.me, and an even smaller fraction have their own domain. This is a pretty unsatisfactory state of the world from both an individual and a service creator perspective.

As an individual, you are never really in control of your identity. In every case other than your own domain a centralized service provider decides what can and cannot be on your profile and can also revoke your profile at any time (most terms of service give the provider nearly complete control). Even with your own domain there is a risk that it could be seized and your identity wiped out.

As a service creator, you can either let users authenticate with one of the big centralized providers or revert to signing in with username/email and a password (and email for most people is right back at a large service provider). How much information you then receive about the person, and the format of that information, is controlled by the authentication provider.

Starting a new centralized identity provider not only doesn’t solve these problems but also faces a classic chicken and egg problem between user and service adoption. Therefore people have for quite some time been looking for a decentralized solution that would put individuals truly in control and allow for permissionless innovation. At first this ran into a problem known as Zooko’s triangle: the conjecture that you couldn’t have a naming system that is secure, decentralized and human-memorable all at once. As it turns out, though, this is exactly the kind of problem that can be solved using the Bitcoin protocol.

Namecoin is a decentralized key/value store for registering, updating and transferring information, based on Bitcoin. Namecoin allows the creation of globally unified namespaces that can be used for all sorts of applications, including a decentralized domain system and personal identity. Namecoin itself only provides the consistency mechanism. It does not define a format for what should be contained in an identity entry.

There are at least a couple of proposals for doing that. One is Namecoin ID and the other is a new project called OneName, which provides both a JSON specification and an initial implementation of a profile viewer. You can use the viewer to see my profile (here are Fred, Nick and Brad). For both Namecoin ID and OneName the underlying identity information is contained in the Namecoin blockchain.
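To make this concrete, here is a minimal sketch in Python of the kind of JSON profile that could be stored as a Namecoin value. The field names and the “u/janedoe” key are purely illustrative assumptions on my part, not the actual OneName schema:

```python
import json

# A hypothetical identity profile in the spirit of the OneName idea.
# Field names are illustrative only, not the actual OneName specification.
profile = {
    "name": {"formatted": "Jane Doe"},
    "bio": "Example profile stored as a value in the Namecoin key/value store",
    "website": "https://example.com",
    "twitter": {"username": "janedoe"},
    "github": {"username": "janedoe"},
    "bitcoin": {"address": "1ExampleAddressForIllustrationOnly"},
}

# The serialized JSON would be stored under a key such as "u/janedoe"
# (the namespace convention here is assumed, not prescribed by Namecoin itself).
print(json.dumps(profile, indent=2))
```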

What about squatting and impersonation? It is true that someone could register your name and even add links to your various accounts. But only you can broadcast on those channels and confirm your data by linking back to it, e.g. by tweeting out a link or adding one to your GitHub account. None of this adds up to certainty, and systems going forward will always have to deal with a probabilistic notion of identity.

The squatting issue is potentially more serious but also intriguing. Centralized systems have resolution mechanisms for squatting with varying levels of transparency and inconsistent national legal frameworks. Given the fully decentralized nature of Namecoin there is no authority to appeal to. That really only leaves a voluntary, market-based mechanism for resolution. By building on top of a distributed currency, the payment and transfer mechanisms are built in from the beginning.

For now OneName is an alternative to something like about.me, with the big distinction that you control your data. You can access your OneName profile directly using any Namecoin client if you want. Much like with Bitcoin, though, if you don’t want to operate anything yourself there can be third party registrars that handle everything for you. Anyone can set one of these up; there are no licensing requirements of any kind and no barriers to entry. This means a competitive market can emerge in which registrars compete on price, convenience, trust and safety, or some combination of these and other forms of differentiation.

How does all of this solve the identity problem? Because Namecoin is completely decentralized it is ideal for permissionless innovation, as the OneName example shows (the spec and implementation were developed independently of the Namecoin project). OneName aims to provide single-user value from day one by offering a presentable profile that one can link to. Others can then use this information for purposes such as secure messaging and payments. Since OneName just launched, it is too soon to tell whether that is enough to get a critical mass of users to adopt it.

Whether it is OneName or Namecoin ID or something yet to come, once enough users add information to a blockchain-based mechanism in a standardized format it will make sense for services to let users sign in with such a decentralized identity. Here too we will see permissionless innovation at work. The exact mechanism for authentication does not need to be specified in advance and can emerge over time, leveraging existing auth systems (including those of Facebook, Google, Twitter, OpenID, etc.) and adding new ones.

It is still early days for all of this, but the potential for these emerging decentralized identity systems is to further push power to the people and away from central authorities.

Posted: 10th March 2014
Tags:  identity decentralization

We Live in the Cyberpunk Future: More to Come

I still remember how much I loved reading cyberpunk. Watching yesterday’s Satoshi story develop, I was struck by how much of the future envisioned in those stories has become a reality. Having a mysterious creator of a system with global reach that provides an important breakthrough in distributed computing is as riveting as anything I read back then. What’s even better is that it feels to me like we are in the early chapters, with much more yet to come!

So that got me thinking about what else from those stories we should be looking for. Here are two things that strike me as increasingly plausible. First, truly distributed computing with code execution anywhere in the world. Public cloud computing is still dominated by Amazon, but Google is investing heavily, Rackspace is in the mix, IBM has acquired Softlayer, and A16Z just made a big investment in DigitalOcean. The price of access to compute power will continue to drop. More importantly, with something like Bitcoin we will have a system for paying for code execution anywhere (I will elaborate on this idea in a separate future post).

Second, the creation of an actual cyberspace as a virtual world (built on top of truly distributed computing). When the web first came out people tried to build this on top of it, but browsers and connections were slow and, more importantly, VR headsets were heavy and super expensive. Now, with the Oculus Rift and others in the works, we will be getting fast, cheap VR. All we need then is a protocol that lets us connect virtual spaces together the way we link web pages, and we will be off to endless worlds.

All of this has me massively excited about the future. In the meantime, if you do want to read something sensible on Satoshi, I highly recommend Felix Salmon's piece “The Satoshi Paradox.”

Posted: 7th March 2014
Tags:  future

Linking Charges Against Barrett Brown Dropped

Brief post today with an update on an important case that I wrote about last year: the federal prosecution of Barrett Brown. As I wrote back then, linking is the essential building block of the web. Criminally prosecuting someone for linking threatens this foundation. I am therefore relieved that yesterday the prosecution decided to drop these charges. While this is great news it is important to point out that Barrett Brown has been imprisoned since the fall of 2012 — that is not a typo.

The two remaining charges are threatening an FBI agent and possession of credit card numbers with intent to defraud. I can’t speak to the first one at all. On the second one I don’t know what the evidence is, but possession of credit card numbers (even including the names, expiration dates and CVVs) should not by itself count as evidence of intent. Why? Because by following a link on the web you might wind up with such a file on your computer! Furthermore, anyone who has ever held your card in their hands (like the staff at every restaurant you have been to) has this information. Nobody should be able to transact with just that information without a second factor of authentication.

I hope the remainder of Barrett Brown’s case comes to a rapid and just conclusion and takes into consideration his year and a half in custody.

Posted: 6th March 2014
Tags:  barrett brown linking

Homeschool Wednesday: Permission-less Innovation

This will be the last Homeschool Wednesday post for quite some time. Not just because we will be going away on a trip to Africa, but also because when we come back I will switch Continuations around a bit. I will use it (almost) exclusively to expand on the ideas from my “Are We The Horse?” talk. I can make that change without asking anyone for permission because I am my own publisher. That is an example of “permission-less innovation” and is one of the key messages we are trying to convey to our children as part of homeschooling.

How are we doing that? By encouraging the kids to blog, create their own media, publish a portfolio of their projects, attend New York Tech Meetup, and so on. It helps too that they are seeing Susan start Ziggeo (she and Oliver presented the launch of Ziggeo’s API at yesterday’s NYTM and we were all there). In addition, the homeschool process in and of itself helps tremendously. There is a lot more free time which gives kids the opportunity to explore ideas and projects without asking for our or a teacher’s permission.

All of this matters because we are moving from a world of hierarchies, where you have to ask your boss for permission, to a world of networks, where you are an independent actor. If you haven’t done so already, I highly recommend that you read Alexis Ohanian’s excellent book “Without Their Permission” and leave a copy around the house for the kids to discover.

Posted: 5th March 2014
Tags:  homeschooling innovation permission

Tech Tuesday: P = NP?

Last Tech Tuesday, we looked at complexity classes, which group problems by their intrinsic difficulty. There we encountered one of the great unsolved questions of computer science: is P = NP? The class P is easy to define. It is the set of all problems that can be solved by a Turing machine in polynomial time, i.e. O(n^k) for some k. So you might think that NP stands for non-polynomial time, but you’d be wrong. Instead, the class NP has the same definition as P, except for a non-deterministic Turing machine.

You may recall from the posts on Turing machines that for each state of the machine and symbol on the tape there is exactly one action to be taken (which results in a new state and possibly a symbol written to the tape and/or a movement of the tape head). In a non-deterministic Turing machine there can be multiple possible actions for a given state and symbol, and the machine is assumed to take all of them simultaneously!

Now it is easy to see why such a machine could be very fast. Take the example we used to motivate the study of complexity: guessing a string drawn from the letters A, G, C, T. We saw that at length 40 this would take my MacBook about 400,000 years. But a non-deterministic Turing machine could do it very quickly because it can essentially try A, G, C and T simultaneously for each position. The running time for this machine would be O(n), where n is the number of letters in the string.
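As a quick back-of-the-envelope check, a few lines of Python show how badly the deterministic brute force blows up. The guesses-per-second figure is my own assumption for illustration, not a number from the earlier post:

```python
# Brute-force guessing a length-40 string over the alphabet {A, G, C, T}.
n = 40
candidates = 4 ** n            # every possible string of length n
guesses_per_second = 1e11      # assumed rate for a fast laptop (illustrative)
seconds = candidates / guesses_per_second
years = seconds / (3600 * 24 * 365)
print(f"{candidates:.2e} candidates, roughly {years:,.0f} years")
# roughly 1.2e24 candidates and a few hundred thousand years,
# in the same ballpark as the estimate above
```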

But wait: when I introduced Turing machines I claimed that they are the most powerful model of computation. This non-deterministic Turing machine sure sounds more powerful. Yet there is nothing a non-deterministic Turing machine can compute that cannot also be computed by a normal one. How do we know that? Because there is a proof showing that a non-deterministic Turing machine can always be simulated by a (normal) Turing machine.

Now my description of using a non-deterministic Turing machine already contained the seed of a different definition of the class NP. It also happens to be the class of problems for which we can verify a positive result in polynomial time. Consider the following problem: given a set of integers, e.g. {-17, 2, 3, 12, -5, 21}, is there a subset that adds up to 0? To answer this question an algorithm might proceed as follows: while there are subsets left, pick a subset and test whether it adds up to zero; if it does, report “yes”; if we exhaust all subsets, report “no”. Now for a set with n integers there are 2^n subsets, so our loop might run for a very long time as n gets large.

If we focus on just the verification bit for a moment, though, it is trivial. To verify whether a subset adds up to zero, all we need to do is add up the numbers in that subset. The biggest possible subset has n elements (the set itself), so the verification step is linear in n, i.e. O(n). We call a subset that adds up to 0 a proof or witness because it shows that the answer to the question is yes. We see here a fundamental asymmetry: in order to answer yes, all I need to do is supply a single witness; to answer no, I need to show that no such witness exists!
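Here is a minimal sketch of that asymmetry in Python: the exponential brute-force search on the one hand and the linear-time check of a single witness on the other. The function names are mine, chosen for illustration:

```python
from itertools import combinations

def verify(witness):
    """Checking a proposed witness is easy: just add up its elements, O(n)."""
    return len(witness) > 0 and sum(witness) == 0

def brute_force_subset_sum(numbers):
    """Deciding the question on a normal machine: try all 2^n - 1 non-empty subsets."""
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if verify(subset):
                return subset   # a single witness proves the answer is "yes"
    return None                 # exhausting every subset is what it takes to say "no"

print(brute_force_subset_sum([-17, 2, 3, 12, -5, 21]))   # prints (2, 3, -5)
```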

We also see how this maps nicely onto our distinction between normal and non-deterministic Turing machines. The normal one needs to crank through the loop as outlined above. The non-deterministic one gets to work on all the subsets simultaneously. A great many problems have this structure. For instance, if I give you an integer n and a second, smaller integer f, you can easily verify whether f is a factor of n. But if n is large there are many candidate factors to try. On DuckDuckGo you can see a long list of such problems.

So now back to our original question: is P = NP? Based on everything we have just discussed it would intuitively appear that NP should contain at least some harder problems that are not in P. That is how the complexity class diagram from last week is drawn. But we don’t know that with certainty. To date there is no proof, and some people doubt there will ever be one. In the absence of a proof there remains some possibility that for every problem in NP we will eventually find an algorithm that executes in polynomial time on a normal Turing machine.

This turns out to be a question with potentially great practical implications, especially in the area of cryptography. Public key cryptography relies on an asymmetry: it is easy to verify that a message was encrypted (or signed) using someone’s private key when you have their public key. But all of that would be meaningless if you could easily derive someone’s private key from their public one (this happens to be related to the factoring problem I mentioned above). Ditto for Bitcoin: finding a number that completes a block is hard, while verifying (by hashing) that the number does complete the block is easy. Again, it is hard to overstate just how many problems fall into this category.
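To make the Bitcoin-style asymmetry concrete, here is a toy hash-puzzle sketch in Python. It is a drastic simplification of the real block format and difficulty target, just a stand-in showing that finding a valid nonce takes many attempts while checking one takes a single hash:

```python
import hashlib

DIFFICULTY = 4  # required number of leading hex zeros; a toy value, far easier than Bitcoin's

def verify(block_data: str, nonce: int) -> bool:
    """Verification is a single hash: cheap for anyone to repeat."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

def find_nonce(block_data: str) -> int:
    """Finding a valid nonce means trying candidates until one verifies: expensive."""
    nonce = 0
    while not verify(block_data, nonce):
        nonce += 1
    return nonce

nonce = find_nonce("example block data")
print(nonce, verify("example block data", nonce))   # the check anyone can redo instantly
```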

Posted: 4th March 2014
Tags:  tech tuesday theory complexity p vs np

Healthcare: Commerce Before Marketplace?

I am generally not a big fan of strategies that involve pursuing business A now in order to pursue business B in the future. I tend to believe that you should go for B right away. For example, if you are planning to disrupt textbooks, I would likely prefer a strategy that figures out how to peer produce them over one that first tries to make the renting and exchange of existing textbooks more efficient. Why? Because few businesses have pulled off the “A now, B later” trick: most everyone simply gets stuck in business A. Netflix is one of the few that managed it. They first shipped DVDs and now successfully stream (and even Netflix almost botched the transition).

Two possible justifications come to mind for the “A now, B later” strategy. The first is technological progress. When Netflix got started, streaming was not yet a truly viable option. The second one is behavior change. When Amazon first got going, buying online was enough of a change for people. It would have been too much to also ask them to do so in a marketplace and so Amazon chose the commerce format. Bezos smartly picked books where an online store had big advantages over a brick and mortar one.

This latter justification seems particularly relevant when looking at healthcare opportunities. After many years of content sites a la WebMD we are now seeing a great many startups that want to actually provide care. This ranges from medical Q&A sites all the way to telemedicine applications on mobile phones. Going to the computer or your phone for a consult rather than to a flesh-and-blood doctor is a big behavior change. That suggests that an Amazon-like strategy where you start with commerce and only introduce a marketplace later may be the winning model.

So what would a commerce model a la Amazon look like in healthcare? It would be a branded service that provides diagnosis, prescription and, if necessary, referral. The service would use some combination of texting, possibly video chats / image uploads, and existing lab networks (e.g. for blood analysis). The logical entry point would either be primary care in its entirety or a large specialty such as dermatology. I believe the right service could quickly grow large, especially if it can be priced in a way where insurance reimbursement becomes a secondary consideration. The subsequent marketplace would be for cases that require a specialist or a treatment other than prescriptions.

I am curious to hear from others whether they buy this argument about a commerce model coming first or think we will go straight to a marketplace. Also, if you know of any startups pursuing the commerce model please let me know.

Posted: 3rd March 2014
Tags:  healthcare strategy startups

Angel Investing

Later this morning I am participating in a panel called “Angel Investing in Action” at the 2014 Pipeline Fellowship Conference here in NY. The goal of the Pipeline Fellowship is to have more women angel investors (currently only about 1 in 5 angel investors in the US are women). This is a terrific endeavor, as it will ultimately also help women entrepreneurs raise angel financing more easily. In preparation it occurred to me that while I had blogged tangentially about angel investing, I had never written a dedicated post.

Given that there is a lot of early stage investing these days let me first define what I consider to be angel investing: individuals making occasional investments in startups outside of their day job. In other words, I am not including anyone who invests in startups for a living (such as myself). Having a large number of active angel investors is essential to the overall startup ecosystem because it is where the earliest dollars come from.

The impact of angel investing has gone up tremendously over the last two decades as the cost of starting a business has plummeted. Many new products or services can now be launched entirely on angel dollars. This impact has been further amplified by better infrastructure for angel investing, such as standardized documentation and marketplaces including AngelList and CircleUp (a USV portfolio company).

What then are some of the things you should pay attention to if you are considering angel investing?

  1. Only invest money you can afford to lose. This is true not just on a per deal basis but also in the aggregate!

  2. Think of more deals as more risk (not as diversification). To get to diversification you would have to do a great many deals (dozens if not more) and not make systematic mistakes.

  3. Avoid uncapped convertible notes (if you don’t know what that means, make sure to read up on it before your first investment). They tend not to provide adequate reward for the risk you are taking.

  4. Don’t back someone who hasn’t quit their job yet. Your money is committed — they should be too.

  5. It’s OK to invest in friends. But back only friends who you think are entrepreneurs. If you have doubts be a real friend and say so.

  6. Don’t make entrepreneurs jump through hoops (leave that for the VCs — just kidding). As an angel you should ideally be able to give an entrepreneur an answer after a single meeting.

  7. Know what makes for a strong entrepreneur and (ideally) invest in areas you understand.

  8. Say “no” a lot more than you say “yes” — embedded in this is that you need to be seeing a lot of different startups.

Like all rules, these are meant to be broken. Just make sure that when you break one of them you understand that is what you are doing (and that it likely further increases the already high risk).

Posted: 28th February 2014
Tags:  Angel investing

Jury Duty (Follow Up)

I posted yesterday in Homeschool Wednesday about my first time at jury duty. Here is a quick follow up. It turns out I wasn’t chosen for either of the two trials that were under consideration and surprisingly wasn’t asked to come back today. So the only comments and observations I have are about the selection process itself.

There is lots of room to use technology to improve this process. In particular, when you are first selected you should be able to fill out a juror questionnaire online that includes questions about commitments you might not be able to move during the time period under consideration. That would make it much easier to review those reasons upfront (some are clearly more legitimate than others) and to better match potential jurors’ availability with the expected length of trials.

For instance, I could have easily participated in a one week trial now but not a one month or longer trial. At some other time I might be able to do a longer trial. This could have been determined prior to my showing up yesterday. Instead pretty much all of the day was filled with moving between buildings and/or rooms, only to determine then that I and many others were not available at this particular point for a lengthy trial. If that had been known upfront the time could have been spent on actual jury selection.

This inefficiency in turn makes me wonder whether trials really need to last more than one month under any circumstance. Would love to hear about that from any lawyers reading this or people who have been on longer jury trials (apparently the longest jury trial in US history lasted 13 months). 

Posted: 27th February 2014
Tags:  jury duty civics

Homeschool Wednesday: Jury Duty

I became a US citizen in the mid 2000s (I have forgotten the exact year) and recently got called for jury duty for the first time. I am very excited about this as it is both a central part of the functioning of the judicial system (and hence of civil society overall) and a great personal and family learning opportunity. I am writing this blog post in the waiting room, waiting to see whether I will be called for jury selection.

Germany, where I grew up, doesn’t have a jury system. Instead, judges not only run the proceedings in the courtroom but also decide the verdict and pick the sentence. This might seem to make the judge far more powerful than in the US, but that power is heavily circumscribed by more detailed laws and guidelines. Judges in Germany are civil servants employed by the states (there are also some federal judges). In criminal cases in Germany the government is represented by a so-called “Staatsanwalt” (literally “state’s attorney”), who has to have the same qualifications as a judge. The system is also less “adversarial” in that these prosecutors tend to be much more constrained by the law than their US counterparts.

Given this background, I have always been intrigued by the US jury system. It seems to me that it can work only if citizens really take this duty seriously (much like voting in elections). So while I postponed my jury summons once, I am now excited to be here. I will provide an update on what happened next week (obviously without anything that I cannot legally write about).

Posted: 26th February 2014
Tags:  homeschooling jury duty

Tech Tuesday: Complexity Classes

The last few weeks in Tech Tuesday we have looked at various aspects of computational complexity. So far whenever we have looked at a sample problem, such as finding words on a web page, we have studied the characteristics of a particular algorithm for solving the problem (in many cases starting out with looking at a brute force solution). But we have skirted the central question of computational complexity: how hard is the problem? Or asked differently, for a given problem, what is the best we can do?

Much of the really hard theoretical work that has gone on in computational complexity centers around assigning problems to complexity classes. A class is defined roughly as:

the set of problems that can be solved by an abstract machine M using O(f(n)) of time or space, where n is the size of the input

where I have linked to the previous posts describing the relevant concepts. The basic idea is that we represent computation by abstract machines (which are analyzable) and then determine how much time or space is required by the best possible solution on that machine. While this is easy to state, finding best solutions and proving that they are best is hard. As we will see in just a second it turns out to harbor one of the great unsolved problems of computer science.

This definition of complexity classes gives rise to a hierarchy which is shown in the following picture (taken from a Penn State class on the theory of computation):

[Image: diagram of nested complexity classes]

What does this mean? Let’s start at the outermost layers. In our study of the halting problem we saw that some problems are simply undecidable. That is why in this chart the class of decidable problems is contained strictly inside the class of recognizable problems (and hence inside the class of all problems). This we know for certain; it has been proven.

Now you see lots more complexity classes nested inside the decidable problems, such as EXPSPACE, which is the set of all problems decidable by a Turing machine using O(2^p(n)) tape space, where p(n) is a polynomial in n (the size of the input). The diagram shows this as being strictly larger than EXPTIME, the set of problems decidable in exponential time. That, however, is so far only a conjecture: we believe there are problems in EXPSPACE that are not in EXPTIME, but we don’t have a proof.

We do, however, know that EXPSPACE is a strict superset of PSPACE. By now you should be able to guess what the definition of PSPACE looks like: it is the set of problems decidable by a Turing machine in O(p(n)) of space.

Here is something else we know and have already encountered in Tech Tuesday: the class of regular languages, which can be recognized using finite state machines, is strictly smaller than the class CFL of context-free languages.
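As a small illustration (mine, not from the earlier posts): balanced parentheses form a context-free language that no finite state machine can recognize, because a machine with finitely many states cannot keep track of arbitrarily deep nesting. A single counter, standing in for a pushdown automaton’s stack, handles it easily:

```python
def balanced(s: str) -> bool:
    """Recognize the context-free language of balanced parentheses.

    No finite state machine can do this: it would need a distinct state for
    every possible nesting depth, and there are unboundedly many depths.
    A counter (a stand-in for a pushdown automaton's stack) is enough."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:       # a closing parenthesis with nothing to match
                return False
    return depth == 0

print(balanced("(()(()))"), balanced("(()"))   # True False
```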

So we have a fair bit of knowledge at the outermost layers of this hierarchy and also at the innermost. But there is a huge gaping hole in the middle, where a great many problems reside. In particular, you will almost certainly have heard of the question whether or not P = NP. This is considered the central unsolved question in computer science. We will take it up next Tuesday!

PS There are many more complexity classes than are shown in the diagram above. You can find a comprehensive listing in the Complexity Zoo.

Posted: 25th February 2014
Tags:  tech tuesday complexity
