Uncertainty Wednesday: Entropy

Today in Uncertainty Wednesday I want to start on the idea that the shape of the probability distribution alone contains a measure of uncertainty. Let’s think about the simplest case again with just two states A and B and P(A) = p1 with P(B) = p2 = 1 - p1. I am using indices because we will shortly expand the number of states.

If p1 = 1 there is no uncertainty at all because you are certain that the world is in state A. The same holds true for p1 = 0, except that it is now state B you are certain of (because now p2 = 1). If we let p1 move continuously from 1 to 0, uncertainty first increases and then starts to decrease again. As we will see later, and as should be intuitive, uncertainty is maximized for p1 = p2 = 0.5.

More generally for a probability distribution p1, p2, … pn we would like to have our measure of uncertainty such that

U(p1, p2, … pn) is continuous in p1, p2, … pn

Meaning if you make an infinitesimally small change in two of the p’s (remember, you can’t just change one of them because they all need to add up to 1), you get only an infinitesimally small change in U.

Now compare the following two situations to each other. You can either face two equally likely states A and B or three equally likely states A, B, C. It seems intuitive that we would say that there is more uncertainty when there are three equally likely states, even if we know nothing else. This requirement can be expressed as follows (with some abuse of notation):

f(n) := U(1/n, 1/n, …, 1/n) is monotonically increasing in n

Finally, a good measure of uncertainty will have a straightforward approach to composability, meaning if you first face one uncertainty and then a second one it should be easy to combine the uncertainty measure for each to get an overall uncertainty.

To make this more concrete imagine the following setup. There are four states of the world A, B, C and D. Now imagine that the true state of the world is revealed to you in two stages: first you find out if it is in {A, B} or {C, D} and then you find out the actual state. Let’s call the first step X and the second step Y. Then we would like our measure of uncertainty to behave as follows

U(XY) = U(X) + ∑P(Xi) * U(Y|Xi) where ∑ is over the elements of X

and where U(Y|Xi) is the measure of remaining uncertainty conditional on the outcome of the first step. What this requirement amounts to is saying that the total uncertainty is a probability weighted sum of the uncertainties of each step (the first step having probability 1).

In his groundbreaking 1948 paper “A Mathematical Theory of Communication” Claude Shannon showed that the only measure of uncertainty that fulfills all three of these requirements is

H = -K ∑ pi log pi   where the sum ∑ runs over i = 1, …, n of the probability distribution

which is known as the Shannon entropy or just entropy of the probability distribution.
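
To make this concrete, here is a minimal sketch (my own illustration, not from the post; I set K = 1 and use natural logarithms) that computes entropy and checks all three requirements on the examples above:

```python
# Minimal sketch of Shannon entropy, with K = 1 and natural logs;
# using log base 2 instead would measure entropy in bits.
import math

def entropy(probs):
    # H = -sum(p * log p); terms with p = 0 contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0)

# No uncertainty at p1 = 1, maximum at p1 = p2 = 0.5:
print(entropy([1.0, 0.0]))   # 0.0 (no uncertainty; may print as -0.0)
print(entropy([0.8, 0.2]))   # 0.500...
print(entropy([0.5, 0.5]))   # 0.693... = log 2, the maximum

# f(n) = entropy of n equally likely states = log n, increasing in n:
print(entropy([1/2] * 2), entropy([1/3] * 3))  # log 2 < log 3

# Composability for the four-state example: first learn {A,B} vs {C,D}
# (step X), then the actual state (step Y). Probabilities are made up.
p = [0.1, 0.2, 0.3, 0.4]             # P(A), P(B), P(C), P(D)
px = [p[0] + p[1], p[2] + p[3]]      # P({A,B}), P({C,D})
h_two_step = (entropy(px)
              + px[0] * entropy([p[0] / px[0], p[1] / px[0]])
              + px[1] * entropy([p[2] / px[1], p[3] / px[1]]))
print(entropy(p), h_two_step)        # the two values agree
```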

It is important to emphasize again that this is a measure of uncertainty that operates solely at the level of the probability distribution. Nothing in its definition refers to outcomes or even further to the impacts of outcomes on different actors. See the Intro to Measuring Uncertainty from two weeks ago for an explanation of the difference.

Next week we will look at entropy for some probability distributions to get more of a feel for what this measure captures.

Posted: 31st May 2017
Tags:  uncertainty wednesday probability distribution entropy Shannon

Basic Income: The Potential of Cryptocurrency

In my book World After Capital, I have a chapter on Universal Basic Income, which I see as essential to getting from the Industrial Age into the Knowledge Age. Basic income gives people economic freedom, which is essential if we want them to freely allocate their attention. I am about to rewrite the part on how to finance a Basic Income. Much of the writing on that to date, including my own, has taken the approach of looking at existing budgets and figuring out how to rearrange them. That, however, is thinking too narrowly.

Instead, I am now convinced that the right way to implement a Basic Income is through changing how money is created. At present most industrial economies use some form of fractional reserve banking. Commercial banks can create extra “money” in the economy in the form of credit as they only need to keep a fraction of their deposits as a reserve. Central banks have also used other mechanisms to provide liquidity to commercial banks, especially following the 2008 financial crisis.

An alternative approach would be to move money creation to the individual level by issuing a basic income. This is variously referred to as helicopter money or quantitative easing for the people. Now one immediate objection to such a policy would be that it could lead to runaway inflation à la the Weimar Republic. Technology, however, is massively deflationary, providing a strong counter force. And money creation at the individual level could be combined with some form of demurrage to withdraw money from the system.

There have been small local currency systems along these lines, the most famous one being the Woergl Experiment. I am grateful to the team at Mein Grundeinkommen, which has introduced me to much of the literature on this. One analogy they use is that money in this system is like water: basic income is rain and demurrage is evaporation (together these form the basis of the water cycle).
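
To make the water cycle analogy concrete, here is a toy sketch (my own, with made-up parameters, not anything from the Mein Grundeinkommen literature) of how constant issuance plus proportional demurrage settles into a steady money supply:

```python
# Toy model: money supply under constant issuance ("rain") and
# proportional demurrage ("evaporation"). Parameters are made up.
ISSUANCE = 1_000.0   # new money issued per period as basic income
DEMURRAGE = 0.05     # fraction of all money withdrawn per period

money_supply = 0.0
for period in range(500):
    money_supply = money_supply * (1 - DEMURRAGE) + ISSUANCE

# The supply converges to ISSUANCE / DEMURRAGE: the level at which
# evaporation exactly offsets rain, completing the cycle.
print(round(money_supply, 2))        # ~20000.0
print(ISSUANCE / DEMURRAGE)          # 20000.0
```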

One exciting potential of crypto currencies is that they could make it much easier to build such a system. This could happen at the existing nation state level and I was pleased to find out that some central banks are actually thinking in this direction. Even more tantalizing is the idea that with crypto currencies such a system could come into existence outside the confines of nation states and help bring about a global Universal Basic Income.

Posted: 30th May 2017
Tags:  basic income cryptocurrency

Some Lessons I Learned from the Dotcom Bubble for the Coming Crypto Bubble

I spent a bunch of time at Consensus and Token Summit this week. If it wasn’t clear before, we are headed into a crypto currency bubble. Now a bubble isn’t in and of itself a bad thing. In fact almost every wave of technology has brought with it a financial bubble phase (e.g., consider Railway Mania in Britain). During this phase a lot of financial capital flows into a sector, which finances real innovation and accelerates the buildout of physical infrastructure. The same will be true for the crypto currency bubble and the buildout of the decentralized Internet (see my recent quote in the Economist).

Now what was eye-opening to me is how young most of the players in the crypto currency space are. Many of them were not around as either entrepreneurs or investors during the Dotcom Bubble. So as someone who was, I thought I would share some lessons learned.

1. Retain your critical faculties. After the Dotcom Bubble was over, looking back much of it seemed like a fever dream. You get swept up in it, are surrounded by others who are as well, and a powerful internal logic takes hold, where everything is evaluated only in relation to other parts of the bubble and not the world at large. So: figure out how to clear your head. Take a vacation if necessary.

2. Beware vanity metrics. Because of the focus on internal logic that I described above, it is especially important to look beyond vanity metrics (e.g., in the Dotcom Bubble these were pageviews or, worse yet, cumulative registered users). Instead, to get a sense of where we are, look for metrics of actual adoption by end users.

3. Make yourself antifragile. This is maybe the single most important lesson. Do not, under any circumstance, borrow to invest in a bubble. You will be in an extremely fragile position if you do because of violent price movements. Realize that you can be borrowing implicitly if you use (smart) contracts that short the market or have leverage built into them. Instead, make sure you have plenty of dry powder / segregated funds available for the time after the bubble. The best way to do so is to intentionally take some money off the table on the way up (incidentally, the more people do that, the less of an extreme run up there will be).

4. Watch out for outright scams. One of the great things about a bubble – see my intro paragraph – is that it finances many experiments. And as a society we need a lot of experiments to see what works. Most of the people I have met in this space are genuine in their desire to build a decentralized world that empowers the network participants (over the network operator). But there are and will be more people who see this as a get rich quick opportunity. As I said at Consensus: Buyer Beware!

I would be remiss if I didn’t also point out one mistake in the other direction: not getting involved because you think it’s a bubble that will be over shortly. Nobody can really predict timing (somebody will always be right after the fact). If you believe in the importance of decentralized systems, get involved now and use the advice above to put yourself in a position where you can stay involved for the long run.

Posted: 26th May 2017
Tags:  crypto currency bubble lessons

Uncertainty Wednesday: Probability Distribution

So far in Uncertainty Wednesday we have limited ourselves to looking at examples with only two states of the world and two possible signal values. When I introduced this I explained that these combine to form four elementary events and we looked at the basic requirements for assigning probabilities to these and the axioms that probability should then follow. 

Now let’s forget for a moment about the origin of our elementary events and simply look at any set S = {A, B, C, D, E, F} where the members are elementary events, or states of the world, or signal values. A probability distribution across the set S is then simply a set of values such that for all x ∈ S

0 ≤ P(x) ≤ 1

and

∑P(x) = 1 where the sum is over all x ∈ S

Quite clearly there are infinitely many possible probability distributions (this was already true in the case where S has only two elements). But how many of the P(x) can be chosen “freely”? Well, if |S| = n, meaning the set S has n members, then we get to choose n - 1 probabilities and the last one is automatically determined by the requirement that they all sum up to 1. In the case of n = 2 there is only one free parameter, i.e. if S = {A, B} and P(A) = p, then automatically P(B) = 1 - p.
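
Here is a minimal sketch (my own illustration) of both requirements and of the n - 1 free choices:

```python
# Sketch: a valid probability distribution over S and the fact that
# only n - 1 of its values are free.
import random

S = ["A", "B", "C", "D", "E", "F"]

def random_distribution(states):
    # Draw a positive weight per state and normalize, which always
    # lands on a valid distribution (each value in [0, 1], sum = 1).
    weights = [random.random() for _ in states]
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}

P = random_distribution(S)
assert all(0 <= P[s] <= 1 for s in S)
assert abs(sum(P.values()) - 1) < 1e-12

# Only n - 1 values can be chosen freely: P(F) is pinned down by
# the others through the requirement that everything sums to 1.
print(P["F"], 1 - sum(P[s] for s in S[:-1]))  # agree up to rounding
```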

So take for a moment the case of |S| = 1000: there are 999 probabilities that could be different individually and only the last one is then determined. We can think of this as a 999-dimensional space. I have intentionally chosen letters as the elements of S, and even that suggests too much structure for the most general version (because we think of the alphabet as ordered).

Why am I emphasizing this? First, because most probability distributions that we work with all the time, such as the normal distribution, impose dramatic constraints. For starters, these distributions require an ordering of the state space (meaning an ordering of the elements of S). And then they collapse the number of free dimensions dramatically. In the case of a normal distribution, for example, we will see that there are only 2 parameters (the mean and the standard deviation). So in the case of |S| = 1000 just discussed, by imposing a distribution that is approximately normal, we reduce a 999-dimensional space to a 2-dimensional one!
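
A quick sketch of that collapse (again my own illustration): a discretized, approximately normal distribution over 1000 ordered states is fully determined by a mean and a standard deviation:

```python
# Sketch: 1000 probabilities generated from only 2 parameters by
# discretizing a normal density over ordered states 0..999.
import math

def discrete_normal(n, mean, std):
    density = [math.exp(-((i - mean) ** 2) / (2 * std ** 2))
               for i in range(n)]
    total = sum(density)
    return [d / total for d in density]   # normalize so the sum is 1

P = discrete_normal(1000, mean=500.0, std=50.0)
print(len(P))                  # 1000 probabilities...
print(abs(sum(P) - 1) < 1e-9)  # ...forming a valid distribution,
                               # all from just 2 free parameters
```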

Second, because it is possible to make some important arguments about uncertainty solely on the basis of the shape of the probability distribution, without reference to values associated with each element of S. Coming back to the simplest possible case of |S| = 2, there is a difference in uncertainty between P(A) = 0.8 and P(A) = 0.5 (and hence P(B) = 0.2 and P(B) = 0.5 respectively). In a very precise way there is more uncertainty when P(A) = 0.5 than when P(A) = 0.8, without saying anything about what happens in each state or assigning a numeric value to the states.

In the coming Wednesdays we will dig deeper into both of these important points, probably starting with the second one (although I haven’t quite made up my mind about that).

Posted: 24th May 2017
Tags:  uncertainty wednesday probability distribution

Who Controls our Attention? Separating Aggregation/Discovery from Publishing

This weekend brought us both a big New York Times piece titled “The Internet is Broken: @ev Is Trying to Salvage It” and a Guardian article about Facebook’s guidelines for content moderators. Both speak to the question of who controls content on the Internet and how that control impacts our attention. Ev makes the point that extremes attract more attention than a more moderate or balanced view. Therefore advertising based systems, which require attention that they then partially resell (that’s the fundamental nature of advertising), are unlikely to ever be the best guardians of human attention. One alternative is subscriptions, but for that to really work for news, commentary, or even just friends’ status updates we need to separate aggregation/discovery from publishing.

One of the great breakthroughs of the web was that it brought us permissionless publishing. We no longer required a gatekeeper publisher but instead could each have our own website or blog. And initially that’s how the web grew. Millions of individual sites. But there was a content discovery problem in this fragmented world. This initially gave rise to aggregators. Eventually though, a new model emerged that re-centralized the web by recombining aggregation/discovery with publishing. Ev was in fact the founder of three such integrated platforms: Blogger, Twitter and Medium. Facebook, the largest platform on the web today is of course another example that combines these functions (and at USV we have been investors in Twitter and Tumblr which both did this).

Why keep aggregation/discovery and publishing separate? For two reasons: first, it avoids a central point for censorship. Second, it allows aggregation/discovery solutions to compete with each other. In such a world we would not need to debate whether Facebook’s policies are the right ones. Instead, all content could get published and you could choose an aggregator that best reflects your own needs and values.

It is possible to make a subscription model work in that world also. Blendle and Scroll are both working on that for larger publishers. Patreon is an alternative subscription model focused on directly subsidizing longtail content creation. I would also happily subscribe to Techmeme, if Techmeme kept only a small percentage of that for its aggregation services and passed the rest on to the content creators (on the basis of which content I click through from Techmeme). Same goes for Nuzzel.

Many of the investments we have made recently at USV are aimed at building a decentralized infrastructure for publishing and aggregation/discovery. For instance, Brad recently announced our investment in Protocol Labs which has developed the IPFS protocol and is now working on Filecoin. And our portfolio company Blockstack has developed a namespace and storage system that can be used to provide identity in a decentralized publishing world.

So yes, the Internet may be broken, but the way to “salvage” it is to undo the bundling of publishing and aggregation/discovery through decentralized systems.

Posted: 22nd May 2017
Tags:  re-decentralize publishing aggregation discovery

Vocational Schools and Getting to the Knowledge Age: Finding a Calling

I recently listened to a talk by David Autor about employment and technology. He praised the high school movement as being incredibly forward looking for addressing the rapid decline in farm employment. When asked about what the equivalent would be today, David answered “vocational schools.” Now as it turns out my Dad was a teacher at a vocational school in Germany where the apprenticeship system has been maintained throughout the Industrial Age. I learned a ton from my Dad about technical drawings and the workings of drills and lathes.

Based on observing my Dad and his students, as well as my thinking and writing in World After Capital, I think David is right about “vocational” but in the original sense of the word. Vocational comes from Latin “vocatio” which means a call or summons. What we need “vocational” school to be is a way of finding one’s calling. A calling is very different from a job or even a career. It is about one’s purpose in life instead of being about earning an income. Humans have a profound need for purpose which will not go away even once we automate many jobs. Schools must help people find their purpose (incidentally that was the role of the practical philosophical schools in ancient Greece).

There is an important analogy here with the transition from the Agrarian Age to the Industrial Age. In many countries we went from having 50 - 75% of the workforce in agriculture to 5% or even less. Now fast forward say 80-100 years into the Knowledge Age. I believe we will see the same trend for all workforce activity. Humans will be just as busy as before but much of that will be in the realm of voluntary, purpose driven activity. Conversely, workforce activity, as in selling labor for money, can become a small fraction (sub 20%) of all human activity.

I know such a change seems extremely hard to fathom, much like imagining the Industrial Age from the Agrarian Age must have been difficult. And we will not get there automatically through some deterministic force of technology, history or economics. Instead we have to want it. One place to start with that is schools and what they teach. And in that regard David Autor is exactly right (although he meant it differently). The high school movement had a vision of the future. It is time for a new one. For a World After Capital. And “vocational” as in calling is exactly the right idea. That has been the driving force for why Susan and I chose to homeschool our children. If you want a model for the “vocational school” of the future, look at today’s homeschoolers.

Posted: 19th May 2017
Tags:  future education learning employment vocational homeschool

Uncertainty Wednesday: Intro to Measuring Uncertainty

We have covered a lot of ground in Uncertainty Wednesday but we have yet to talk about measuring uncertainty. While most discussions introduce such concepts as mean and standard deviation early on, I have held off on them on purpose in order to develop a more comprehensive view of the sources of uncertainty and a hopefully better understanding of probability. Now is a good time to start thinking about how we might measure uncertainty.

Take our super simple model again of a world with only two states and two signal values. Now let’s think about the factors that go into uncertainty.

The first factor is how probability is distributed between the two states. If there are only two states and one state is extremely likely, then you face less uncertainty than if each state is equally likely. So that’s one aspect of uncertainty that we will want to make precise. It would also seem that having more than 2 possible states of the world should increase uncertainty, e.g. a world with 100 states would seem to have more uncertainty in it than one with only 2. In upcoming posts we will spend a fair bit of time looking at different so-called probability distributions to study their characteristics and impacts on uncertainty. And as we have already seen, when you receive a signal you can use it to revise your estimate of the probabilities in a way that reduces uncertainty.

The second factor is the outcomes for you that result from the different states of the world. A classic example is where in one state you gain money and in the other you lose money. For instance, you drill a hole in the ground and either oil comes out or it doesn’t. Because monetary outcomes are often of interest we may sometimes refer to these outcomes as “payouts.” More generally, outcomes are captured by the concept of a random variable, meaning a variable (such as the money paid or received) that takes on different values with different probabilities.
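
Here is a minimal sketch (my own, with made-up numbers) of a random variable in this sense, using the drilling example:

```python
# Sketch: a random variable as a map from states of the world to
# payouts, next to the probabilities of those states (made-up numbers).
probabilities = {"oil": 0.2, "dry": 0.8}
payouts = {"oil": 1_000_000, "dry": -50_000}   # the random variable

expected_payout = sum(probabilities[s] * payouts[s] for s in payouts)
print(expected_payout)  # 0.2 * 1,000,000 + 0.8 * (-50,000) = 160,000.0
```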

It is important to understand that these are two separate factors impacting uncertainty. Suppose you can either win $1 or lose $1. Holding these payouts fixed, we can examine the impact on the uncertainty you face from changes in the probabilities of the two states of the world. Now conversely, let’s hold the probabilities fixed, say one state has 80% probability and the other has 20%. We can then examine the impact on the uncertainty you face from changes in the payouts between the two states.

But wait, there is more. Payouts are only the immediate outcomes. The value or impact of these payouts may be different for different people. What do I mean by this? Suppose that we look at a situation where you can either win $1 million with 60% probability or lose $10 thousand with 40% probability. This seems like a no brainer situation. But for some people losing $10 thousand would be a rounding error on their wealth, whereas for others it would mean becoming homeless and destitute. So even though the probabilities (factor 1) and the payouts (factor 2) are the same, the uncertainty that is faced can be quite different between people. So the third factor influencing uncertainty is the consequences, or utility, that result from different outcomes. This important difference goes by the prosaic name of “functions of a random variable.”
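
To illustrate this third factor, here is a small sketch (my own; log utility is a standard textbook choice, not anything from the post) of the utility hit from the $10 thousand loss at two different wealth levels:

```python
# Sketch: the same -$10,000 payout has radically different
# consequences depending on existing wealth, under log utility.
import math

payout_lose = -10_000

for wealth in (50_000_000, 11_000):
    # Change in log utility if the bad outcome occurs.
    utility_hit = math.log(wealth + payout_lose) - math.log(wealth)
    print(wealth, round(utility_hit, 4))

# Output: about -0.0002 for the wealthy person (a rounding error)
# versus about -2.3979 for someone with $11,000, who would be left
# with only $1,000.
```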

So from this intro alone it should be clear that it is unlikely that we can find a single all-encompassing measure of uncertainty. Instead, as we will see over the coming posts, there are measures and results that are associated with each of these three factors.

Posted: 17th May 2017
Tags:  uncertainty wednesday measuring uncertainty

Thoughts on Regulating ICOs

It appears that regulators are looking into ICOs, the Initial Coin Offerings by which many projects are raising financing at the moment. Regulators are rightly concerned with people losing money in speculative projects that fail to deliver or worse yet in outright scams. As regulators think about how to approach this area it will be important to keep a few things in mind:

1. Coins are sui generis. Coins are not equity, although they share some aspects of equity. Coins are not a pre-purchase, although they share some aspects of a pre-purchase. Coins are not currencies, although they share some aspects of currencies. And so on for other currently regulated securities. Applying an existing regulatory framework to coins will be detrimental since it will not fit their unique characteristics.

2. Coins are global. This is not a US phenomenon. The issuance and purchase of coins is happening all around the world. A US-only approach to regulation could easily wind up being detrimental only to US-based projects and purchasers.

3. Coins represent innovation. Coins are both a technological and a financial innovation. They are integral to decentralized protocols and they allow for a novel way to finance the creation and operation of these protocols. In the early days of the web regulation took a pro-innovation stance, for instance by not having a sales tax on the internet.

4. Coin investors have a large risk appetite and tolerance. While this is not true for everyone, many of the people buying into ICOs are doing so with gains from investments in Bitcoin and Ethereum. They are almost by definition early adopters of technology.

Regulators need time for fact finding, including developing an in-depth understanding of how different coins operate and the range of projects that exists. While they do so, regulators could issue and publicize an immediate warning to consumers that ICOs are speculative, and require US-based ICOs to prominently display that warning, to reduce time pressure on coming up with the right regulatory approach.

Posted: 15th May 2017
Tags:  tokens coins crypto regulation

DREAMELIA

I am super excited to be backing our daughter Katie’s film project on Kickstarter. She is making a horror movie about what it’s like to be a teenager. Which sounds about right. Joking aside, here’s the premise:

At sixteen, Amelia is almost free of the torment of her cutthroat New York City private high school. To her little group of friends, she’s the perfect student, bound for a prestigious Ivy League college, but beneath the surface things are bubbling over. Since her mother’s death, Amelia’s father has buried himself in his work and left Amelia to raise herself. School is her escape… until she’s roped into a cheating scandal with the richest kid at school.

Innocent and facing repercussions, Amelia agrees to counseling and is prescribed Syrenaphyn, a powerful anti-depressant that transports Amelia into a new world of ease. While Amelia sees this new world as beautiful, we see the dark, warped and distorted truth. This film looks at the allure of the escapism of prescription drugs through a completely new lens.

The reason I am so excited about this is that the movie combines both of Katie’s interests: psychology and filmmaking. When we switched to homeschooling years ago, the premise was that it would allow our children to have more time to discover their interests. Finding interests that are your own is hard. I was thrilled when, after about a year of homeschooling, Katie came to us and asked if it was OK for her to watch and review every Wes Anderson movie.

Katie has since taken a number of filmmaking classes and directed a number of shorts, including a music video with her siblings. She has also become really interested in psychology and just took Intro to Psychology at Columbia (thanks to an incredible program that allows high school age kids, including home schoolers to take classes there).

If you like the backstory and premise, please help Katie fund DREAMELIA on Kickstarter.

Posted: 12th May 2017
Tags:  movies kickstarter homeschool

Uncertainty Wednesday: Day Off for USV Portfolio Summit

I am taking the day off from Uncertainty Wednesday because today is our portfolio company summit. It is a day I look forward to every year, where many of the founders and CEOs of our portfolio companies come together.

We have been holding this in unconference style for many years. The day maximizes the room for small group conversations around topics that have been generated by the participants. The net result is a high quality and open exchange of experiences and opinions, which contributes to strong relationships that last through the year. One of the key messages we want people to take away is that they are not alone but have a strong peer group they can access any time.

There will also be a couple of topics that we will dig into with the entire group. Two years ago we covered blockchain. Last year we had an extensive discussion about privacy. This year we will talk about building organizations that foster grit. We will also spend time on the importance of Machine Learning for all companies, not just those that are offering a Machine Learning product or service.

P.S. Uncertainty Wednesday will resume next Wednesday. We will look at how to measure uncertainty.

Posted: 10th May 2017
Tags:  usv
