Today’s Uncertainty Wednesday continues our exploration of p-values and why they are problematic. Last Wednesday we saw that if you have an incentive to reject a null hypothesis, it takes less work than you might initially think to find data that gets you there. I ended that post by suggesting that the problem is even bigger than that. How so?
We now live in the age of “big data” – researchers in many fields have access to massive data sets. This lends itself to an approach that has become known as “data dredging.” Instead of starting with the null hypothesis of a “fair and independent coin” we start with a large database of pre-recorded coin flips. Now we work backwards to find a hypothesis that we can reject with a p-value of 0.05 or maybe even 0.01 in our data set!
How would we do such a thing and what would such a hypothesis look like? Well, with a dataset containing just Hs and Ts we would have to be a bit creative. But we could generate hypotheses that take the form of a probabilistic finite state machine. For instance: the coin starts with a 20% probability of H and 80% of T; after an H it has a 70% probability of another H, but after a T it has only a 10% chance of repeating T. You get the idea. You could write computer code that generates such hypotheses until you find one that you can reject with a really significant p-value in your dataset. Then you go and publish!
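To make that concrete, here is a minimal sketch in Python of what such a dredging loop could look like. Everything in it is made up for illustration: the “pre-recorded” flips are simulated from a fair coin, the null hypotheses are simple two-state machines, and the p-value comes from a crude Monte Carlo goodness-of-fit test rather than anything you would actually publish with.

```python
import math
import random

random.seed(0)
flips = [random.choice("HT") for _ in range(200)]  # our fixed, "pre-recorded" dataset


def log_likelihood(seq, p_start_h, p_h_after_h, p_t_after_t):
    """Log-likelihood of a flip sequence under a two-state machine null."""
    ll = math.log(p_start_h if seq[0] == "H" else 1 - p_start_h)
    for prev, cur in zip(seq, seq[1:]):
        if prev == "H":
            ll += math.log(p_h_after_h if cur == "H" else 1 - p_h_after_h)
        else:
            ll += math.log(p_t_after_t if cur == "T" else 1 - p_t_after_t)
    return ll


def simulate(n, p_start_h, p_h_after_h, p_t_after_t):
    """Draw n flips from the state machine itself."""
    seq = ["H" if random.random() < p_start_h else "T"]
    while len(seq) < n:
        if seq[-1] == "H":
            seq.append("H" if random.random() < p_h_after_h else "T")
        else:
            seq.append("T" if random.random() < p_t_after_t else "H")
    return seq


def p_value(seq, params, sims=500):
    """Monte Carlo p-value: how often does data generated by the null itself
    look at least as unlikely (under the null) as our fixed dataset?"""
    observed = log_likelihood(seq, *params)
    as_extreme = sum(
        log_likelihood(simulate(len(seq), *params), *params) <= observed
        for _ in range(sims)
    )
    return as_extreme / sims


# The dredging loop: keep inventing null hypotheses until the fixed
# dataset "rejects" one at p < 0.05, then stop and "publish".
while True:
    params = tuple(round(random.uniform(0.05, 0.95), 2) for _ in range(3))
    p = p_value(flips, params)
    if p < 0.05:
        print(f"Rejected null {params} with p = {p:.3f}")
        break
```

The particular test doesn’t matter here; what matters is the shape of the loop: the dataset stays fixed while the hypotheses keep changing until one gets “rejected.”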
Now you might object: Albert, these are completely arbitrary hypotheses, why would anyone believe them? Well, they only come across as arbitrary because I purposely stayed within the domain of coin flips. But most big datasets are really complex, containing many different variables. Just take the coin flip database and combine it with a database of stock price fluctuations. Now you can test tons of different hypotheses of the form: price movements for stock x are not correlated with the coin flips (where H might mean the price of stock x moves up and T that it moves down).
Again, you can have your computer generate these hypotheses for you and test them until you find one you can reject with a p-value that’s deemed significant. These hypotheses are just as arbitrary as the coin state machines I suggested above, but they don’t look that way. They look really simple and thus credible.
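Here is an equally hedged sketch of the stock version. The coin flips and all of the “stocks” are simulated independently (the ticker names are made up and I am leaning on SciPy’s pearsonr for the test), so every null hypothesis of “no correlation” is true by construction. At a 0.05 threshold, roughly 5% of the tests come out “significant” anyway, and those are exactly the ones a dredger would write up.

```python
import random

from scipy.stats import pearsonr  # assumes SciPy is installed

random.seed(1)
n_days = 250
# H = +1, T = -1 for the coin; each "stock" is an independent series of up/down moves.
coin = [1 if random.random() < 0.5 else -1 for _ in range(n_days)]

hits = []
for i in range(1000):  # 1000 made-up tickers, all generated independently of the coin
    moves = [1 if random.random() < 0.5 else -1 for _ in range(n_days)]
    r, p = pearsonr(coin, moves)
    if p < 0.05:
        hits.append((f"STOCK_{i:04d}", round(r, 3), round(p, 4)))

# Every null of "no correlation" is true by construction, yet about 5% of the
# tests still come out "significant" -- those are the ones that get reported.
print(f"{len(hits)} of 1000 independent stocks look correlated at p < 0.05")
print(hits[:5])
```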
But this approach completely violates the statistical reasoning behind p-values. That reasoning only applies if you start with the hypothesis and then apply the test. In any large dataset you will always be able to work backwards towards hypotheses that can be rejected *in that dataset*. Just recall the prior posts about spurious correlation.
OK, so that’s pretty bad given that so many people have incentives to find hypotheses they can reject so that they can publish a paper or claim that a product is effective. But next Wednesday we will look into an even more profound problem with p-values.