Uncertainty Wednesday: PSA Test Example

After a two-week hiatus for travel, here is another Uncertainty Wednesday. After going through probability I had promised a concrete example. As a 50-year-old male, one that is close to home for me is prostate cancer. The system in question is the prostate and, simplistically, it can be in two states: (A) healthy and (B) with cancer. I say simplistically because prostate cancer varies in severity, both in how aggressive it is and in whether or not the cancer is still contained in the prostate.

This is an example in which the state could in principle be determined directly, but only at a prohibitively high cost: you could biopsy the prostate in detail, but that’s of course massively invasive. So instead we rely on a signal in the form of the Prostate-Specific Antigen (PSA) level. Again simplifying somewhat, the PSA signal can be either low (L) or high (H). PSA exists in the male body to help sperm flow freely and is generally contained inside the prostate. In case of cancer, though, more PSA can be released into the blood serum, where it can be detected. So this fits our setup of two states and two signal values pretty well.

What then are the probabilities involved here? As you recall, we need four of them, one for each elementary event, such as P({AH}), the probability that you are cancer free (state A) and yet the PSA signal level is high (H). I have derived the probabilities from information about the test as well as information about the incidence of prostate cancer for a 50-year-old. I will explain in a subsequent post how I did that.

P({AL}) = P(healthy *and* low PSA) = 0.907179
P({AH}) = P(healthy *and* high PSA) = 0.089721
P({BL}) = P(cancer *and* low PSA) = 0.001519
P({BH}) = P(cancer *and* high PSA) = 0.001581
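
Since these four elementary events cover every possible combination of state and signal, their probabilities should add up to 1. Here is a minimal sketch in Python that checks this. The dictionary name `joint` and the tuple keys are just one way I chose to write the values down for illustration; they are not part of any library.

```python
import math

# The four elementary probabilities for a 50-year-old, copied from the list above.
# Each key pairs a state (A = healthy, B = cancer) with a signal (L = low, H = high PSA).
joint = {
    ("A", "L"): 0.907179,
    ("A", "H"): 0.089721,
    ("B", "L"): 0.001519,
    ("B", "H"): 0.001581,
}

total = sum(joint.values())

# Compare with a tolerance, because floating-point addition is not exact.
assert math.isclose(total, 1.0), total
print(f"sum of elementary probabilities: {total:.6f}")
```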

Now that we have these values we can ask the all-important question: if my doctor finds on my next visit that I have a high level of PSA, how likely is it that I actually have cancer?

Let’s first introduce a bit of notation that captures this question. What we are looking for is P(B | H), which is read as the probability of state B *conditional* on signal H being observed. In this specific case, that is the probability that I have cancer (B) given that my PSA level has been observed to be high (H).

Before we derive this, a quick reminder of something we have already learned: we can simply add up the probabilities of elementary events. For instance,

P(B) = P(cancer) = P({BH, BL}) = P({BH}) + P({BL}) = 0.001581 + 0.001519 = 0.0031

As a 50-year-old I have a 0.0031 (or 0.31%) probability of having prostate cancer. This corresponds to the value found in row 2 of the CDC table on prostate cancer risk (50 years old is the same as 10 years after age 40 …)

Similarly, we can figure out the probability of observing a high PSA level.

P(H) = P(high PSA level) = P({AH, BH}) = P({AH}) + P({BH}) = 0.089721 + 0.001581 = 0.091302

So there is a probability of 0.091302, or 9.13%, of observing a high PSA level: the signal comes back high nearly 1 out of 10 times.
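
Both sums follow the same recipe: pick out the elementary events that share a state or a signal value and add up their probabilities. A small Python sketch of that recipe follows. The names `joint` and `marginal` are my own, chosen for illustration, and the numbers are the four elementary probabilities given earlier.

```python
# The four elementary probabilities, keyed by (state, signal).
# A = healthy, B = cancer; L = low PSA, H = high PSA.
joint = {
    ("A", "L"): 0.907179,
    ("A", "H"): 0.089721,
    ("B", "L"): 0.001519,
    ("B", "H"): 0.001581,
}


def marginal(joint, position, value):
    """Add up the elementary events whose state (position 0)
    or signal (position 1) equals the given value."""
    return sum(p for event, p in joint.items() if event[position] == value)


p_cancer = marginal(joint, 0, "B")  # P(B) = P({BH}) + P({BL})
p_high = marginal(joint, 1, "H")    # P(H) = P({AH}) + P({BH})

print(f"P(B) = {p_cancer:.6f}")
print(f"P(H) = {p_high:.6f}")
```

The same function gives P(A) and P(L) by changing the arguments, which is a handy cross-check since P(A) + P(B) and P(L) + P(H) should each come to 1.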

We are beginning to see that there may be an interesting issue here. There is only a 0.31% probability that I have cancer, but there is a 9.13% probability that we observe a high PSA signal. So now back to that all-important question: how likely is it that I have cancer *given* that a high PSA was observed?

How can we answer this question from the elementary probabilities? This will be the subject of next week’s post. You should try to find the answer for yourself in the meantime. Please feel free to post it as a comment.
