Evaluating Diagnostic Accuracy of Prostate Cancer Using Bayesian Analysis
Part of an Undergraduate Research course
Chantal D. Larose
Three Ingredients to the Analysis
ROC Curves
Bayesian Analysis
• Prostate cancer is the most
common non-skin cancer in
▫ The risk of being diagnosed
with prostate cancer increases
with age. While 1 in 10,000
men under 40 years of age is
diagnosed, the rate increases to
1 in 15 in men over 60.
[Figure: normal prostate compared with prostate cancer]
• America’s population is aging
▫ In 2000, persons over the age of 65 made
up 12.4% of the country’s population. The
proportion is expected to increase. That
means an increasing proportion of the
population is at risk for prostate cancer.
• Accurate testing could help thousands of
Three Ingredients to the Analysis
• Test - A simple, non-invasive procedure
• Blood test to measure Prostate-Specific Antigen (PSA)
• Gold standard - Can be a complex, expensive,
• Determines the presence or absence of prostate cancer
• Our gold standard is a biopsy
• Covariate - Additional information
• May help us increase the accuracy of our prediction.
• Our covariate is patient age.
There are two main questions:
1. How good is the PSA test alone at
predicting the presence of prostate cancer?
2. How good is the PSA test at predicting
prostate cancer when combined with
information about a patient’s age?
ROC Curves
• Receiver Operating Characteristic (ROC)
curves plot ‘True Positive Rate’ versus
‘False Positive Rate’
▫ True Positive Rate: probability that a
patient with cancer gets a positive test result
▫ False Positive Rate: probability that a
patient without cancer gets a positive test result
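As a minimal sketch, both rates can be computed from a 2x2 table of test result against biopsy result at one PSA cutoff. The true positive rate is taken here as the share of biopsy-positive patients who test positive, and the false positive rate as the share of biopsy-negative patients who test positive. The counts below are made up for illustration and are not from the study.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from 2x2 counts.

    tp: biopsy positive, test positive    fn: biopsy positive, test negative
    fp: biopsy negative, test positive    tn: biopsy negative, test negative
    """
    tpr = tp / (tp + fn)  # share of patients with cancer who test positive
    fpr = fp / (fp + tn)  # share of patients without cancer who test positive
    return tpr, fpr


# Hypothetical counts at a single PSA cutoff (illustrative only).
tpr, fpr = rates(tp=40, fn=10, fp=30, tn=70)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Each choice of cutoff gives one (FPR, TPR) pair, which is one point on the ROC curve.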
ROC Curves
• The area under the ROC
curve (AUC) serves as a
measure of overall accuracy.
• If the AUC equals…
▫ 1, the test is perfect every time
▫ 0.5, it is as good as a coin flip
▫ 0, the diagnosis is the opposite
of the test result.
• We expect values between
0.5 and 1.
ROC curve and its AUC
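The sketch below shows one non-parametric way to trace an ROC curve and estimate its AUC: sweep the cutoff over the observed test values and integrate with the trapezoid rule. It is an illustration only, not the Bayesian method used in this project, and the PSA values and biopsy labels are invented.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping a cutoff over the scores.

    A patient is called test-positive when score >= cutoff.
    labels: 1 = disease present (gold standard), 0 = disease absent.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # cutoff above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear curve through the points (trapezoid rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented PSA values and biopsy results (illustrative only).
psa = [1.2, 2.5, 3.1, 4.0, 4.4, 6.8, 7.5, 10.2]
biopsy = [0, 0, 1, 0, 0, 1, 1, 1]
print(auc(roc_points(psa, biopsy)))
```

For these eight invented patients the estimate is 0.875, which matches the share of (cancer, no-cancer) patient pairs in which the cancer patient has the higher PSA value.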
Bayesian Analysis: Three Key Parts
• The Prior Distribution
▫ Knowledge about the target parameter
before looking at the data.
▫ One of two general types:
• Informative prior: Holds information we know
or suspect to be true, based on previous experiments
or expert knowledge.
• Noninformative prior: Holds very little
information. Best when we do not have previous
experience or expert knowledge.
Bayesian Analysis: Three Key Parts
• The Data
▫ The data used in Bayesian analysis is
represented with a likelihood.
▫ The likelihood is combined with the prior
to form the posterior distribution.
Bayesian Analysis: Three Key Parts
• The Posterior Distribution
▫ After combining the prior information with
the data, represented by the likelihood, the
result is the posterior distribution.
▫ The posterior represents a natural updating
of the prior knowledge based on the
information from the data.
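In symbols, the three parts combine through Bayes' theorem. The notation is assumed here for illustration and does not come from the slides: θ is the target parameter, y is the observed data, p(θ) is the prior, p(y | θ) is the likelihood, and p(θ | y) is the posterior.

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
  \;\propto\; p(y \mid \theta)\, p(\theta)
```

The denominator does not depend on θ, so the shape of the posterior is set by the product of likelihood and prior.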
Bayesian vs. Frequentist Analysis
• Bayesian analysis allows us to update our
initial distribution assumption to account
for the observed data.
• The analysis also provides us the flexibility
of directly analyzing the entire posterior
distribution of the target parameter.
▫ Thus, we can look at the mean, median, and
other statistics that are useful to the
questions we want to answer.
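Once draws from the posterior are available, summaries such as the mean, median, and an interval can be read directly off the draws. The sketch below assumes a hypothetical Beta(8, 4) posterior for an accuracy-type parameter, chosen only because it is easy to sample; it is not the posterior from this analysis.

```python
import random
from statistics import mean, median


def summarize_posterior(draws, level=0.95):
    """Summarize posterior draws: mean, median, and an equal-tailed interval."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * n)]
    upper = ordered[min(n - 1, int((1.0 - tail) * n))]
    return {
        "mean": mean(ordered),
        "median": median(ordered),
        "lower": lower,
        "upper": upper,
    }


# Hypothetical posterior: Beta(8, 4). The seed is fixed for reproducibility.
rng = random.Random(0)
draws = [rng.betavariate(8, 4) for _ in range(20000)]
summary = summarize_posterior(draws)
print(summary)
```

The same function works on draws from any sampler, since it only needs a list of numbers.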
Bayesian vs. Frequentist Analysis
One key difference between Frequentist and
Bayesian methods is how population parameters
are treated.
• Frequentist Approach: considers population
parameters as fixed, unknown constants. It is
assumed that all the randomness lies in the data.
• Bayesian Approach: takes the opposite view. The
parameters are considered random variables which
have their own distribution of possible values. The
data is the known information. All resulting error
comes from the distribution
Bayesian vs. Frequentist Analysis
• When is Bayesian analysis more
▫ When a parameter of a distribution is itself
a random variable, or when expert
knowledge is available
▫ Works well with small data sets, where
maximum likelihood methods are not
Bayesian vs. Frequentist Analysis
• Likely to have an idea about
the prior distribution.
• If expert knowledge is not
available, we may use a noninformative prior.
• Flexibility in choosing prior
• Works well with small data
sets, where maximum
likelihood methods are not
• The Two Bayesian Problem:
different priors produce
different posteriors.
• Addressed by noninformative priors.
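The effect of the prior can be illustrated with a minimal sketch using the conjugate Beta-Binomial model, chosen here only because its posterior has a closed form. The two priors and the data counts are hypothetical. With little data the two posteriors disagree noticeably; with more data they move closer together.

```python
def beta_binomial_posterior(prior_a, prior_b, successes, trials):
    """Beta(prior_a, prior_b) prior + binomial data -> Beta posterior parameters."""
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    return post_a, post_b


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


FLAT_PRIOR = (1.0, 1.0)          # noninformative: uniform on [0, 1]
INFORMATIVE_PRIOR = (20.0, 5.0)  # hypothetical strong belief centred near 0.8


def posterior_gap(successes, trials):
    """Absolute difference between the posterior means under the two priors."""
    m_flat = beta_mean(*beta_binomial_posterior(*FLAT_PRIOR, successes, trials))
    m_info = beta_mean(*beta_binomial_posterior(*INFORMATIVE_PRIOR, successes, trials))
    return abs(m_flat - m_info)


small_gap = posterior_gap(3, 10)      # small data set: the priors dominate
large_gap = posterior_gap(300, 1000)  # large data set: the data dominate
print(small_gap, large_gap)
```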
Forming an ROC Curve
• A parametric approach,
using binormal distribution
• The curve is a function of
parameters a and b.
▫ The values of a and b are
also functions of two
parameters, Beta and
▫ These values are determined
by a prior assumption and
information from the data.
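As a sketch of how a and b shape the curve, the code below uses the standard binormal form ROC(t) = Φ(a + b·Φ⁻¹(t)), with AUC = Φ(a / √(1 + b²)), where Φ is the standard normal CDF. The slides do not spell out how a and b are defined, so this common parametrisation is an assumption, and the values of a and b are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def binormal_roc(fpr, a, b):
    """True positive rate at a given false positive rate under the binormal model.

    Assumes b > 0, so the curve runs from (0, 0) to (1, 1).
    """
    if fpr <= 0.0:
        return 0.0
    if fpr >= 1.0:
        return 1.0
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpr))


def binormal_auc(a, b):
    """Closed-form area under the binormal ROC curve."""
    return STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


# Arbitrary illustrative parameter values.
a, b = 1.5, 1.0
curve = [(t / 10, binormal_roc(t / 10, a, b)) for t in range(11)]
print(curve)
print(binormal_auc(a, b))
```

With a = 0 and b = 1 the curve is the diagonal and the AUC is 0.5. In this parametrisation any b other than 1 makes the curve cross the diagonal at some point, which may be worth keeping in mind when a fitted curve dips below the 50% line.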
The trickle-down effect
Forming an ROC Curve
• In other words, there is a
trickle-down effect from the
data and first assumptions to
the ROC curve.
• To calculate this trickle-down
effect, we use Bayesian
statistical analysis.
The trickle-down effect
• Our analysis starts with two
noninformative priors
▫ The majority of the information in the
posterior will come from the data.
• The data is organized into three groups:
▫ All patients
▫ Younger patients only
▫ Older patients only
• Each group undergoes two analyses:
▫ one uses only test data to predict the diagnosis,
▫ one uses test and age data combined.
• Each analysis produces an ROC curve. All
together, six curves are calculated.
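The design of three groups by two analyses can be laid out as a small sketch. The age cutoff of 60 and the field names psa, age, and biopsy are hypothetical; the slides do not state how younger and older patients were separated.

```python
from itertools import product

AGE_CUTOFF = 60  # hypothetical cutoff, not taken from the study

# The two analyses: test data only, or test data combined with age.
PREDICTOR_SETS = {
    "psa_only": ("psa",),
    "psa_and_age": ("psa", "age"),
}


def split_groups(patients, cutoff=AGE_CUTOFF):
    """Organize patient records into the three groups."""
    return {
        "all": list(patients),
        "younger": [p for p in patients if p["age"] < cutoff],
        "older": [p for p in patients if p["age"] >= cutoff],
    }


def analysis_plan(patients):
    """Map each (group, analysis) pair to its data and predictors: six entries."""
    groups = split_groups(patients)
    return {
        (group, analysis): (groups[group], PREDICTOR_SETS[analysis])
        for group, analysis in product(groups, PREDICTOR_SETS)
    }


# Invented records (illustrative only).
patients = [
    {"psa": 2.1, "age": 48, "biopsy": 0},
    {"psa": 6.3, "age": 55, "biopsy": 1},
    {"psa": 4.8, "age": 67, "biopsy": 0},
    {"psa": 9.9, "age": 72, "biopsy": 1},
]
plan = analysis_plan(patients)
print(sorted(plan))
```

Each of the six entries would then be passed to the ROC-fitting step to produce one curve.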
• Each ROC curve had a section which fell below
the 50% accuracy line.
• There was very little difference between pairs of
ROC curves for a single age group.
• Differences in curves between older and younger
patients were more pronounced.
▫ This makes sense. Since younger patients have
naturally low PSA levels, it is easier to detect high
levels, and elevated levels are more likely due to
• Our progress so far has brought more questions.
These include:
• How would we use the Bayesian technique with
an incomplete dataset (i.e., missing information
for certain patients)?
• Why does each ROC curve dip below the 50%
accuracy line?
• What effects do other covariates, such as ethnic
background, have on the ROC curve?
