Uncovering Age-Specific Invasive and DCIS Breast Cancer Rules Using Inductive Logic Programming

Houssam Nassif, David Page, Mehmet Ayvaci, Jude Shavlik, Elizabeth S. Burnside
The American Cancer Society, Cancer Facts & Figures 2009.
[Figure: screening workflow. Mammogram, Radiologist, Radiology Report. Abnormal finding? No: Routine screening. Yes: Biopsy, with outcome Benign or Malignant (Cancer: Invasive or In Situ).]
Biopsy
• Biopsy:
– Costly
– Invasive
– Potentially painful
• Models based on mammography report and
personal data help identify pre-biopsy cancer
stage.
Cancer Stages: In Situ
• Cancer cells localized
• Did not spread
[Figure labels: basement membrane, abnormal cells]
Cancer Stages: Invasive
• Cancer cells break through basement membrane
• Invade surrounding tissue
[Figure labels: basement membrane, abnormal cells]
Treatment
• In Situ:
– Can develop into Invasive
– Excellent prognosis, less intensive treatment
=> Treat it
• “Overdiagnosis” (unnecessary treatment)
• Time to spread may be long
– Patient may die of other causes
Problem Constraints
• Identify patient subgroups that would benefit
most from treatment
• Use biopsy alternatives (like follow-up)
• Help patients make informed decisions,
personalized medicine
Task Formulation
• Given:
– Radiology reports
– No biopsy
• Do:
– Identify patient subgroups
– Specify Invasive/In Situ probabilities
Data
• Match mammograms to biopsies
• 1063 Invasive, 412 In Situ cases
Radiology Report
Biopsy
- Personal/family
history
- BIRADS code
- Palpable lump
- Mass specs
- Calcifications
...
- Date
- Breast side
- Cancer stage
...
Acronyms
• BI-RADS code: Breast Imaging Reporting and
Data System
– A number (0-6) summarizing the radiologist
opinion and findings concerning a mammogram.
– In increasing probability of malignancy:
1<2<3<0<4<5<6
• DCIS: Ductal Carcinoma In Situ
– One type of In Situ breast cancer
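Because code 0 sits between 3 and 4 in the malignancy ordering, BI-RADS codes cannot be compared as plain integers. A minimal sketch of an ordering helper follows; the names `BIRADS_ORDER`, `birads_rank`, and `more_suspicious` are illustrative and do not come from the slides.

```python
# Malignancy ordering as stated on the slide: 1 < 2 < 3 < 0 < 4 < 5 < 6.
BIRADS_ORDER = [1, 2, 3, 0, 4, 5, 6]
BIRADS_RANK = {code: rank for rank, code in enumerate(BIRADS_ORDER)}


def birads_rank(code: int) -> int:
    """Return the position of a BI-RADS code in the malignancy ordering."""
    return BIRADS_RANK[code]


def more_suspicious(a: int, b: int) -> bool:
    """True if code `a` indicates a higher probability of malignancy than `b`."""
    return birads_rank(a) > birads_rank(b)


# Sort a few codes from least to most suspicious.
sorted_codes = sorted([0, 5, 2, 4], key=birads_rank)
```

Sorting by `birads_rank` rather than by the raw code keeps code 0 in its stated place between 3 and 4.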
Age Matters
• Apply Logistic Regression
– Different attributes predict cancer stages in
different age groups
• Stratify data (~menopausal status):
– Older cohort (age >= 65) (post-)
– Middle cohort (50 <= age < 65) (peri-)
– Younger cohort (age < 50) (pre-)
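The stratification above can be sketched as a small function. The cut-offs are the ones on the slide; the function name and the lowercase cohort labels are illustrative choices.

```python
def age_cohort(age: int) -> str:
    """Assign a patient to an age cohort using the slide's cut-offs (50 and 65)."""
    if age >= 65:
        return "older"    # roughly post-menopausal
    if age >= 50:
        return "middle"   # roughly peri-menopausal
    return "younger"      # roughly pre-menopausal
```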
Age-Specific Attributes
• Find accurate age-specific attributes
• Inductive Logic Programming (ILP) confers
added benefits beyond Logistic Regression:
– Human comprehensible rules
– Specific data subsets
Inductive Logic Programming
• Machine learning approach
• White-box classifier
• Constructs if-then rules
• Allows user interaction using background knowledge
• Operates on relational datasets
Example
Record  Patient  Date     BIRADS    Patient  Date     Stage
10      100      08/2010  5         100      09/2010  Invasive
11      100      02/2008  3         100      03/2008  Benign
12      200      06/2009  4         200      07/2009  In Situ
• Assign mammograms to biopsies
• Discard:
– Record 11 since benign
– Stages since target of prediction
• Non-relational learner extracts:
– BIRADS(10,100,5)
– BIRADS(12,200,4)
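The preprocessing in this example can be sketched as follows. The slides do not say how a mammogram is assigned to a biopsy, so the rule used here (same patient, earliest biopsy within three months after the mammogram) and all names are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Mammogram(NamedTuple):
    record: int
    patient: int
    date: tuple   # (year, month)
    birads: int


class Biopsy(NamedTuple):
    patient: int
    date: tuple   # (year, month)
    stage: str


mammograms = [
    Mammogram(10, 100, (2010, 8), 5),
    Mammogram(11, 100, (2008, 2), 3),
    Mammogram(12, 200, (2009, 6), 4),
]
biopsies = [
    Biopsy(100, (2010, 9), "Invasive"),
    Biopsy(100, (2008, 3), "Benign"),
    Biopsy(200, (2009, 7), "In Situ"),
]


def months_between(earlier: tuple, later: tuple) -> int:
    return (later[0] - earlier[0]) * 12 + (later[1] - earlier[1])


def match_biopsy(m: Mammogram, candidates, max_months: int = 3) -> Optional[Biopsy]:
    """Assumed matching rule: earliest biopsy of the same patient that
    follows the mammogram by at most `max_months` months."""
    hits = [
        b for b in candidates
        if b.patient == m.patient and 0 <= months_between(m.date, b.date) <= max_months
    ]
    return min(hits, key=lambda b: b.date) if hits else None


examples = []
for m in mammograms:
    b = match_biopsy(m, biopsies)
    if b is None or b.stage == "Benign":
        continue  # benign (or unmatched) records are discarded
    # The stage is the prediction target, so it is kept as the label only.
    examples.append(((m.record, m.patient, m.birads), b.stage))
```

Each surviving feature tuple corresponds to a fact of the form BIRADS(record, patient, code), with the stage held out as the label.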
ILP Predicate Invention
• Link patient records, e.g.:
– Old study (id, old id)
– Old biopsy (id, old id, result)
– Access old study/biopsy attributes
• Compare attributes, e.g.:
– Mass size decrease (id, old id)
– This-side breast BIRADS code increase (id, old id)
Example Cont’d
Record  Patient  Date     BIRADS    Patient  Date     Stage
10      100      08/2010  5         100      09/2010  Invasive
11      100      02/2008  3         100      03/2008  Benign
12      200      06/2009  4         200      07/2009  In Situ
• Link records: OldStudy(10,11)
• Access previous study predicates:
– BIRADS(11,100,3)
– OldBiopsy(10,11,Benign)
• Compare predicates:
– BIRADSincrease(10,11,3)
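A minimal sketch of how such linking and comparison predicates could be derived from the three example records. The slides do not spell out the meaning of the third argument of BIRADSincrease, so `birads_increase` below simply returns the numeric difference between the two codes; that choice, the plain integer subtraction (which ignores the special position of code 0), and all function names are assumptions.

```python
from typing import NamedTuple, Optional


class Study(NamedTuple):
    record: int
    patient: int
    date: tuple   # (year, month)
    birads: int


STUDIES = {
    10: Study(10, 100, (2010, 8), 5),
    11: Study(11, 100, (2008, 2), 3),
    12: Study(12, 200, (2009, 6), 4),
}
# Biopsy outcome already assigned to each mammogram record.
BIOPSY_RESULT = {10: "Invasive", 11: "Benign", 12: "In Situ"}


def old_study_pairs(studies):
    """Return (id, old_id) pairs: same patient, old study strictly earlier."""
    return [
        (new.record, old.record)
        for new in studies.values()
        for old in studies.values()
        if new.patient == old.patient and old.date < new.date
    ]


def old_biopsy(new_id, old_id, results):
    """Expose the earlier study's biopsy result to the current record."""
    return (new_id, old_id, results[old_id])


def birads_increase(new_id, old_id, studies) -> Optional[int]:
    """Numeric BI-RADS increase from the old study to the new one, or None."""
    diff = studies[new_id].birads - studies[old_id].birads
    return diff if diff > 0 else None


pairs = old_study_pairs(STUDIES)
```

Patient 200 has a single record, so only record 10 gains access to an earlier study.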
Methodology
[Diagram: Older cohort reports and Younger cohort reports feed an ILP classifier with differential prediction, which produces Invasive vs. In Situ rules and, from these, Older-specific Invasive/In Situ rules.]
Differential Prediction
• Limit to Older and Younger:
– Maximize age and attribute difference
– Leave out the peri-menopausal (Middle) cohort
• Define Invasive rules in Older:
– Good Invasive prediction on older
• Precision > 60%, Recall > 10%
– And significantly worse prediction on younger
• Precision difference p-value < 0.05
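The acceptance criterion above can be sketched as a filter over candidate rules. The slides give the thresholds but not the statistical test, so the one-sided pooled two-proportion z-test used here is an assumption, as are the function names and the count-dictionary layout (`tp`, `fp`, `fn`).

```python
import math


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if tp + fn else 0.0


def precision_drop_p_value(tp1: int, n1: int, tp2: int, n2: int) -> float:
    """One-sided p-value for precision 1 > precision 2.

    Assumed test: pooled two-proportion z-test (not stated in the slides).
    n1 and n2 are the numbers of examples the rule covers in each cohort.
    """
    pooled = (tp1 + tp2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (tp1 / n1 - tp2 / n2) / se
    return 0.5 * math.erfc(z / math.sqrt(2))


def is_differential_rule(target, other, min_precision=0.60, min_recall=0.10, alpha=0.05):
    """Good prediction on the target cohort, significantly worse on the other."""
    n_t = target["tp"] + target["fp"]
    n_o = other["tp"] + other["fp"]
    if n_t == 0 or n_o == 0:
        return False
    if precision(target["tp"], target["fp"]) <= min_precision:
        return False
    if recall(target["tp"], target["fn"]) <= min_recall:
        return False
    return precision_drop_p_value(target["tp"], n_t, other["tp"], n_o) < alpha


# Hypothetical counts, for illustration only (not taken from the study).
older = {"tp": 90, "fp": 5, "fn": 100}
younger = {"tp": 40, "fp": 20, "fn": 30}
accepted = is_differential_rule(older, younger)
```

Swapping the roles of the two cohorts gives the corresponding filter for Younger-specific rules.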
Invasive Rules in Older
1. The mammogram has a palpable lump in this-side breast.
2. The mammogram's indication for exam is
“palpable lump”.
3. The mammogram's indication for exam is
“palpable lump”,
and its other-side BI-RADS < 3,
and its mass margin is not reported.
Palpable Lump
• Higher occurrence in Younger
• Tendency in younger:
– Rapid proliferation
– Poor differentiation
– In Situ thus more likely to be palpable
• Tendency in older:
– Slow growth
– When big enough to be palpable, almost certainly
Invasive
Invasive Rules in Older Cont'd
1. The mammogram has an old-biopsy that was
invasive.
2. The mammogram has an old-biopsy that was
invasive,
and the biopsy happened within the same
age group.
• Due to:
– Longer life-span of older women
– Higher recurrence of invasive tumors
In Situ Rules in Younger
1. The mammogram has a personal history of
cancer in this-side breast,
and this-side breast has a prior surgery,
and its combined BI-RADS increased by at
least 2 points compared to a previous study.
Recurrence
• A recurrence is a better predictor of In Situ in
younger
• Contrast with previous rules, where invasive
tumor recurrence is a better predictor of
Invasive in older
Other Rules
• No rules met our criteria for:
– In Situ in Older
– Invasive in Younger
• Middle cohort behavior:
– 2 rules like Older
– 2 rules like Younger
– 2 rules neither
Probabilities
Rules               Precision  Precision  Recall  Recall
                    Older      Younger    Older   Younger
Palpable 1          94%        87%        42%     65%
Palpable 2          95%        86%        35%     62%
Palpable 3          98%        87%        19%     41%
Invasive Biopsy 1   97%        86%        50%     18%
Invasive Biopsy 2   100%       86%        44%     18%
Younger Recurrence  8%         67%        2%      11%
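As a consistency check, the table values can be compared against the precision and recall thresholds stated for differential prediction (precision > 60%, recall > 10% in the target cohort). The sketch below transcribes the percentages from the table; the target cohort of each rule follows the slide it was introduced on, and the function names are illustrative. The significance of the precision difference cannot be recomputed here because the table reports no counts.

```python
# (rule, target cohort, precision older, precision younger, recall older, recall younger)
# Percentages transcribed from the Probabilities table.
TABLE = [
    ("Palpable 1", "older", 94, 87, 42, 65),
    ("Palpable 2", "older", 95, 86, 35, 62),
    ("Palpable 3", "older", 98, 87, 19, 41),
    ("Invasive Biopsy 1", "older", 97, 86, 50, 18),
    ("Invasive Biopsy 2", "older", 100, 86, 44, 18),
    ("Younger Recurrence", "younger", 8, 67, 2, 11),
]


def meets_thresholds(row, min_precision=60, min_recall=10):
    """Check precision and recall in the rule's target cohort."""
    _, target, p_old, p_young, r_old, r_young = row
    p, r = (p_old, r_old) if target == "older" else (p_young, r_young)
    return p > min_precision and r > min_recall


def precision_gap(row):
    """Precision in the target cohort minus precision in the other cohort."""
    _, target, p_old, p_young, _, _ = row
    return p_old - p_young if target == "older" else p_young - p_old


checks = {row[0]: meets_thresholds(row) for row in TABLE}
gaps = {row[0]: precision_gap(row) for row in TABLE}
```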
Problem Solutions
• Identify patient subgroups that would benefit
most from treatment
=> Rule coverage
• Use biopsy alternatives (like follow-up)
=> Pre-biopsy mammography report
• Help patients make informed decisions,
personalized medicine
=> Assigning probabilities
Conclusion
• First differential predictive rules extraction
method and application
• Personalized age-specific prediction
• New insight on:
– Palpable lump
– Recurrence
