NCI Informatics 2014
Warren Kibbe
[email protected]
The views expressed are my own and not a reflection of
DHHS, NIH or NCI policy
Some history
• Back to the dawn of time…my first BRIITE
And some very emotional imagery
• November 16 and 17th
Fast forward to the end of 2011
• BSA informatics working group assembled
in 2011
• BSA IWG report of 2011
• Ken steps down under enormous pressure
and criticism in December 2011
• George Komatsoulis appointed acting
• CBIIT is pounded by waves of uncertainty
CBIIT, 2013
71 Federal staff
Serving 6500 NCI staff across 18 buildings
5 petabytes of NCI data
2.5 petabytes of TCGA data
2- 5 MW new data centers
Just completed a rollout of Unified
Communications to 5000 NCI staff
– 1.5 FTEs, now on loan to NIH CIT to deliver
UC to 45000 desktops
DHHS Requirements
FISMA Moderate
Complete move to IPv6 by Oct 2014
Data center consolidation
Two factor authentication
Only government furnished equipment
(‘GFE’) may connect to the network from
outside (limits on VPN)
• Compensating controls…
• Tiered network, appropriate traffic monitoring
and scanning
NCI General strategic objectives
• Reduce cancer risk – public health
• Improve cancer outcomes – better
treatment and survivorship
• Educate providers and population
• Provide informative data and powerful
NCI CBIIT Guiding Principles
• Supporting the mission of the NCI
• Lowering barriers for the cancer
• Promote the importance of informatics in
solving problems in public health,
healthcare, precision oncology, and basic
• Build communities around problems
• Aggregate and disseminate knowledge
Using computing technology to reduce the incidence,
suffering and mortality due to cancer
Highlights from the November
National Cancer Policy Forum
My outline
Disruptive technologies
Getting social
What is big data?
Open access to data
Disruptive Technologies
Steam power
Semiconductors & VLSI
• http
• Systems view - end of reductionism?
• High throughput biology
Disruptive Technologies
Steam power
Semiconductors & VLSI design
• http
• High throughput biology
6.6B active mobile contracts
1.9B smart phone contracts
1.1B land lines
World population 7.1B
US:
345M active mobile contracts
287M smart phone contracts
US population 313M
Everyone is a data provider
Ubiquitous computing
Data immersion
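To put the mobile contract figures above in proportion, the counts can be divided by the matching population. This is a minimal sketch using only the numbers quoted on the slide; the variable and function names are illustrative, and the US figure of 287M is read here as smart phone contracts.

```python
# Figures copied from the slide; nothing here is measured independently.
world = {"mobile_contracts": 6.6e9, "smartphone_contracts": 1.9e9, "population": 7.1e9}
# Assumption: the slide's "287M smart phone" refers to smart phone contracts.
us = {"mobile_contracts": 345e6, "smartphone_contracts": 287e6, "population": 313e6}


def per_capita(region):
    """Return contracts per person for each contract type in a region."""
    population = region["population"]
    return {kind: count / population
            for kind, count in region.items() if kind != "population"}


world_rates = per_capita(world)
us_rates = per_capita(us)

for name, rates in (("World", world_rates), ("US", us_rates)):
    for kind, rate in rates.items():
        print(f"{name} {kind}: {rate:.2f} per person")
```

On these figures the US has more active mobile contracts than people, which is one way to read "everyone is a data provider".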
Getting Social
• Measuring behavior across a population
• Understanding behavior – can we provide
better risk estimates for individuals?
• Social media is a big data opportunity –
what are the ethics of big data?
• Synergize with the energy and immediacy
of patient advocates
• Patients want more data sharing – how
can we facilitate that appropriately?
This changes trial design. Until now, statistics has focused on how to
design an appropriate sample so that results can be generalized to the
population. What happens when we measure the ENTIRE population?
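One way to make the sample-versus-population question concrete is the textbook standard error of a sample mean with the finite population correction. The sketch below assumes simple random sampling without replacement; the function name and the example numbers are illustrative and are not taken from the talk. It covers sampling error only, not measurement error or bias in who gets measured.

```python
import math


def standard_error(sample_sd, n, population_size):
    """Standard error of a sample mean under simple random sampling
    without replacement, including the finite population correction."""
    if not 1 <= n <= population_size:
        raise ValueError("n must be between 1 and population_size")
    if population_size == 1:
        return 0.0
    # The correction shrinks to zero as the sample approaches the population.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return sample_sd / math.sqrt(n) * fpc


N = 1_000_000  # hypothetical population size
for n in (100, 10_000, 500_000, N):
    print(n, standard_error(10.0, n, N))
```

When n equals the population size the correction term is zero, so the sampling error vanishes and the remaining uncertainty comes from data quality rather than from sample design.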
The future
• Elastic computing ‘clouds’
• Social networks
• Big Data analytics
• Precision medicine
• Measuring health
• Practicing protective medicine
Semantic and
synoptic data
before health is
Learning systems that enable
learning from every cancer patient
Open Data Access
• We need to provide data access to people
outside of biomedicine who have the skills
and training to mine and analyze data
• More access will mean more innovation
Precision Oncology
• The era of precision medicine and precision
oncology is predicated on the integration of
research, care, and molecular medicine and
the availability of data for modeling, risk
analysis, and optimal care
How do we re-engineer
translational research policies
that will enable a true learning
healthcare system?
• In a learning healthcare system, we ‘learn’
from every patient who comes in for
treatment. What is consent in this model?
What is research?
• What role is there for standardized consent?
• Are there ways to reimagine translational
research without consent? Would that help
CBIIT's mission – the long form
• CBIIT will help the cancer community
coordinate, aggregate, disseminate,
promote cancer awareness, public health
data, cancer risk reduction, novel
treatments, quality of life and comparative
effectiveness data, and basic and
translational research outcomes
CBIIT strategic activities
• Promote social media as a mechanism for
communication, education, and improving
lifestyle choices
• Work productively with patient advocates
• Understand risk factors leading to cancer
• Support cancer models and modeling, e.g.
cancer initiation and progression
• Promote precision oncology
• Promote learning healthcare systems
Informatics strategic objectives
• Lower barriers to data access, analysis
and modeling
• Promote agility, flexibility, data liquidity
• Promote Open Access, Open Data, Open
Source, Open Science
• Promote semantic interoperability,
standards, CDEs and Case Report Forms
Informatics strategic objectives
• Promote mobile and BYOD for patient
reported outcomes, education, surveillance,
• Use informatics to improve and lower barriers
to clinical trials accrual
• Use informatics to blur the distinction
between care and research – support clinical
standards in research
• Identify and disseminate innovations and
practices that make research more efficient
and effective
Supporting Precision Oncology
• Help bring together imaging, molecular,
pathology, labs, and clinical data in a
highly structured and machine readable
way to enable detailed characterization
and action for individual patients
Learning Healthcare Systems
• Enable the data flowing from precision
medicine to form learning healthcare
systems, where we better characterize,
model and predict the response, outcomes
and quality of life for every cancer patient
Public Health
• As a community we already know how to
prevent 50% of the current cancer burden
worldwide. Making more effective use of
social media, mHealth approaches, and virtual
communities should enable us to impact
vaccination rates (HPV, EBV, mono,
hepatitis), and promote healthy lifestyles,
including diet, exercise, and smoking
Public Health
• These three factors - infectious disease,
smoking, and poor nutrition and exercise -
contribute to at least 50% of our current
cancer burden. And the cost from loss of
quality of life and pain and suffering is
Lowering barriers for the community
• Improve our patient-focused materials
dissemination technology. What is our
Social Media strategy? Partnership with
education and communication, healthcare
organizations writ broadly.
Opportunities in prevention
• How do we work together as a community
to make our prevention, communication
and education researchers more effective
and translate this to effect global change?
We need to partner with social media and
technology-savvy next generation
behavioral psychologists!
Lowering barriers for the community
• Simplify the creation and distribution of
CDE-based forms. Use existing medical
terminologies (SNOMED, ICD, LOINC,
RxNorm) whenever possible. Link every
concept to UMLS as soon as feasible
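A CDE-based form field that carries its terminology links might be modeled along the following lines. This is only a sketch: the class names, field names, and codes are hypothetical placeholders, not real SNOMED, LOINC, or UMLS identifiers and not an existing NCI schema.

```python
from dataclasses import dataclass, field

# Terminologies named on the slide, plus UMLS as the linking target.
ALLOWED_TERMINOLOGIES = {"SNOMED", "ICD", "LOINC", "RxNorm", "UMLS"}


@dataclass(frozen=True)
class ConceptRef:
    terminology: str
    code: str  # placeholder values only in this sketch

    def __post_init__(self):
        if self.terminology not in ALLOWED_TERMINOLOGIES:
            raise ValueError(f"unknown terminology: {self.terminology}")


@dataclass
class CommonDataElement:
    name: str
    question: str
    concepts: list = field(default_factory=list)

    def is_linked_to(self, terminology):
        """True if at least one concept reference uses the terminology."""
        return any(c.terminology == terminology for c in self.concepts)


def unlinked(elements, terminology="UMLS"):
    """Names of form elements that still lack a link to the terminology."""
    return [e.name for e in elements if not e.is_linked_to(terminology)]


form = [
    CommonDataElement("tumor_site", "Primary tumor site?",
                      [ConceptRef("SNOMED", "PLACEHOLDER-001"),
                       ConceptRef("UMLS", "PLACEHOLDER-002")]),
    CommonDataElement("smoking_status", "Current smoking status?",
                      [ConceptRef("LOINC", "PLACEHOLDER-003")]),
]
print(unlinked(form))
```

A check like `unlinked` gives a simple worklist of form fields that still need a UMLS link.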
Lowering barriers for the community
• Simplify access to EVS, CDEs, NCI
Thesaurus (knowledge dissemination too!)
– Ideally with NLM, CDISC, FDA, ONC, PCORI
as partners
• Creative and appropriate security – we all
will need to live in a FISMA moderate
• Simplify data access – move toward a
‘library card’ model?
Collaborate with patients
• It is still a very rare event for patients or
even patient advocates to be involved
during the planning or implementation of
any cancer informatics project
• We need to do better if we are going to
meet the needs of our patients
• When the requests are impossible for us
to meet with our existing processes and
workflow, it is time to re-design and reimplement!
Precision Oncology
• As I mentioned with EVS and CDEs, we need
to incorporate clinical standards into research
wherever and whenever appropriate
• Our ability to semantically reason and make
inference over diverse data types is critical to
realizing the goals of Precision Oncology
• NLP, ontologies, checklists, CDEs embedded
in forms will let us move to next gen data
Enabling Analytics
• If we have captured and annotated our
data using reasonable, well-defined
semantics, this will enable data mining and
Molecular Medicine
• While this goes hand in hand with
Precision Medicine, it requires a focus on
automated, well annotated data flows and
multi-stage analysis/analytics. For
instance, for next gen sequencing, there is
primary stage data, secondary stage data,
and tertiary stage data. These steps
enable useful outputs, like BAM files, from
each machine run. Imaging (functional
MRI, high def optical, PET, CAT, etc) has
similar (but more mature) data evaluation
Molecular Medicine
• Incorporating molecular results into clinical
decision support is the end game. To
make good decisions, we need to be
constantly sampling and re-evaluating the
latest outcomes. This dynamic model
presents many problems – how do we do
this with a high level of integrity and
reliability while maintaining agility?
NCI activities
Just a few…
• EVS, NCI Thesaurus, NCI Metathesaurus
• CDEs, Case Report Forms
• RAS Initiative – hub at NCI Frederick
NCI activities
Just a few…
• NCI Cloud Pilot
– How, technically, can we bring community
computation to large (2.5 petabyte) data sets?
– What is the sustainability model?
• TCGA re-imagined – Genomics Data
– Many technologies used, many different QA and
analysis pipelines
– Standardization and re-analysis of existing data
NCI Activities
Just a few…
• MATCH trial
– Initial findings from IMPACT
– Couples molecular findings with a decision
tree for treatment
• Cooperative Groups & GBC
– Navigator
• FDA Clinical Trials Repository
– Janus
– Collaboration with the NCI
CBIIT NCIP activities
• Focus on clinical trials (MATCH, CTRP,
• Focus on translation
• Focus on imaging
• Focus on molecules
• Moving all projects to true open source
• Semantic Infrastructure: EVS, NCI
Thesaurus, Metathesaurus, CDEs, CRFs
• HubZero as a collaborative space…
