NCI Informatics 2014
Warren Kibbe
[email protected]
240-276-7300
The views expressed are my own and not a reflection of DHHS, NIH or NCI policy

Some history
• Back to the dawn of time…my first BRIITE
And some very emotional imagery

BRIITE 2001
• November 16 and 17th

Fast forward to the end of 2011
• BSA informatics working group assembled in 2011
• BSA IWG report of 2011
• Ken steps down under enormous pressure and criticism in December 2011
• George Komatsoulis appointed acting director
• CBIIT is pounded by waves of uncertainty

CBIIT, 2013
• 71 Federal staff
• Serving 6500 NCI staff across 18 buildings
• 5 petabytes of NCI data
• 2.5 petabytes of TCGA data
• 2- 5 MW new data centers
• Just completed a rollout of Unified Communications to 5000 NCI staff – 1.5 FTEs, now on loan to NIH CIT to deliver UC to 45000 desktops

DHHS Requirements
• FISMA Moderate
• Complete move to IPv6 by Oct 2014
• Data center consolidation
• Two factor authentication
• Only government furnished equipment (‘GFE’) may connect to the network from outside (limits on VPN)
• Compensating controls…
• Tiered network, appropriate traffic monitoring and scanning

NCI General strategic objectives
• Reduce cancer risk – public health
• Improve cancer outcomes – better treatment and survivorship
• Educate providers and population
• Provide informative data and powerful examples

NCI CBIIT Guiding Principles
• Supporting the mission of the NCI
• Lowering barriers for the cancer community
• Promote the importance of informatics in solving problems in public health, healthcare, precision oncology, and basic research
• Build communities around problems
• Aggregate and disseminate knowledge
Using computing technology to reduce the incidence, suffering and mortality due to cancer

Highlights from the November National Cancer Forum Policy Summit

My outline
Disruptive technologies
Getting social
What is big data?
Open access to data

Disruptive Technologies
• Printing
• Steam power
• Transportation
• Electricity
• Antibiotics
• Semiconductors & VLSI design
• http
• Systems view - end of reductionism?
• High throughput biology

Disruptive Technologies
World:
6.6B active mobile contracts
1.9B smart phone contracts
1.1B land lines
World population 7.1B
US:
345M active mobile contracts
287M smart phone contracts
US population 313M

Everyone is a data provider
Ubiquitous computing
Data immersion

Getting Social
• Measuring behavior across a population
• Understanding behavior – can we provide better risk estimates for individuals?
• Social media is a big data opportunity – what are the ethics of big data?
• Synergize with the energy and immediacy of patient advocates
• Patients want more data sharing – how can we facilitate that appropriately?
This changes trial design – statistics until now has been focused on how to design an appropriate sample so that the sample can be generalized to the population – what happens when we measure the ENTIRE population?

The future
• Elastic computing ‘clouds’
• Social networks
• Big Data analytics
• Precision medicine
• Measuring health
• Practicing protective medicine
Semantic and synoptic data
Intervening before health is compromised
Learning systems that enable learning from every cancer patient

Open Data Access
• We need to provide data access to people outside of biomedicine who have the skills and training to mine and analyze data
• More access will mean more innovation

Precision Oncology
• The era of precision medicine and precision oncology is predicated on the integration of research, care, and molecular medicine and the availability of data for modeling, risk analysis, and optimal care
How do we re-engineer translational research policies that will enable a true learning healthcare system?
Consent
• In a learning healthcare system, we ‘learn’ from every patient who comes in for treatment. What is consent in this model? What is research?
• What role is there for standardized consent?
• Are there ways to reimagine translational research without consent? Would that help us?

CBIIT's mission – the long form
• CBIIT will help the cancer community coordinate, aggregate, disseminate, promote cancer awareness, public health data, cancer risk reduction, novel treatments, quality of life and comparative effectiveness data, and basic and translational research outcomes

CBIIT strategic activities
• Promote social media as a mechanism for communication, education, and improving lifestyle choices
• Work productively with patient advocates
• Understand risk factors leading to cancer
• Support cancer models and modeling, e.g. cancer initiation and progression
• Promote precision oncology
• Promote learning healthcare systems

Informatics strategic objectives
• Lower barriers to data access, analysis and modeling
• Promote agility, flexibility, data liquidity
• Promote Open Access, Open Data, Open Source, Open Science
• Promote semantic interoperability, standards, CDEs and Case Report Forms

Informatics strategic objectives
• Promote mobile and BYOD for patient reported outcomes, education, surveillance, eligibility
• Use informatics to improve and lower barriers to clinical trials accrual
• Use informatics to blur the distinction between care and research – support clinical standards in research
• Identify and disseminate innovations and practices that make research more efficient and effective

Supporting Precision Oncology
• Help bring together imaging, molecular, pathology, labs, and clinical data in a highly structured and machine readable way to enable detailed characterization and action for individual patients

Learning Healthcare Systems
• Enable the data flowing from precision medicine to form learning healthcare systems, where we better characterize, model and
predict the response, outcomes and quality of life for every cancer patient

Public Health
• As a community we already know how to prevent 50% of the current cancer burden worldwide. Making more effective use of social media, mHealth approaches, and virtual communities should enable us to impact vaccination rates (HPV, EBV, mono, hepatitis), and promote healthy lifestyles, including diet, exercise, and smoking cessation.

Public Health
• These three factors - infectious disease, smoking, and poor nutrition and exercise - contribute to at least 50% of our current cancer burden. And the cost from loss of quality of life and pain and suffering is incalculable.

Lowering barriers for the community
• Improve our patient-focused materials dissemination technology. What is our Social Media strategy? Partnership with education and communication, healthcare organizations writ broadly.

Opportunities in prevention
• How do we work together as a community to make our prevention, communication and education researchers more effective and translate this to effect global change? We need to partner with social media and technology-savvy next generation behavioral psychologists!

Lowering barriers for the community
• Simplify the creation and distribution of CDE-based forms. Use existing medical terminologies (SNOMED, ICD, LOINC, RxNorm) whenever possible. Link every concept to UMLS as soon as feasible

Lowering barriers for the community
• Simplify access to EVS, CDEs, NCI Thesaurus (knowledge dissemination too!)
  – Ideally with NLM, CDISC, FDA, ONC, PCORI as partners
• Creative and appropriate security – we all will need to live in a FISMA moderate world
• Simplify data access – move toward a ‘library card’ model?
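The CDE points above (reuse existing terminologies, link every concept to UMLS) can be pictured with a small data structure. This is only a sketch under stated assumptions: the `DataElement` class, the `unlinked` helper, the field names, and every code value are hypothetical placeholders, not real SNOMED, LOINC, or UMLS identifiers and not an NCI API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataElement:
    """One form field, bound to codes from existing terminologies (hypothetical model)."""
    name: str
    codes: Dict[str, str]            # terminology name -> code (placeholders below)
    umls_cui: Optional[str] = None   # UMLS concept identifier, once the link exists


def unlinked(elements: List[DataElement]) -> List[str]:
    """Return the names of elements that still lack a UMLS link."""
    return [e.name for e in elements if e.umls_cui is None]


# Placeholder values only; none of these are real terminology codes.
form = [
    DataElement("tumor_site", {"SNOMED": "PLACEHOLDER-1"}, umls_cui="PLACEHOLDER-CUI-1"),
    DataElement("lab_test", {"LOINC": "PLACEHOLDER-2"}),
]

print(unlinked(form))
```

A check like `unlinked` is one way a form-building tool could flag fields that still need a concept link before the form is distributed.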
Collaborate with patients
• It is still a very rare event for patients or even patient advocates to be involved during the planning or implementation of any cancer informatics project
• We need to do better if we are going to meet the needs of our patients
• When the requests are impossible for us to meet with our existing processes and workflow, it is time to re-design and reimplement!

Precision Oncology
• As I mentioned with EVS and CDEs, we need to incorporate clinical standards into research wherever and whenever appropriate
• Our ability to semantically reason and make inference over diverse data types is critical to realizing the goals of Precision Oncology
• NLP, ontologies, checklists, CDEs embedded in forms will let us move to next gen data capture

Enabling Analytics
• If we have captured and annotated our data using reasonable, well-defined semantics, this will enable data mining and discovery

Molecular Medicine
• While this goes hand in hand with Precision Medicine, it requires a focus on automated, well annotated data flows and multi-stage analysis/analytics. For instance, for next gen sequencing, there is primary stage data, secondary stage data, and tertiary stage data. These steps enable useful outputs, like BAM files, from each machine run. Imaging (functional MRI, high def optical, PET, CAT, etc) has similar (but more mature) data evaluation

Molecular Medicine
• Incorporating molecular results into clinical decision support is the end game. To make good decisions, we need to be constantly sampling and re-evaluating the latest outcomes. This dynamic model presents many problems – how do we do this with a high level of integrity and reliability while maintaining agility?
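The staged next gen sequencing data flow described in the Molecular Medicine slides can be sketched as a chain of annotated outputs. This is a minimal illustration, not a description of any NCI pipeline: the `StageOutput` class, the `run_flow` and `lineage` helpers, and all file names are hypothetical, and placing BAM files at the secondary stage follows common usage rather than anything stated in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StageOutput:
    """One annotated output of a sequencing data flow (hypothetical model)."""
    stage: str                       # "primary", "secondary", or "tertiary"
    artifact: str                    # hypothetical file name
    derived_from: List[str] = field(default_factory=list)  # provenance annotation


def run_flow(run_id: str) -> List[StageOutput]:
    """Record the three stages for one machine run, each pointing at its input."""
    primary = StageOutput("primary", f"{run_id}.reads.fastq")
    secondary = StageOutput("secondary", f"{run_id}.aligned.bam", [primary.artifact])
    tertiary = StageOutput(
        "tertiary", f"{run_id}.annotated_variants.txt", [secondary.artifact]
    )
    return [primary, secondary, tertiary]


def lineage(outputs: List[StageOutput], artifact: str) -> List[str]:
    """Walk the provenance annotations from an artifact back to the raw data."""
    by_name = {o.artifact: o for o in outputs}
    chain = [artifact]
    current = by_name[artifact]
    while current.derived_from:
        parent = current.derived_from[0]
        chain.append(parent)
        current = by_name[parent]
    return chain


outputs = run_flow("run42")
print(lineage(outputs, "run42.annotated_variants.txt"))
```

Keeping the provenance on every output is one simple way to re-evaluate a tertiary result later, since the record shows which earlier files it came from.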
NCI activities
Just a few…
• EVS, NCI Thesaurus, NCI Metathesaurus
• CDEs, Case Report Forms
• RAS Initiative – hub at NCI Frederick

NCI activities
Just a few…
• NCI Cloud Pilot
  – How technically can we bring community computation to large (2.5 petabyte) data sets
  – What is the sustainability model?
• TCGA re-imagined – Genomics Data Commons
  – Many technologies used, many different QA and analysis pipelines
  – Standardization and re-analysis of existing data

NCI Activities
Just a few…
• MATCH trial
  – Initial findings from IMPACT
  – Couples molecular findings with a decision tree for treatment
• Cooperative Groups & GBC
  – Navigator
• FDA Clinical Trials Repository – Janus
  – Collaboration with the NCI CBIIT

NCIP activities
• Focus on clinical trials (MATCH, CTRP, CTR)
• Focus on translation
• Focus on imaging
• Focus on molecules
• Moving all projects to true open source
• Semantic Infrastructure: EVS, NCI Thesaurus, Metathesaurus, CDEs, CRFs
• HubZero as a collaborative space…

Questions?