The Problem and the Solution to 21st Century Organizational Innovation.
Trever Pearson
PA 740, Professor Hyde
12.10.12

DEFINITION.
Big Data is a phenomenon defined by the rapid acceleration of the expanding volume of high-velocity, complex, and diverse types of data, which require advanced technologies and methods to enable their collection, storage, dissemination, management, and analysis.
TechAmerica Foundation (2012)

WHY IS IT PROBLEMATIC?
• The velocity of available data is increasing faster than most organizations can keep pace with.
• Data synthesis requires advanced technologies and appropriate staff and expertise on an ongoing basis.
• Implementation requires structural and organizational culture change.
• … And failure to respond will leave a lagging organization seriously behind.

Origins & Trends.
Data Storage.
[Chart: Data Storage in Exabytes, 1986 to 2007]
Global data storage increased from close to 0 to over 300 exabytes between 1986 and 2007.
[Chart detail: % Exabytes, Digital vs. Analog, 1986 to 2007]
The type of global data stored changed from 99% analog in 1986 to 96% digital in 2007.
(One exabyte is 10^18 bytes, or one billion gigabytes. Five exabytes is enough to contain every word ever spoken by all humans on Earth.)
MGI (2011)

Origins & Trends (cont…)
Computational Capacity.
[Chart: Computation Capacity (Million Instructions per Second), 1986 to 2007]
Computation capacity, charted in million instructions per second, grew from close to 0 to over 300 on the chart's scale between 1986 and 2007.
Information-producing devices such as mobile phones, tablets, and sensors have doubled since 2000. Coupled with personal computing, traffic in these areas increased from under 40% to nearly 90% of all data created from 1986 to 2007.
[Chart detail: % Million Instructions per Second by device type: Personal Computers, Video Game Consoles, Mobile Phones/PDA, Servers and Mainframes, Supercomputers, Pocket Calculators]
MGI (2011); Economist (2012)

• The storage required for all of this data doubled between 1999 and 2002, roughly a 25% compound annual growth rate.
• 1.8 zettabytes of data (equivalent to 200 billion two-hour HD movies) were created globally in 2011, an amount projected to double every year.
• 800 exabytes were created in 2009, projected to increase 44 times by 2020.
• It’s just like the universe: expanding, and at an exponentially increasing rate.
MGI (2011)

DATA TYPES.
• 15% Structured (database or spreadsheet data)
• 85% Unstructured (email, video, blogs, call center conversations, Facebook posts, Tweets, etc.)
Economist (2012)

DATA SOURCES.
• Customer transactions with personal information and consumer behavior (Visa, Amazon, etc.)
• Multimedia content such as high-resolution health procedure videos, YouTube, etc.
• Social media such as Facebook and Twitter
• Sensors and devices used in industries such as retail, healthcare, and automotive

• An effective response to Big Data is crucial for leading organizations to outperform their peers.
• Companies are projected to increase operating margins by more than 60% with an effective response to BIG DATA.
• Management decision making will be built upon evidence and information.
• Data-driven decisions are just plain better decisions.
• “You don’t manage what you don’t measure”.
McAfee & Brynjolfsson (2012)

HOW WILL BIG DATA HELP?
By…
• Replacing human decision making with automated formulas where appropriate
• Reducing inefficiencies
• Creating transparency
• Discovering variability
• Reducing security threats and crime
• Increasing ability to predict mission outcomes
• Reducing or eliminating waste
• …just being innovative.
MGI (2011)

WHO WILL BIG DATA HELP?
The five sectors to gain the most from the use of Big Data:
• Health Care
• Public Sector Administration
• Manufacturing
• Retail
• Businesses/organizations using personal location data

WHO IS AFFECTED?
• The Public
• Policy Makers
• Contractors
• Employees

HOW ARE THEY AFFECTED?
• Government transparency, bureaucratic efficiency… …privacy
• Informed decision-making, evidence-based legislation
• Monitoring contract deliverables, reporting
• Facilitation in workplace tasks, enhanced communication, etc.
• Increased transparency over organizational activity

OPPORTUNITIES.
• Data-driven organizations perform better on measures of financial and operational results than those that are not data-driven.
• Data facilitate efficient processes, saving time and money.
• Data lead to innovation.
• Data will ultimately lead to funding.
McAfee & Brynjolfsson (2012)

CHALLENGES.
• Data-driven decision making and collection processes require organizational cultural change.
• Strong leadership is necessary to set clear goals and to ask the right questions.
• Skillful and talented Data/IT specialists must be on staff.
• Lack of statistical and technical skills in the labor force
• Potential cost of implementation

Step 1. Source Data: Speed, Type and Amount.
What kind of data, and how much, are we working with?
• Assessing how hard it is to access
• Determining how it needs to be transformed
• Identifying the technologies to facilitate the process

Step 2. Data Preparation: Cleansing and Verification.
What do the data need for operational requirements?
• Define methods required for data prep, such as standardization, verification, filtering, etc.

Step 3. Data Transformation.
What is required to leverage the data?
• Unstructured data may be broken down and presented in a structured format.
• Data sources can be aggregated to determine not-so-obvious relationships between data types.
TechAmerica Foundation (2012)

Step 4. Business Intelligence/Decision Support.
Tools, methods, and techniques to leverage data:
• Data Mining
• Visualization/Simulations
• Keyword Searches & Syntax Analysis

Step 5. Analysts/Visualization.
How should the data be used?
• Present data visually so it can be explored.
• Use data as is to support/enhance/improve existing organizational processes.
TechAmerica Foundation (2012)

• Staffing: Data Analysts/IT Specialists, etc.
• Infrastructure: Data storage, software, hardware, connectivity, etc.
• Funding: Technological investment
• Performance objectives related to desired mission outcomes: Standards/metrics to compare operational efficacy with mission outcomes
• A data-driven organizational culture: Data prioritization as the driving force of organizational direction, and the culture to support it
• Openness to organizational change: Data prioritization will require change!

1. Identify and define mission objectives that need Big Data solutions
2. Assess current organizational capability, data sources, and technical requirements
3. Identify success criteria, implementation timeline, potential subsequent phases, required staffing levels, and “entry point”
   1. Streams as entry point for high-velocity data needs
   2. Unbounded database/warehouse infrastructure for high-volume data needs
   3. “Hadoop” [1] or similar technologies for high-variety data needs
4. Execute the plan as required
5. Review on an ongoing basis
[1] The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.
TechAmerica Foundation (2012), Apache Hadoop

• Assessment of mission outcome achievement, with improvement measures including increased savings, improved efficiency, etc.
• Identification of gaps in the links of the process chain, if any (see slide 13)
• Assessment of decisions being made (are the right data available to facilitate the process?)
• What are your Data/IT staff telling you?
• Expand and invest in the talent pool by creating a formal track for IT/Data managers with training and certification in BIG DATA analysis and technologies.
• Establish and broaden coalitions among industry, academia, and associations to develop professional standards and shared best practices for the field.
• Expand “college-to-government service” internship programs focused on technical aspects of BIG DATA.
• Strengthen and expand the Office of Science and Technology Policy to facilitate further research into new techniques and their applications to important problems across program and policy sectors.
• Align incentives to promote data sharing for the common good.
• Provide further guidance with industry and stakeholders on privacy and data protection practices.
• Develop intellectual property policies to promote innovation.
• Support necessary underlying IT/Communications infrastructure.
MGI (2011), TechAmerica Foundation (2012)

Political resistance to BIG DATA may be minimal, given a history of data activity in:
• Government (Library of Congress, Bureau of Information Resource Management)
• Finance (banks, credit card companies)
• Internet search engines (Google)
…HOWEVER…
Bottom-up resistance is likely:
• The Public: Privacy concerns and the notion of “Big Brother”
• Employees: Data errors and the documentation of mistakes
• Contractors: Less room for error, increased competition and accountability

FAQs.
1. How do you know if you have a big BIG DATA problem?
2. How do you obtain insight from your data?
3. Which technology is right for your organization?
4. How long should it take to implement?
5. What skills/expertise are required on staff?
6. What about privacy?
City A.M. (2012)

1. When available data are beyond your ability to manage, or when tapping into the insight they provide is problematic.
2. Start by placing mission objectives at the heart of every decision. While this might require change, even the more traditional change management practices may be of service.
Let your Data staff tell you what they need.
3. It depends on your mission objectives and the type/amount/speed of data you need to inform your decisions. To start, build upon what you already have.
4. Start with small, manageable steps and allow for constant evaluation and revision. If the first phase takes longer than 6 months, you’re too slow.
5. Data analysis and communication, technical skills, database management, and good ol’ fashioned critical thinking.
6. As with any data collection/sharing advancement, policies must be adjusted to address issues of privacy as they affect the organization, within the context of the standards set in place (statutory or otherwise). Congress is working on it. As far as the public is concerned: Welcome to the 21st Century.

Identifying what big data means to you. (2012, Feb 24). City A.M. London.
McAfee, A., & Brynjolfsson, E. (2012). Big Data: The Management Revolution. Harvard Business Review, pp. 59-68.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big Data: The Next Frontier for Innovation, Competition and Productivity. McKinsey Global Institute (MGI).
TechAmerica Foundation (2012). Demystifying Big Data: A Practical Guide to Transforming the Business of Government. Washington, D.C.
Geography matters as much as ever despite digital revolution, says Patrick Lane. (2012). The Economist.

Trever Pearson is a third-year Master’s student in Public Administration at San Francisco State University. With an emphasis in Policy Analysis and Finance, his interests lie mostly in evidence-based improvement in the policy arena, in sectors such as health, education, finance, and income security. Trever comes from a solid background in health care policy implementation and evaluation in the San Francisco public health network. He is currently working as a Data Analyst for Curry Senior Center, a community clinic serving the elderly in San Francisco’s Tenderloin neighborhood.
His achievements there include the development of agency-wide data collection and reporting processes for service quality improvements and contract reporting. With coursework in Urban Administration, Financial Management, and Applied Statistics, Trever aspires to use BIG DATA and research solutions to improve state and federal policies and agency operations.