Managing Research Data – The Organisational Challenge at Oxford
Friday 6th December, 2013
James A J Wilson
[email protected]

The Growing Importance of Research Data Management
• Rise of data-driven research
  – Challenge to existing academic practices
  – Opportunities for new kinds of research
• Increasing recognition of need to manage research data better
  – Opportunities for research communities
  – Concern for reputations
  – Mandates from research funders

Damaro Objectives
• Institutional RDM Policy
• Better understanding of researchers’ requirements
• Improved training & support materials – embedded in existing delivery channels
• Design for connected RDM infrastructure, from planning to re-use
• ‘DataFinder’ software – to act as a catalogue of research data outputs
• Outputs that can be taken and adapted by other institutions (project was part of the JISC MRD Programme)
• Sustainability

What is Research Data Management?
[Figure: research data lifecycle diagram. Stages labelled: Idea; Discovery; Literature / data review; [Funding bid]; Planning; Data gathering; File organisation & local storage; Documentation; Data analysis & research outputs; Data deposit; Repository storage; Long-term curation; Access; Re-use]

Principles behind Oxford’s infrastructure
• Modular
  – Different business models for different components
  – May be extended (or reduced)
• Researcher-focused
  – Caters for different disciplines and working practices
• Intra-institutional
  – Requires input from multiple support departments and Academic Divisions

Demand

Demand for support with RDM from researchers
[Chart: Importance of RDM. Response options:]
• Essential: My research would suffer significantly if my data were not properly managed
• Important: My research benefits from the time spent managing data
• Helpful up to a point: Time spent managing research data can make life easier further down the line, but it's not a very significant aspect of research
• Not important: Devoting time to managing research data would be a distraction from the real work of research
But fewer than a quarter had received any information about RDM from the University

“My supervisor doesn’t want the whole dataset to be made publicly available as it is. However, he is very keen that whenever research papers based on the data are published, relevant portions of the data that support the findings are also published.”

“Having a secure and fairly straightforward means by which to share data with selected collaborators around the world would be extremely useful.”

“It would be useful for graduate students to learn to pick the appropriate tool for the appropriate question and the appropriate data … to know what their options are.”

Training Desired
Common RDM tasks ranked by mean level of desire for training (5 = most desired, 1 = least desired)

Rank  Task                                                                                                Mean
1     Dealing with copyright, licensing, or other IP (intellectual property) issues relating to datasets  3.55
2     Preparing datasets for long-term preservation                                                       3.42
3     Data documentation                                                                                  3.34
4     Preparing datasets for sharing with researchers outside your research group                         3.27
5     Storing data securely and backing up                                                                3.13
6     Data management planning                                                                            3.13
7     Determining whether research datasets ought to be preserved after the end of a particular project   3.11
8     Organizing and structuring data within files (e.g. for analysis)                                    3.02
9     Version control                                                                                     2.97
10    Managing bibliographic data                                                                         2.73
11    Organizing, structuring, and naming files and folders                                               2.66

Demand for support with RDM from above
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner.”
RCUK Common Principles on Data Policy

“data must be accessible and readily located; they must be intelligible to those who wish to scrutinise them; data must be assessable so that judgments can be made about their reliability and the competence of those who created them; and they must be usable by others.
For data to meet these requirements it must be supported by explanatory metadata (data about data).”
Royal Society – Science as an Open Enterprise

Challenges

Diverse practices
• Principle of subsidiarity
• 45% of Departmental IT Managers reported that ‘every researcher / research group is completely free to choose how they manage their research data’
• 70% offer some departmental infrastructure to encourage a degree of standard practice (e.g. shared drives, data deposit guidelines)
• 15% of departments have a departmental policy mandating particular tools and processes that researchers should use for managing their data
• University RDM policy ratified in 2012, setting out responsibilities of researchers and institution

Disciplinary requirements differ
• Significant differences in how researchers work
• Wide range of experience and confidence amongst researchers
• Some disciplines already have good RDM infrastructure in place, some keen for central support
  – “The University should have a dedicated central repository”
  – “[The University should] develop a data management service or be in a position to know what to recommend to our researchers”
  – “The desire to centralize … may work at the lower end of the data requirements, but at the higher end is rather naïve”
[Chart: responses by Academic Division (Social Sciences; Medical Sciences; Mathematical, Physical and Life Sciences; Humanities), on a scale of 0% to 100%. Response options:]
• As part of a team, with our research data managed by the team
• As part of a team, but each member of the team looks after their own data
• As an individual
• Some of my research is undertaken as part of a team, but I also conduct some research independently

Researchers unclear where to go for support
[Chart: sources of support named by researchers:]
• IT Services
• Departmental staff (including IT staff)
• Academics & Colleagues
• Research Services (including divisional & departmental)
• National bodies (e.g. UKDA)
• General web search
• Libraries
• OeRC
• University website
• Funders
• RDM website
• Other (suggestions with only 1 response)

Lack of staff confidence with RDM issues
[Chart: staff confidence on a scale from 7 (Completely Confident) to 1 (Not Confident)]

Solutions

Who should support research data management?
[Figure: research data lifecycle diagram annotated with the supporting units: IT Services; Academic Divisions & Departments; Oxford eResearch Centre (OeRC); Research Services; Library Services. Stages labelled: Idea; Discovery; Literature / data review; [Funding bid]; Planning; Data gathering; File organisation & local storage; Documentation; Data analysis & research outputs; Data deposit; Repository storage; Long-term curation; Access and re-use]

Role of Libraries
• Metadata
• Access
• Workflows
• Collection management
• Collection curation and preservation
• Service provision
• Systems
• But also contributions to training and good practice in earlier parts of research life-cycle

Ongoing work
• Research services
  – OxfordDMPOnline & 20 questions for RDM
  – Involvement of research facilitators
• IT Services
  – Implementing services for ‘live’ data (HFS, Servers and VMs, Supercomputing, ORDS)
  – Research Support Group
• Libraries
  – DataBank
  – DataFinder
  – Involvement of Subject Librarians
• University coordination
  – Research Data Management and Open Data Working Group

Coordination
• Single point of contact
  – Central RDM website
• Associated challenges
  – Information / data / metadata flows
  – RT systems
  – Resourcing
  – More organisational than technical

Questions?