Science Gateways and their role in Reproducibility
Nancy Wilkins-Diehr
San Diego Supercomputer Center
[email protected]

That reproducibility is a problem has already been established
• Brian Granger (IPython developer) talk at UCSD, May 2014
  – Computing (thus software) is one of the foundations of data science
  – Important decisions are being made on these data
    • Political, financial, institutional, peer review system, social
  – Several recent examples of errors in academic data
    • “Growth in a Time of Debt”, Reinhart and Rogoff (2010); Herndon, Ash, Pollin (critique, 2013)
    • “Capital in the Twenty-First Century”, Piketty (2014)
    • BICEP2 (2014)

Software designed for specific purposes
• May do what it does well, but if it is not designed to enforce reproducibility, it will be nearly impossible for a user to achieve that
  – Excel: almost impossible to design a reproducible experiment
  – GitHub: almost impossible not to design a reproducible experiment

Many have used science gateways to address reproducibility
• IPython notebooks
  – Perez, Fernando, and Brian E. Granger. "An Open Source Framework For Interactive, Collaborative And Reproducible Scientific Computing And Education." (2013).
• Galaxy
  – Goecks, Jeremy, Anton Nekrutenko, and James Taylor. "Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences." Genome Biol 11.8 (2010): R86.
• VisTrails
  – Freire, Juliana. "Making computations and publications reproducible with vistrails." Computing in Science & Engineering 14.4 (2012): 18-25.
• nanoHUB
  – Lundstrom, Mark, and Gerhard Klimeck. "The NCN: science, simulation, and cyber services." Emerging Technologies-Nanoelectronics, 2006 IEEE Conference on. IEEE, 2006.

Issues for discussion
• What issues do gateway designers need to consider for reproducibility so they can follow the GitHub model and not the Excel model?
• What happens when a software framework itself goes away? What needs to be considered?
• What does it mean to be reproducible for the long term? How long? How is this possible?
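
To make the first discussion question concrete, here is a minimal sketch of a provenance record that a gateway could store with every job, so that recording what ran is the default rather than something the user must remember to do. It is an illustration, not a description of any gateway named in this talk: the function names `provenance_record` and `record_id`, the field names, and the example tool and parameters are all hypothetical.

```python
import hashlib
import json
import platform
import sys


def provenance_record(tool, version, params, input_bytes):
    """Describe one run of a (hypothetical) gateway job.

    All field names here are assumptions made for illustration.
    """
    return {
        "tool": tool,
        "tool_version": version,
        "params": params,
        # Hash the input so the exact data can be verified later.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        # Record the environment the job ran in.
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


def record_id(record):
    """Stable identifier: SHA-256 of the record serialized with sorted keys."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    rec = provenance_record(
        "example-tool", "1.0", {"alpha": 0.5, "steps": 100}, b"input data"
    )
    print(record_id(rec))
    print(json.dumps(rec, indent=2, sort_keys=True))
```

Because the identifier is computed from a canonical serialization, two runs with the same tool, version, parameters, input, and environment get the same identifier, and any change to one of them produces a different one. A record like this does not answer the long-term questions above: it says what ran, but cannot bring back a framework that has gone away.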