Towards Greater Software Sustainability

Research is reliant on software and computational methods. Yet, software developed across different disciplines has only recently been discussed in terms of its intellectual contributions and its status as a scholarly research output. In contrast, data's profile has increased in academia due to research data management roles, Data Management Plans (DMPs), data curation protocols, and the FAIR principles, and more generally due to discussions of "big data", data journalism, and data science. Software has lagged behind in recognition, policy, and professional roles specifically aimed at software curation and preservation, both in and out of the library field. The source code behind software, however, can be viewed as an entry point, one necessary for making otherwise inaccessible information visible. This blog post discusses what (and who) is moving the conversation on the recognition of software as a form of knowledge production, and highlights some groups and organizations that are actively building communities of practice around software sustainability and preservation.

Communities of Practice

Modern software, as a digitally born object often grounded in utility, occupies an uncertain space within the scholarly ecosystem. Because software is an important but often unseen tool in research workflows, it is essential to define how research software can be conceptualized both as a functional object and as an index of intellectual labor as we move to a more dynamic and inclusive way of thinking about scholarship. Currently, there are organizations that see scientific software as central to research and that provide best practices for software development and software sustainability. The UK-based Software Sustainability Institute, which was founded in 2010, is a good example. In a 2014 survey, the Institute found that 92% of academic respondents use research software, and 56% developed their own software. Moreover, 69% of the respondents noted that it would not be practical to do their work without software. In acknowledging the importance of software, the Institute's "better software, better research" approach has cultivated a community of practice through training researchers, supporting fellows, and advocating for its viewpoint via presentations and publications. In addition, the Software Sustainability Institute develops and sustains best practices for software citation, supports Data and Software Carpentry workshops in the UK, and champions a relatively new role in academia: Research Software Engineers. It also acts as a consultancy, offering users online software evaluation as well as a Research Software Healthcheck, both of which support improving the sustainability and viability of software.

This type of work is also being conducted by the US Research Software Sustainability Institute (URSSI), an organization which "focus[es] on the entire research software ecosystem — including the people who create, maintain, and use research software — to validate and address various classes of concerns impacting all software development and maintenance projects across all of NSF." Of particular interest is URSSI's Winter School, a 2.5-day workshop that teaches early-career researchers best practices for software sustainability. The work being done by the Software Sustainability Institute and URSSI can certainly inform other initiatives, including work being done in library settings that support domain experts.

Important themes central to the mission of software sustainability, especially regarding software as a research object, have been addressed in recent scholarship. For instance, "Raising the Profile of Research Software" by Akhmerov et al. (2019) offers institutions and organizations concrete ways of recognizing how deeply imbricated software is in research by making recommendations in four main areas: software availability and quality, software sustainability, training, and human capital. These areas, in turn, suggest strategies for formalizing the roles of Research Software Engineers and Data Stewards as agents of meaningful change in the way research software is designed, preserved, and made available. The paper further suggests that the FAIR principles can, and should, be applied to software, and asserts that professional development and training, policy development, and collaboration can inform how software and data are maintained over time. This is a significant step because, traditionally, the FAIR principles have only been applied to data. Likewise, a paper titled "Towards FAIR principles for research software" takes the 15 FAIR Guiding Principles (2016) and modifies them to meet the requirements of research software. These revisions consider the complexities of research software, including its distinct needs related to dependencies, interoperability, versioning, and metadata, and adopt, adapt, or reinterpret the FAIR principles in a way that provides a basis for further work in applying them to research software. The paper thus provides a starting point for future community-based discussions aimed at formalizing FAIR principles for research software, as well as the metadata schemas that will inform those principles.

Other groups have added their voices to the conversation on preserving software as part of the scholarly record. Force11, for example, was founded in 2011 and is "an international coalition of researchers, librarians, publishers and research funders… [who] aim to bring about a change in modern scholarly communications through the effective use of information technology." While focused broadly on scholarly communication, Force11 advocates for software as a "first-class research object" in its manifesto, which states that software is an integral part of the scientific workflow, intrinsically linked to the data on which it acts. In turn, Force11 aims to change the landscape of scholarly communication to one where "every claim, hypothesis, argument—every significant element of the discourse—can be explicitly represented, along with supporting data, software, workflows, multimedia, external commentary, and information about provenance." As noted in a previous post, the role of software in the scholarly landscape is highlighted by groups such as the Software Citation Working Group, which released the Software Citation Principles in 2016. These principles provide a consistent policy for software citation across disciplines and argue that software should be "a citable entity in the scholarly ecosystem." While that working group has concluded, its work is being built upon by the Software Citation Implementation Working Group, which advocates for the Software Citation Principles and helps various stakeholders put them into practice.

The Software Preservation Network (SPN) has also raised awareness of software and made major contributions to the software preservation landscape. SPN was founded in 2016 as a volunteer network of individuals and organizations committed to the long-term preservation of software. It is funded by the Alfred P. Sloan Foundation, the Andrew W. Mellon Foundation, and the Institute of Museum and Library Services. Through member-based efforts, SPN has advocated for coordinating preservation strategies, mitigating duplicate efforts, and soliciting community input to "build consensus around next steps for preserving software at scale." SPN-affiliated projects include Best Practices in Fair Use for Software Preservation; Fostering Communities of Practice: Software Preservation and Emulation in Libraries, Archives and Museums (FCoP); and Emulation-as-a-Service Infrastructure (EaaSI). These projects have not only contributed to the technological and legal infrastructures needed for software preservation and reuse, but also helped to formalize software preservation and software emulation roles in GLAM (galleries, libraries, archives, and museums) sectors. SPN has also endeavored to make its ideas and methods available to as wide an audience as possible. Its efforts include producing engaging and understandable podcasts on a variety of topics related to software preservation, sponsoring working groups, and publishing articles, blog posts, and presentations. These awareness-raising efforts are another important element in helping the scholarly community reconceptualize how we think about source code and the need to capture and preserve it so that it is accessible now and in the future.

Building communities of practice and raising awareness have been helpful grounding points (and sounding boards) for software preservation efforts. Another avenue has been the creation of formal documents that detail shared principles and call for a unified response. The UNESCO/Inria "Paris Call" from 2018 is perhaps the best example of this. The document provides a basis from which to build software preservation efforts and recognizes software as a "key component of human creativity, sustainable development, society and culture." The Call, and the report that accompanies it, recognize the role software has in our day-to-day lives (economically, socially, and culturally) and across all academic fields. As a result, access to software (and its history) "is one of the prerequisites to ensuring accountability and transparency" in our society. Further, software developed and used in research contexts also needs to be accessible, since academic results, which often affect policy, need to be both reputable and reproducible. The document encourages conceptualizing source code as the product of a variety of social, cultural, economic, and academic elements. The Call is important because it implores all sectors of society, from UNESCO member states to researchers, memory institutions, academic institutions, and civil society, to take an active role in promoting, teaching, and implementing ways to understand source code as a form of cultural heritage. Further, it calls on the preservation community to establish a means of capturing and preserving this material so that it is universally available.

What's Next

In my next blog post, I will discuss and highlight projects that focus on archiving source code through programmatic means as well as those that focus on event data (i.e., activities in repositories). If you are part of an organization or effort engaged with software sustainability and do not see your work mentioned here, please feel free to contact us; we'd love to hear about what you are doing. As noted previously, IASGE is in active conversation about all topics in our blog posts. As I move through my environmental scan of the scholarly Git landscape in general, and of software preservation in particular, and research archival methods such as web archiving, self-archiving, and software preservation, I invite your insights, thoughts, and recommendations for further research. You can contact me or Vicky Steeves via email or submit an issue or merge request on GitLab.

Bibliography

Akhmerov, A., Cruz, M., Drost, N., Hof, C., Knapen, T., Kuzak, M., … Turkyilmaz-van der Velden, Y. (2019). Making Research Software a First-Class Citizen in Research. https://doi.org/10.5281/zenodo.2647436

Akhmerov, A., Cruz, M., Drost, N., Hof, C., Knapen, T., Kuzak, M., … van Werkhoven, B. (2019). Raising the Profile of Research Software. https://doi.org/10.5281/zenodo.3378572

Berkeley Institute for Data Science. (2019). US Research Software Sustainability Institute. Retrieved from Berkeley Institute for Data Science website: https://bids.berkeley.edu/research/us-research-software-sustainability-institute-urssi

Castagné, M. (2013). Consider the Source: The Value of Source Code to Digital Preservation Strategies. Student Research Journal, 2(2), 1–11.

Crouch, S., Hong, N. C., Hettrick, S., Jackson, M., Pawlik, A., Sufi, S., … Parsons, M. (2013). The Software Sustainability Institute: Changing Research Software Attitudes and Practices. Computing in Science & Engineering, 15(6), 74–80. https://doi.org/10.1109/MCSE.2013.133

Hasselbring, W., Carr, L., Hettrick, S., Packer, H., & Tiropanis, T. (2019). FAIR and Open Computer Science Research Software. arXiv:1908.05986 [cs]. Retrieved from http://arxiv.org/abs/1908.05986

Institut national de recherche en informatique et en automatique. (2019). Paris Call: Software Source Code as Heritage for Sustainable Development. Retrieved from https://unesdoc.unesco.org/ark:/48223/pf0000366715.locale=fr

Kuzak, M., Cruz, M., Thiel, C., Sufi, S., & Eisty, N. (2018). Making Software a First-Class Citizen in Research. Retrieved November 11, 2019, from Software Sustainability Institute website: https://software.ac.uk/blog/2018-11-28-making-software-first-class-citizen-research

Meyerson, J., Vowell, Z., Hagenmaier, W., Leventhal, A., Rios, F., Roke, E. R., & Walsh, T. (2017). The Software Preservation Network (SPN): A community effort to ensure long term access to digital cultural heritage. D-Lib Magazine, 23(5–6), 1p. https://doi.org/10.1045/may2017-meyerson

Meyerson, J. (2018). Episode 4: Software in Digital/Scholarly Communications – Saving Software Together. Retrieved July 15, 2019, from https://www.softwarepreservationnetwork.org/blog/spn-webinar-2018-ep4/

Meyerson, J. (2018). Community Cultivation and the Software Preservation Network. Retrieved November 5, 2019, from Digital Preservation Coalition website: https://www.dpconline.org/blog/idpd/community-cultivation-and-spn

Research Software Alliance – Promoting research software as a fundamental and vital component of global research. (n.d.). Retrieved from https://www.researchsoft.org/

Software Preservation Network. (2019). Saving Software Together. Retrieved from Software Preservation Network website: https://www.softwarepreservationnetwork.org/

Software Sustainability Institute. (2019). The Software Sustainability Institute. Retrieved from https://www.software.ac.uk/

US Research Software Sustainability Institute. (n.d.). Retrieved from http://urssi.us/