The IASGE project has been busy writing updates for our community, participating in webinars, and attending conferences. We recently hit the six-month mark on the project and have made significant progress researching the ways in which source code produced by the scholarly community can be archived and preserved for future (re)use. In recent discussions about our project with colleagues, we have been careful to stress the importance of saving the contextual, scholarly ephemera associated with source code, not just the source code itself. But what exactly do we mean by "scholarly ephemera"? To answer this question, we wanted to take a minute to write out our definition, explain why we are seeking to archive it as part of the scholarly record, and elaborate on why it is an important way to understand source code more fully.
Hello world! Vicky here, project lead for IASGE, making my blog debut to tell you all about how great #iPres2019 was! iPres is the International Conference on Digital Preservation, held this year at the EYE Film Museum, the national museum for film in the Netherlands, located on Amsterdam’s IJ harbour.
My last project update included information about self-depositing software in platforms such as Zenodo, Figshare, and the Open Science Framework (OSF). All the repositories and project management tools discussed have integrations with GitHub and other source code hosting platforms, which contribute to stable homes for software and encourage software citation. In addition to these platforms, institutional repositories (IRs) offer yet another location for self-depositing software. In my research into how Git repositories are archived, I am interested in whether institutional repositories are practical places for source code. My investigations, which are exploratory, consider the ways in which IRs handle more complex files (such as different types of source code and software) as well as how they account for multiple versions of a particular piece of software. As part of this, I examine the limitations and problems that have been raised about IRs generally, as well as various solutions suggested for moving beyond IRs in favor of a more federated and networked system. One of the main questions that interests me is: what are the benefits and drawbacks of depending on the current distributed model of IRs for long-term preservation of and access to source code?
The IASGE team is deviating from our regularly scheduled program of research to talk about restrictions being placed on members of our community. A disclaimer: the opinions expressed here do not necessarily reflect those of NYU or the Sloan Foundation. On July 20, 2019, Shahin Sorkh, a computer engineering student and full-time developer, was trending on Hacker News, where someone had linked his blog post about what it is like to be a dev in Iran. This post, previously hosted on GitHub Pages, 404'd on July 25th when GitHub decided to comply with U.S. export control laws, which apply to "Specially Designated Nationals (SDNs) and other denied or blocked parties under U.S. and other applicable law [...] including prohibited end uses described in 17 CFR 744", as mentioned in the Trade Controls page on GitHub Help. Users with IP addresses originating in Crimea, Sudan, Cuba, Iran, North Korea, and Syria, or whose payment history or other information was linked to those locations, were affected.
And when we say affected... we mean that users received emails from GitHub saying that their accounts were blocked. Little to no advance notice was given, and users had no option to back up their repositories.
We, the IASGE team, have chosen to write about this because restrictions on members of the Git community, even when authorized by federal law, have far-reaching and chilling consequences for open source, open scholarship, and the open exchange of information and ideas.