Hello all! It's Vicky here with an exciting announcement: the IASGE behavioral dataset has been published 🚀 You can view and download our dataset here: https://doi.org/10.5064/F6VOIB8H.

In the behavioral study portion of IASGE (which you can read more about in Sarah's posts), we conducted a multimethod study to investigate how scholars learn and use Git and Git Hosting Platforms (GHP). We first conducted focus groups with mostly minimal users (people who have been introduced to Git and/or GHP but have not yet fully adopted those tools into their daily workflows), then circulated a broad survey aimed at anyone currently working or studying in academia, and concluded with user-experience interviews that took a range of users through various scenarios so we could directly observe their behaviors. This study has generated many rich datasets: three focus group transcripts, survey results (from 421 respondents), and over 40 interview transcripts 😱

We don't just study open communities; we also participate in them and highly value open access to research! So we are very excited that our data will be available for all of you to view and reuse. We chose the Qualitative Data Repository (QDR) to publish our data because our dataset is mostly qualitative and the QDR is the best fit for that type of data (plus, I had a great experience depositing another dataset there before!). NYU is a member institution of the QDR, which facilitates the curation of our materials. We want to give a big thank you to the staff at the QDR who handled our deposit, Robert Demgenski and Sebastian Karchar, for taking us through a rigorous and collaborative curation process! It opened our eyes to the level of detail required to make qualitative data openly accessible.

The documentation and code in the deposit are freely accessible under a Creative Commons Attribution-ShareAlike 4.0 license. The data (survey data and all transcripts) are accessible without restrictions to all registered QDR users under the QDR's Standard Download Agreement (anyone can get a QDR account for free). The data are under different terms than the documentation and code because the QDR doesn't yet have a legal and governance model for sharing this type of (non-sensitive) qualitative data publicly with unregistered users. Given that, and given that we wanted to apply a different CC license to our data (to disallow remixing of our participants' words, which carries lots of potential for harm despite our non-sensitive topic), our option for keeping two separate sets of license terms on our materials was to go with the model the QDR already had established: the data are immediately downloadable, but only by registered QDR users. The IASGE team and the QDR agreed that as soon as other options become available (e.g. making the data publicly available to non-registered users under a different license from our documentation and code), the QDR will switch our data over. Again, we LOVE the collaborative spirit and the deep thinking with respect to publishing and preserving qualitative data!


Use our data and code! Extend our study and remix it! We have open licensing because we want others to make use of our work. Please do! In addition to the materials on the QDR, ongoing work on our survey analysis code is released under the MIT license and takes place on GitLab: https://gitlab.com/investigating-archiving-git/survey-analysis. Feel free to fork that repository and work within the bounds of the MIT license (which are few!).

Concluding thoughts

We hope that our IASGE data will be a rich source of further study for those interested in the proliferation, teaching and learning, and research applications of Git and Git Hosting Platforms. If you have any feedback about our work at all (data, code, approach), feel free to email me at vicky.steeves@nyu.edu. If you see a typo or mistake in this blog post, you are welcome to make a merge request on GitLab with a fix: https://gitlab.com/investigating-archiving-git/investigating-archiving-git.gitlab.io (or email me).