Teaching practices & values of Open Science and Scholarship needs open infrastructure!

first published on the OpenScienceMOOC blog, Sept 03, 2019
doi: https://doi.org/10.59350/26vrh-8ey10

tldr;

The need to empower scholars as well as the general public to practice Open Scholarship has been a recurring theme in public scholarly discourse of the last decade and beyond. Still, the tools and platforms required to make this a reality are usually in the hands of – often US-based – for-profit companies. And while a lot of specialized platforms do exist out there, publicly-funded and -hosted training, project management, and scholarly communications platforms are close to non-existent. With this post, we want to open a discussion about what an open networked scholarly infrastructure for learning, teaching and scholarly communications for Open Scholarship could look like.

Everything we have gained by opening content and data will be under threat if we allow the enclosure of scholarly infrastructures.

— Geoffrey Bilder, Jennifer Lin, Cameron Neylon (2015).

Open Science has been the talk of the town for quite some time. And while there has been amazing progress on a variety of levels, we still face certain blind spots when it comes to the provision of basic infrastructure that would allow us to truly make knowledge an equitable public good independent from corporate influence and control.

While much of the current focus has been on how to develop alternative open publication systems, or else how repositories can be made available to host all kinds of open data and content, the whole field of training and communications still remains generally dependent on tools and platforms that are run by commercial, corporate entities.

Let me illustrate why this is a dilemma: We at the Open Science MOOC are a non-profit, community-led project that exists thanks to the pure enthusiasm of its participating members. This also means that many of us are early career researchers (ECRs) who do not currently have, and may not have in the foreseeable future, the privilege of long-term support by a home institution. This is often simply because all of us are part of academia’s grinding neoliberal economy, which has many of us working on short-term and part-time contracts that usually range from three months to two years.

Now, with the evolution of the Open Science MOOC over the last two years towards a global network that features researchers from almost any country in the world (see our shiny new Open Science MOOC dashboard – brought to life by the amazing data magician Lisa Hehnke – for a more detailed analysis of our user base), we have increasingly begun to foreground aspects of inclusivity and equity.

Right from Day One, we have had to negotiate the trade-offs between easy availability of services, usability and pragmatic access on the one hand, and on the other hand the counter-questions about real openness, which includes being able to control what happens to one’s personal and project data. These questions were raised particularly by proponents arguing along the lines of the Free, Libre and Open Source Software (FLOSS) community. Also, did we mention that with no available project budget to speak of, we have been, and still are, dependent on freely available services? Right…

Now, as Richard Stallman has it, free in the sense of a variety of open movements should entail not only the gratis – or free beer – aspect of basic subscription to commercial services, but also the libre element – as in free speech. For us, this has meant long and recurring debates about whether we want to use e.g. Google Docs or should rather explore other, more privacy-friendly ventures, since Google (or G-Corp, as many of us simply address the internet giant in reference to the fictional E-Corp of Mr. Robot fame – although Open Publishing sure has an E-Corp of its own 😉 ) has been feeding on our data all along (for illustrative snapshots, see e.g. Forbes, or this account of more disturbing consequences and this more detailed follow-up opinion piece on what Google does with its search algorithms).

We have long tried to run the OpenScienceMOOC’s development backbone via GitHub.com, a free service that used to be a haven for open source projects, but over the last few years has faced a tremendous loss of support from the open source community, particularly since it was bought by Microsoft. Things have gotten worse ever since – particularly over the last few months, there’s also been a massive increase in selective refusals of service and block-outs of researchers by GitHub.com, which found itself being used as a tool of the United States government in enforcing trade policy and sanctions against countries such as Cuba and Iran (see e.g. LinuxInsider, ZDNet).

Adding to that, we began to use Slack as a group communication platform back in 2017, but since the free service only allows access to a limited backlog of messages in their archive (see Slack Free Plan limitations), and Slack as a US-based business is bound by the same legal constraints that are enforced via GitHub.com, it quickly became clear that we would need an alternative solution that allows us to reclaim control of our data, because the limited archive means that we continue to lose older messages that will become irretrievable.

Now, since the corporate behaviour that has become obvious in the incidents described above could directly stifle the OSMOOC’s ability to include researchers from all regions in the MOOC’s development, let alone the moral questions and consequences raised by such Orwellian selective control of one government over science, research and scholarship without due legal process, we are now looking for alternative options.

What this amounts to is overt control over our data that even transcends what Budroni et al. have recently described as a world where

“80 % of [our] data are then collected and stored in the U.S., mainly on servers in the Silicon Valley or near Seattle [, … enabling] the analysts working outside Europe […] to track and understand habits, trends, learnings of communities […] better than the European governing bodies of the Member States or the European Commission.” (2019).

And what we are looking at is even worse, because the incidents described point to issues that are not only about tracking and analysis, but also about data security breaches, non-conformity with even basic privacy rights, and an enforcement of policies that none of us would ever have considered possible.

All of this, along with the personal experiences of many members of our steering committee, now has us on the lookout for options that would allow us to take project management and communications into our own hands. In order to stay true to the open source FLOSS perspective, we now want to set up a stack of web tools that are known to embody this spirit, and have those self-hosted somewhere.

The question that we have been asking ourselves, then, is:

Are we really alone with this need for openly-available infrastructures for training, education and scholarly communication that more closely stick to open principles?

My educated guess would be, we are not … Overall, more elegant and sensible solutions to these dilemmata ranging from non-profit, community-led approaches to other forms of scholar-led ventures have been outlined by many scholars since the dawn of the internet – let me just highlight one example from the not-so-recent scholarly history of thirteen years ago, title: “A Cooperative Publishing Model for Sustainable Scholarship“. Sounds familiar? That is because I think at large – and with notable exceptions that include great initiatives such as ScholarLed, osf.io, or the Open Library of Humanities in open publishing – it seems as if we’ve been running around in circles again and again, and ever since… and by doing so, particularly the field of training and education has been left to be dominated by corporate interests.

Because in the end, it does not really matter if you rely on GitHub.com or GitLab.com, or use Google Docs or Office365 to compile your output, and communicate via Slack, Trello, Twitter, Skype, Facebook, or any other corporate service. Since all of these applications are run on US-based servers, they are all prone to be used for purposes that run counter to one’s personal privacy or institutional legal frameworks, let alone that they become utilized as tools to enforce questionable national policies… Make sure to take a minute and click through the links provided above, and consider e.g. the sudden and massive loss of content when Microsoft decided to close down its DRM-locked ebook store, or the loss of troves of user account data that happen because of lax security measures and negligence, or due to darker, nefarious reasons.

Drawing from the OpenScienceMOOC’s experiences, what we have distilled is a publicly-available tool stack that should comprise open source training and scholarly communication tools and platforms such as:

GitLab Community Edition as a self-hosted code sharing and project management platform with issue tracker, wiki, Kanban board, and more;
Mattermost as an established open source alternative to Slack for scholarly communication in self-governed fora, and with the benefit of full control of one’s data;
Nextcloud for file sharing purposes to replace commercial services such as Google Drive or Dropbox;
Collabora Online or OnlyOffice integrated into Nextcloud as an alternative to the omnipresent Google Docs Suite that is still often perceived as the most versatile solution for easy collaboration;
Plus implementation of an internationally-recognized standard such as LRMI/schema.org (1, 2) for teaching and learning resources (OER), in order to make the open data and content we produce available in a FAIR way.
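To give a rough idea of what such LRMI/schema.org metadata could look like in practice, here is a minimal Python sketch that builds a JSON-LD description of a single open educational resource. It is only an illustration under stated assumptions: the helper name describe_oer, the example.org URL and all example values are hypothetical and not taken from the OpenScienceMOOC itself. The property names (learningResourceType, educationalUse, license, inLanguage) are schema.org terms, two of which originate from LRMI.

```python
import json


def describe_oer(name, url, license_url, language, resource_type, educational_use):
    """Return a schema.org/LRMI description of a learning resource as a dict.

    Hypothetical helper for illustration only.
    """
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": name,
        "url": url,
        "license": license_url,
        "inLanguage": language,
        # LRMI properties that were adopted into schema.org
        "learningResourceType": resource_type,
        "educationalUse": educational_use,
    }


# Hypothetical example values; replace them with the real resource's details.
record = describe_oer(
    name="Example module on open research software",
    url="https://example.org/oer/module-1",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    language="en",
    resource_type="course module",
    educational_use="self-study",
)

# Serialise as JSON-LD, e.g. for embedding in a web page that presents the resource.
print(json.dumps(record, indent=2))
```

Publishing a record like this alongside each resource is one possible way to make it findable by harvesters and search engines; which properties and vocabularies to use in earnest would be a decision for the community.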

Also, I think that for reasons of sustainability and general acceptance among the existing camps, factions and fields of open – let alone the even more complex national and field-specific delineations – it would be best if a supranational public entity such as the EU were to host such tools… and of course, this in itself is not an original idea, but has also been discussed in a variety of contexts by people way more knowledgeable and experienced than me 😉 (see e.g. Bosman & Kramer, 2017).

Soo, what do you think?

…because in the end, what we as scholars should have learned by now (it’s 2019, remember?) is that an altruistic provision of services is not in the best interest of corporate strategy. This is why we think it is crucial to highlight the need for further public-sector investment in the tools necessary to facilitate training, teaching and learning of and with Open Scholarship, so as to ensure that public access to knowledge in all its forms really has the chance of developing into an equitable process as proposed by the UN’s Sustainable Development Goals…

Postscriptum: The author is writing in a personal capacity. None of the above should be taken as the view or position of the author’s employers or other organisations.

References:

Bilder, G., Lin, J. and Neylon, C. (2015) Principles for Open Scholarly Infrastructure-v1, retrieved 2019/08/15, doi: 10.6084/m9.figshare.1314859
Bosman, J. & Kramer, B. (2017) The European Open Science Cloud as a commons? retrieved 2019/08/15, doi: 10.6084/m9.figshare.5537899.v1
Budroni, P., Burgelman, J.-C. & Schouppe, M. (2019) Architectures of Knowledge: The European Open Science Cloud. ABI Technik, 39(2), pp. 130-141. retrieved 2019/08/15, doi: 10.1515/abitech-2019-2006
Campbell, L. M., and Barker, P. (2014) LRMI Implementation: Overview Issues and Experiences. Bolton, UK: Cetis, 2014. PDF.
McGonagle‐O’Connell, A. and Ratan, K. (2019) Can we transform scholarly communication with open source and community‐owned infrastructure? Learned Publishing, 32: 75-78. retrieved 2019/08/15, doi: 10.1002/leap.1215
Pooley, J. (2018) Scholarly communications shouldn’t just be open, but non-profit too. LSE Impact blog, retrieved 2019/08/15.
Schroeder, R. and Siegel, G. E. (2006) A Cooperative Publishing Model for Sustainable Scholarship. Library Faculty Publications and Presentations. 66. doi: 10.1353/scp.2006.0006, preprint