Date: November 4, 2021
OASPA is pleased to announce our next webinar, which will focus on the interface between openness and privacy. Open Access is part of a large movement in scholarly research toward openness, often captured in terms such as Open Science, Open Licenses, Open Data, and Open Metadata. The rights of researchers to make their work open are also being championed by funders, such as cOAlition S, using the Rights Retention Strategy. At the same time, in response to the ongoing intrusion of corporate and state interests into people’s private lives, the emphasis on (data) privacy has become more pronounced, for example with the advent of the EU’s General Data Protection Regulation.
In this OASPA webinar, we intend to discuss some aspects of this tension between openness and privacy: How to guarantee the privacy of researchers in the face of increasing research metrics gathering? How to deal with privacy questions with regard to large open data sets? How to balance legal privacy protections with cybersecurity concerns? How to balance openness with copyright and intellectual property rights? And what is the role of funders, publishers, universities and libraries in balancing the need for privacy with the demand for openness, and long-term preservation of that openness?
The webinar will be chaired by Vincent W.J. van Gerven Oei.
We welcome our speakers: Chris Bulock (University Library of California State University Northridge, USA), Farzaneh Badiei (Digital Medusa, USA), René Mahieu (Vrije Universiteit Brussel, Belgium), and Molly van Houweling (University of California, Berkeley, USA).
The panellists will each speak for 6 minutes, and then we will open it up to questions from the audience and have an extended discussion.
Speaker Key Takeaways
Molly van Houweling
- Open licenses like those promulgated by Creative Commons were designed to overcome unnecessary obstacles to sharing works of knowledge and creativity.
- Twenty years later, there are new and different obstacles to sharing that loom as large as or larger than the automatic “all rights reserved” copyright system that prompted the founding of CC.
- These obstacles include anxiety that personal data (including, for example, recognizable images of individuals) will be used in unanticipated and unwanted ways.
Chris Bulock
- In theory, OA publications would better protect reader privacy, but in practice, the large number of intermediaries in the research workflow and the tracking practices of publishers do not guarantee this.
- When OA publications are presented in library systems alongside subscription content, there are many ways that readers are subject to tracking.
- Library systems often default to subscription access, presenting links that require authentication and identification even when open copies are available.
René Mahieu
- In research involving personal data, there is a genuine conflict between the value of open access to data and the value of protecting personal data (privacy).
- When institutions develop policies to drive researchers to provide more open access to their research data, they should acknowledge this tension between open data and privacy, and acknowledge that it often cannot be easily resolved.
- Best practices need to be developed to balance these two values, taking into account the specificity of various types of research data. This will require practical research, learning from existing practices.
Farzaneh Badiei
- We need to discuss openness and privacy at a granular level but in a more holistic way. For example, what data are we talking about when we want openness? Are there ways to redact the data but still undertake the research? In the cybersecurity case, privacy scholars shouldn’t talk only among themselves while security scholars feel helpless about access to data. We also need to be able to measure objectively the consequences of privacy violations against the benefits of research or openness of data. One question is how much that kind of data contributes to the results (or, in GDPR terms, how to balance the public interest against fundamental rights). As researchers, academics and librarians, we should be able to do the risk assessment and take initiatives.
- Sometimes it is possible to undertake research using alternative sources. We need more scholarship that references creative ways to build databases from publicly available data, but in an ethical way.
- Tiered access is also an issue: we need disclosure processes for academics and researchers, and these processes should be efficient, less expensive, and multistakeholder. GDPR has made this legal, but in our community (for example, in the case of WHOIS) researchers and others need to come up with a disclosure model.
Responses to unanswered attendee questions
Q: Are there models or best practices for an organization that wants to make organizational data gathered from individuals (such as member surveys) available as open data?
RM: I am not aware of such best practices, and would love to hear if some are found. This case is exactly the type of thing my talk was about. In general (although it depends a lot on the specific data set), re-identifying data (i.e. connecting it back to a specific person) is often easier than you would expect at first sight (see, for example: Narayanan, A., & Felten, E. W. (2014). No silver bullet: De-identification still doesn’t work. White Paper, Jul. http://randomwalker.info/).
Q: To what extent should institutions/libraries be stewards of their researchers’ data exhaust (in terms of what resources they accessed, when, where, etc.)? Should they delete this data, or give it away in hopes of better services?
CB: I think that, especially to the extent that institutions and particularly libraries are signing contracts for services and products that are part of the research workflow, they have a responsibility to ensure the privacy of the communities they serve. In the print era, most research happened in the library and through use of physical materials the library owned, but now researchers find and read material on online platforms that the library neither owns nor controls. At the very least, institutions need to be transparent about the services they’re subjecting users to, and the ways in which those services use personal data.
Resources shared in chat
- How to enable/disable privacy protection in Google Analytics (it’s easy to get wrong!)
- Authors Alliance resources (for authors interested in managing their rights and ensuring access to their works)
Chris Bulock (@chrisbulock)
Chris Bulock is the Chair of Collection Access and Management Services at the University Library of California State University Northridge. Chris is also the outgoing co-chair of the California State University system’s Shared Resources and Digital Content committee and edits the column Open Dialog in the journal Serials Review. His research focuses on how libraries provide access to open access resources through their search and discovery systems.
Farzaneh Badiei (@farzanehbad)
Dr. Farzaneh Badiei is the founder of Digital Medusa, an initiative that focuses on protecting the core values of our global digital space with sound governance. For the past decade, Dr. Badiei has directed and led projects about Internet and social media governance. She has undertaken research at Yale Law School, Georgia Institute of Technology and the Humboldt Institute for Internet and Society in Berlin. Her focus is on governance issues related to Internet infrastructure and social media platforms. Dr. Badiei received her PhD in law from the University of Hamburg, Institute of Law and Economics. Her dissertation focused on online private justice systems, institutional design, and online market intermediaries. Between 2011 and 2014, Dr. Badiei worked at the United Nations Internet Governance Forum Secretariat.
René Mahieu (@ReneLPMahieu)
René Mahieu is a doctoral researcher at the Research Group on Law, Science, Technology and Society of the Vrije Universiteit Brussel (VUB), working on data protection legislation. The main research question of his thesis is: How and to what extent does the exercise of the right of access to personal data meet its objectives in practice? In this research he combines methods of legal scholarship and political philosophy with empirical approaches.
Molly van Houweling (@mollysvh)
Molly Shaffer Van Houweling is the Harold C. Hohbach Distinguished Professor of Patent Law and Intellectual Property at the University of California, Berkeley, where she also serves as the Associate Dean for J.D. Curriculum and Teaching and as a Faculty Director of the Berkeley Center for Law & Technology. She is Chair of the Board of Directors of Creative Commons, a founding board member of Authors Alliance, and an Associate Reporter for the American Law Institute’s Restatement of the Law of Copyright.
Chair: Vincent W.J. van Gerven Oei (@ontakragoueke)
Note that previous OASPA webinar details and recordings can be found here.