Strength in numbers: overcoming challenges in sharing data across universities

by Lucy Clague, Senior Research Fellow, Sheffield Hallam University

The five YCEDE partners are committed to sharing data for evaluative purposes in order to evidence programme impact. Data sharing across multiple institutions can be a complex process, particularly in relation to PGR data, which historically has not been collected by institutions in the same way as undergraduate data. During the first year of programme delivery the YCEDE evaluation team have been running a pilot project to explore how easy it will be to collect categories of individual PGR applicant data across the programme.

Before sharing data, it’s important that partners understand the General Data Protection Regulation (GDPR), particularly in relation to data permissions. GDPR is a regulation that gives individuals in the European Union more control over their personal data; in the UK it is retained in domestic law as the UK GDPR, which sits alongside the Data Protection Act 2018. It applies to all organisations that handle personal data, including universities. The regulation requires that personal data is only processed where there is a lawful basis, such as consent from the individual or one of the other specific conditions it sets out. Organisations must also ensure that individuals can exercise their rights to access and correct their data.

This blog outlines five issues that the YCEDE team have had to consider before the sharing of applicant data between institutions can commence.

1. Data Sharing Agreement

In a programme where multiple universities are sharing student data, it is essential to have a data sharing agreement in place to protect student privacy and to comply with relevant laws and regulations. Without such an agreement, there is a risk of sensitive student information being used in ways that are not authorised or ethical. A data sharing agreement can help mitigate these risks by setting out clear guidelines for how data should be collected, stored, and shared, as well as who has access to it and under what conditions.

It is important that partners sign up to the programme data sharing agreement and understand its iterative nature. YCEDE is an evolving project that involves continuous and incremental delivery, which leads to necessary adjustments throughout its lifetime. As the project progresses, new data may be collected, or the focus may shift. Therefore, the data sharing agreement may need to be reviewed, updated, and re-signed periodically to ensure that it remains relevant and accurate.

2. Data Permissions

Each institution in the YCEDE partnership may have different policies and procedures when it comes to data permissions. There is no single approach to how each institution will gain permissions from PGR applicants: some may be happy to include a statement in their applicant privacy notice, whilst others may want to gain opt-in consent from each individual applicant.

It is important for each institution to consider its own approach to permissions when sharing data with a partner institution, and to ensure that it is complying with that approach before any data is transferred externally.

3. Complexities of Data Extraction

Data extraction is a complex issue that needs to be considered. Data may be stored in different formats, in multiple datasets, and across several teams. Additionally, at postgraduate level, data may be collected less rigorously than at undergraduate level because there are fewer mandatory requirements from the Higher Education Statistics Agency (HESA). This can result in incomplete or missing data, which can compromise the accuracy of the analysis and make it difficult to gain a comprehensive understanding.

Data may also need to be transformed or cleaned before it can be shared. This can be a time-consuming process but is necessary to ensure that the data is accurate and can be used effectively for research purposes.
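
As an illustration of the kind of cleaning described above, here is a minimal Python sketch. It assumes each partner's extract has already been loaded as a list of records, and that blank cells or labels such as "N/A" are how missing values show up. The field names, the placeholder labels and the sample values are all invented for this example; they are not taken from the YCEDE data specification.

```python
from collections import Counter

# Assumption: these are the placeholder strings partners use for "no data".
MISSING_MARKERS = {"", "n/a", "na", "unknown", "not known"}


def clean_value(value):
    """Trim whitespace and turn placeholder strings into None."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in MISSING_MARKERS else text


def clean_records(records):
    """Return a cleaned copy of a list of dict records."""
    return [
        {field: clean_value(value) for field, value in record.items()}
        for record in records
    ]


def completeness(records, fields):
    """Share of records with a usable value, per field (0.0 to 1.0)."""
    if not records:
        return {field: 0.0 for field in fields}
    filled = Counter()
    for record in records:
        for field in fields:
            if record.get(field) is not None:
                filled[field] += 1
    return {field: filled[field] / len(records) for field in fields}


# Hypothetical extract from one partner (invented values, not real data).
raw = [
    {"applicant_id": "A1", "fee_status": " Home ", "ethnicity": "N/A"},
    {"applicant_id": "A2", "fee_status": "", "ethnicity": "Not known"},
]
cleaned = clean_records(raw)
report = completeness(cleaned, ["applicant_id", "fee_status", "ethnicity"])
```

A completeness report like this could be run by each partner before anything is transferred, so that fields with very low coverage are spotted early rather than at the analysis stage.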

It’s important for the evaluators to match as many of the requested data fields as possible to the commonly used HESA codes. Where this cannot be achieved and partners are unable to extract a field in any meaningful way, the evaluators should be flexible and refine their evaluation data collection tools.
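
One way to picture this matching exercise is as a lookup from each partner's local column names to a shared set of requested fields. The Python sketch below is illustrative only: the partner labels, local column names and shared field names are hypothetical and are not actual HESA codes.

```python
# Hypothetical mapping from each partner's local column names to the shared
# field names requested by the evaluators (invented, not real HESA codes).
FIELD_MAP = {
    "partner_a": {"StudentRef": "applicant_id", "Fees": "fee_status"},
    "partner_b": {
        "app_no": "applicant_id",
        "fee_cat": "fee_status",
        "disab": "disability",
    },
}

# Assumption: the shared list of fields the evaluation team has asked for.
REQUESTED_FIELDS = ["applicant_id", "fee_status", "disability"]


def harmonise(partner, record):
    """Rename a partner record's keys to the shared field names.

    Local columns with no mapping are dropped; requested fields that the
    partner cannot supply are set to None.
    """
    mapping = FIELD_MAP[partner]
    renamed = {mapping[key]: value for key, value in record.items() if key in mapping}
    return {field: renamed.get(field) for field in REQUESTED_FIELDS}


def unavailable_fields(partner):
    """List the requested fields that a partner has no local column for."""
    supplied = set(FIELD_MAP[partner].values())
    return [field for field in REQUESTED_FIELDS if field not in supplied]


example = harmonise("partner_a", {"StudentRef": "A1", "Fees": "Home", "Notes": "x"})
gaps = unavailable_fields("partner_a")
```

Listing the gaps per partner gives the evaluators a concrete basis for deciding which requested fields to keep, redefine or drop from the data collection tools.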

Thus, careful consideration and planning are required to ensure that the data extraction process is thorough and accurate, in order to produce meaningful insights.

4. Data Breach Risks

Data breaches are a significant risk when sharing data outside of an institution. It’s essential to ensure that the data is stored securely and that appropriate measures, such as data encryption and secure transfer protocols, are used to protect it during transfer.

In addition, it’s important to consider the potential impact of a data breach. The consequences of a data breach can be severe, including financial penalties, reputational damage, and loss of trust.

5. Institutional Cross-Team Discussions

To ensure that data sharing is done effectively and safely, there need to be discussions between programme leads, data teams, and data protection teams. These discussions should address the legal and ethical considerations of data sharing, such as ensuring that the data is being used for research purposes, that appropriate permissions have been obtained from individuals, and that the institution is adhering to the data sharing agreement.

Discussions should also cover data extraction and whether the requested data will be reliable and accessible, the risks of data breaches, and whether all parties involved understand their roles and responsibilities.

In conclusion, data sharing across the YCEDE partnership presents significant challenges that have needed to be considered to ensure that the data is shared effectively and safely, and that it enables the evaluation team to evidence impact. Historically it has been difficult to compare PGR data across the sector, but we hope that our cross-partnership work will improve data collection processes going forward in ways that benefit the sector as a whole.