4 Legal, Data Protection & Compliance
The content in this chapter is under review: Claude Code was used to convert the original PowerPoint slides to this webpage, so the text may be incomplete, inaccurate, or in need of significant editing before use.
Using generative AI in research is not only a matter of ethics and integrity; it also has legal dimensions that researchers must navigate carefully. This chapter covers the key legal frameworks affecting AI use in research, including data protection law, intellectual property, and institutional compliance requirements.
By the end of this chapter you will be able to:
- Explain the key provisions of GDPR relevant to using AI with research data
- Identify intellectual property risks associated with AI-generated content
- Assess whether a given research workflow involving AI is likely to be legally compliant
- Locate and apply your institution’s policies on AI use and data handling
- Describe categories of data that must never be entered into public AI systems
4.1 Data Protection and GDPR
- Core GDPR principles: lawfulness, purpose limitation, data minimisation
- What counts as personal data — and what becomes personal through AI processing
- Using AI tools with participant data: risks and safeguards
- Data processing agreements and when they are required
- Cross-border data transfer issues with cloud-based AI services
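The data-minimisation principle in the list above can be made concrete as a pre-processing step: strip direct identifiers from records before any text reaches an external AI service. A minimal Python sketch (the field names are hypothetical; note that removing direct identifiers is minimisation, not anonymisation, since remaining quasi-identifiers may still allow re-identification):

```python
# Illustrative sketch only: drop direct-identifier fields from a survey
# record before sending it to an external AI API. Field names are
# hypothetical. This implements data minimisation, not anonymisation:
# remaining fields (age band, free text) may still re-identify someone.

DIRECT_IDENTIFIERS = {"name", "email", "phone", "student_id"}

def minimise(record: dict) -> dict:
    """Return a copy of the record without direct-identifier fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

record = {
    "name": "A. Example",
    "email": "a@example.org",
    "age_band": "25-34",
    "response": "I found the course useful.",
}
print(minimise(record))  # only age_band and response remain
```

A step like this reduces, but does not remove, GDPR obligations: the minimised record may still be personal data, so the lawful-basis and transfer questions above still apply.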
4.2 Sensitive Data Categories
- Special categories under GDPR (health, ethnicity, religion, political opinions, etc.)
- Research data that is sensitive for reasons beyond GDPR (national security, commercial confidentiality)
- Practical rules: what must never be entered into public AI tools
- Anonymisation vs. pseudonymisation — what actually protects you
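The distinction in the last bullet can be illustrated in code. Pseudonymisation replaces identifiers with re-linkable tokens (here, a keyed HMAC); because the key holder can re-link pseudonyms to participants, pseudonymised data remains personal data under GDPR, whereas true anonymisation is irreversible. A hedged Python sketch (the key handling shown is illustrative only, not a security recommendation):

```python
import hashlib
import hmac

# Illustrative pseudonymisation sketch. The secret key must be stored
# separately from the data; anyone holding it can re-link pseudonyms to
# participants, which is why pseudonymised data is still personal data
# under GDPR, unlike truly anonymised data.
SECRET_KEY = b"replace-with-a-securely-stored-key"  # hypothetical key

def pseudonymise(participant_id: str) -> str:
    """Deterministically map an identifier to a short pseudonym."""
    digest = hmac.new(SECRET_KEY, participant_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

# The same input always maps to the same pseudonym, so records can be
# linked across files without exposing the underlying identity.
assert pseudonymise("participant-017") == pseudonymise("participant-017")
```

Because the mapping is deterministic under a given key, destroying the key (and any other linkage material) is what moves a dataset toward anonymisation; keeping it means the data stays within GDPR's scope.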
4.3 Intellectual Property and Copyright
- Who owns AI-generated content? Current legal landscape
- Copyright in training data: ongoing litigation and implications for researchers
- IP ownership of AI outputs in employment and funded research contexts
- Risks of reproducing copyrighted material via AI-generated text
4.4 Institutional Policies and Compliance
- Why institutional policies differ from legal minimums
- How to find and interpret your institution’s AI and data governance policies
- Ethics committee considerations when AI is part of a research protocol
- Contractual obligations in grant agreements and research collaborations
Discussion Questions
- A researcher wants to use an AI chatbot to help analyse qualitative interview transcripts. What legal and ethical questions should they ask before doing so?
- If an AI tool generates a figure or a piece of text that closely resembles a copyrighted work, who is liable?
- How should institutional policies on AI use be communicated and enforced — and who should be involved in writing them?
- Do you think current data protection laws are adequate for governing AI in research? What gaps do you see?
- Have you encountered a situation where you were unsure whether a research activity was legally compliant? What did you do?
4.5 Practical Exercises
4.5.1 Exercise 1 — A GDPR-compliant tool in practice
Tool: lumo.proton.me (free, GDPR-compliant, zero-access encryption)
Read Lumo’s privacy notice carefully. List three specific data protection features it offers (e.g., no logging, local encryption, Swiss legal jurisdiction). For each feature, explain why it would — or would not — be sufficient to make it legally appropriate for processing research participant data under GDPR. Compare your list with a peer’s and discuss where you disagree.
4.5.2 Exercise 2 — Advising on a sensitive data scenario
Tool: duck.ai (free, private)
Describe a fictional research scenario to the AI: “I have survey responses from 80 participants about their mental health history, stored as a spreadsheet. I want to use an AI tool to identify themes. What legal and ethical steps must I take before doing so?” Evaluate the advice. Does the AI mention GDPR Article 9 (special category data)? Does it recommend a data processing agreement? Cross-check its advice against the plain-language summary on the ICO website (ico.org.uk).
4.5.3 Exercise 3 — Comparing legal reasoning across models
Tool: lmarena.ai (free, battle mode)
Submit in battle mode: “Does GDPR apply when a researcher based in Germany uses a US-hosted AI tool to process anonymised interview transcripts from EU participants?” Vote for the response with more rigorous legal reasoning. After voting, look up the EDPB guidelines on international data transfers (freely available at edpb.europa.eu). Which model came closer to the correct answer?
4.6 References
- Congressional Research Service. (2023). Generative AI and Data Privacy: A Primer. congress.gov/crs-reports
- Choudhury, A., et al. (2025). Generative AI guidelines at top 100 QS-ranked universities: A comparative study. arXiv:2506.20463. arxiv.org/abs/2506.20463
- The Turing Way Community. The Turing Way: A Handbook for Reproducible, Ethical and Collaborative Research. CC BY 4.0. book.the-turing-way.org
- European Data Protection Board. (2024). Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models. edpb.europa.eu
- European Parliament and Council. (2024). Regulation (EU) 2024/1689 (Artificial Intelligence Act). Official Journal of the European Union. eur-lex.europa.eu