3 Research Integrity, Authorship & Policies
The rise of generative AI has prompted rapid changes to the norms and policies governing academic authorship, publication, and research conduct. This chapter examines what research integrity means in an AI-assisted environment, surveys the obligations set out in the major codes of conduct, and helps researchers understand their responsibilities around disclosure, authorship, and the limits of responsible AI use.
By the end of this chapter you will be able to:
- Explain the key principles of the ALLEA European Code of Conduct for Research Integrity and the Finnish Code of Conduct for Research Integrity
- Distinguish between plagiarism (a research integrity violation) and intellectual property rights infringement (a legal violation), and explain why this distinction matters for AI use
- Identify the full range of research integrity violations and questionable research practices, including those specific to AI
- Describe the role-differentiated responsibilities that developers and researchers bear with respect to AI
- Explain why generative AI systems present novel risks for fabrication, falsification, and plagiarism
- Apply current authorship norms to determine when and how AI use must be disclosed
3.1 The Codes of Research Integrity
Research integrity in Europe is governed primarily by two documents: the ALLEA European Code of Conduct for Research Integrity (2023) and, in Finland, the Finnish Code of Conduct for Research Integrity (2023). Although they differ in jurisdiction, both codes articulate a shared understanding of what honest, reliable, and accountable research practice requires.
The ALLEA Code identifies four core principles that underpin research integrity:
- Reliability — in ensuring the quality of research, reflected in the design, methodology, analysis, and use of resources
- Honesty — in developing, undertaking, reviewing, reporting, and communicating research in a transparent, fair, full, and unbiased way
- Respect — for colleagues, research participants, society, ecosystems, cultural heritage, and the environment
- Accountability — for the research from its conception to its publication, for its management and organisation, for training and supervision, and for its broader impacts
These four pillars define the standard against which researchers’ conduct is measured. Together, they describe not just what researchers must do, but what they must be: honest agents whose work can be trusted by the institutions and publics that rely on it.
3.2 Plagiarism, Intellectual Property, and the Ethics–Law Distinction
A particularly important distinction from the ALLEA Code is the difference between plagiarism and intellectual property rights (IPR) infringement. These two concepts are often confused, but they fall under different regulatory and ethical regimes:
- Plagiarism — the unacknowledged use of another person’s work — is a violation of research ethics. It is a matter of honesty: presenting someone else’s ideas, text, or results as your own without attribution is fundamentally dishonest, regardless of whether any legal copyright is involved.
- IPR infringement — the unauthorised use of another person’s work — is a violation of law. Copyright, licensing terms, and intellectual property rights are legal constructs, and their breach is treated as a legal matter rather than (only) an ethical one.
This distinction becomes particularly relevant with generative AI. Large language models are trained on vast corpora of text and can reproduce that text — sometimes verbatim — without citing or acknowledging the original sources. When a researcher uses AI-generated text that reproduces copyrighted material without attribution, they may simultaneously be violating both research integrity (plagiarism) and intellectual property law (IPR infringement). Understanding this dual exposure is essential for anyone incorporating AI outputs into their work.
3.3 Violations of Research Integrity
Research misconduct and questionable research practices (QRPs) take many forms, ranging from deliberate fabrication and falsification to more ambiguous failures of transparency. The most serious violations are:
- Fabrication — inventing data, results, or other outputs that do not exist
- Falsification — manipulating research materials, equipment, data, or results so that the research record misrepresents reality
- Plagiarism — presenting another person’s work, ideas, or text as one’s own, without appropriate acknowledgement
Beyond these cardinal violations, the ALLEA Code also identifies a range of questionable research practices that, while less clear-cut, are nonetheless unacceptable in professional research:
- Undisclosed conflicts of interest
- Misusing seniority to encourage violations or to advance one’s own career
- Delaying or hampering the work of others — for example, by acting as a malicious peer reviewer
- Misusing statistics (selective reporting, inappropriate tests, inflated effect sizes)
- Hiding the use of AI in the research process
- Withholding data or results without justification
- Chopping up results artificially to inflate publication count (salami slicing)
- Selective or inaccurate citing; expanding citations to please editors, reviewers, or colleagues
- Self-plagiarism
- Manipulating authorship — guest authorship and ghost authorship
- Establishing or supporting predatory journals and reviewer cartels
- Misrepresenting achievements (CV inflation)
- Falsely accusing others of misconduct
- Ignoring research integrity violations when they are observed
The explicit inclusion of hiding AI use in this list is significant. It reflects an emerging consensus that transparency about AI assistance is not merely good practice, but a requirement of research integrity — as binding as the prohibition on hiding data manipulations or omitting relevant citations.
3.4 Different Roles, Different Responsibilities
Generative AI does not affect all participants in the research ecosystem equally. The ALLEA Code’s framework of responsibilities applies differently depending on whether one is a developer of AI systems or a researcher and educator who uses them.

3.4.1 Developers of AI Systems
Those who build AI systems bear a different set of obligations:
- Ethics: AI ethics and responsible AI principles must be considered throughout the entire AI system lifecycle — covering Sustainability, Safety, Accountability, Fairness, Explainability, and Data Stewardship (the SSAFE-D framework described in Chapter 1)
- Law: Compliance with the EU AI Act applies when the system poses risks and is not developed solely for research purposes or personal use; GDPR and other data regulations also apply throughout the development pipeline
- Technical robustness: Developers are responsible for ensuring the correct functioning of the AI system. Crucially, the largest AI models are so computationally demanding that they cannot practically be run locally: they are hosted in the cloud, which means users are operating on someone else’s infrastructure, with all the data sovereignty implications that entails
3.4.2 Researchers, Teachers, and Students
Those who use AI in research and education bear a complementary set of obligations, expressed through the Truth–Trust–Competence–Compliance framework:
- Truth: The quality of research and education, as reflected in design, methodology, analysis, and use of resources, must not be compromised by AI-generated errors, hallucinations, or unverified claims
- Trust: Work must be conducted in a transparent, fair, full, and unbiased way; undisclosed AI assistance undermines the trust that peers, institutions, and the public place in research
- Competence: Researchers must preserve and develop their own skills, judgement, and expertise — AI tools that substitute for rather than augment human critical thinking erode competence over time
- Compliance: Adherence to legal, ethical, and institutional standards remains the researcher’s responsibility, regardless of what an AI tool does or recommends
This role-differentiated view of responsibility is important because it resists two opposite errors: holding individual researchers solely accountable for the ethical failures of AI systems they did not design, and allowing researchers to evade responsibility simply because “the AI did it”.
3.5 Generative AI as a Misconduct Machine
A striking feature of current generative AI systems is that their technical design aligns almost perfectly with the conditions for research misconduct. Large language models are trained to produce fluent, confident-sounding text and are optimised to satisfy users — not to be accurate, not to cite sources, and not to acknowledge uncertainty. As a result:
- Plagiarism risk: Generative AI language models can reproduce text from their training data verbatim without citing the original source. A researcher who uses this output without verification may unknowingly plagiarise published work (a minimal overlap check is sketched after this list).
- Fabrication and falsification risk: AI tools can be used for data synthesis and data analysis — and they can be used, willingly or not, to fabricate data or falsify results. The now-famous question “how many rs are in strawberry?” illustrates that AI systems fail on tasks where precise factual accuracy is required, yet present their errors with the same confident tone as correct outputs. In data analysis contexts, this poses serious risks.
- Security and data protection risks: Uploading sensitive research data, confidential participant information, or unpublished manuscripts to cloud-based AI systems exposes them to data protection risks. The confidentiality and integrity of your data, your subjects’ data, and your intellectual property must be protected.
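To make the plagiarism risk concrete, the sketch below flags verbatim word n-gram overlap between an AI-generated passage and candidate source texts. This is a minimal illustration under stated assumptions, not a real plagiarism detector: the helper names, the 8-word window, and the placeholder strings are all hypothetical.

```python
# Minimal sketch: flag verbatim n-gram overlap between AI output and
# known source texts. The 8-word window and all names are illustrative.

def ngrams(text: str, n: int = 8) -> set:
    """Return the set of lower-cased word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def verbatim_overlap(ai_text: str, sources: dict, n: int = 8) -> dict:
    """Count n-grams shared between the AI output and each source text."""
    ai_grams = ngrams(ai_text, n)
    return {name: len(ai_grams & ngrams(body, n)) for name, body in sources.items()}

ai_text = "..."                  # the passage the model produced (placeholder)
sources = {"smith_2021": "..."}  # texts it may have reproduced (placeholder)
for name, hits in verbatim_overlap(ai_text, sources).items():
    if hits:
        print(f"{name}: {hits} shared 8-grams; inspect and attribute before use")
```

Any source that shares long verbatim runs with the output warrants manual inspection and, if the text is kept, explicit attribution.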
Aalto University’s guidelines on the Responsible Use of Artificial Intelligence in the Research Process address these risks directly and should be consulted before using any AI tool in a research context.
3.6 Disclosure Obligations and the Transparency Dilemma
Because the ALLEA Code explicitly lists hiding AI use as a research integrity violation, researchers must disclose their AI use — but the question of how and where to draw the line is genuinely difficult.
Several concrete questions illustrate the complexity:
Where is the line between acceptable use and required disclosure? The publisher Elsevier provides a useful example: using AI for spell-checking, grammar correction, and punctuation does not require disclosure. Any other use — including restructuring sentences, summarising literature, drafting sections, or generating code — does require disclosure.
Should you trust AI-generated citations and summaries? Research suggests that large language models are approximately five times more likely to overgeneralise and misinterpret scientific literature than a human researcher working from the original sources. AI-generated citation lists should never be used without verification against the original sources.
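One practical verification step is to check whether each citation the model produced actually exists in a bibliographic registry. The sketch below queries the public Crossref REST API by DOI; it assumes the `requests` library and network access, and the DOIs shown are placeholders, not real references.

```python
# Minimal sketch: check AI-supplied DOIs against the public Crossref API.
# Assumes the `requests` library; the DOIs below are hypothetical placeholders.
import requests

def check_doi(doi: str):
    """Return the registered title for a DOI, or None if Crossref has no record."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code != 200:
        return None
    titles = resp.json()["message"].get("title", [])
    return titles[0] if titles else ""

for doi in ["10.1234/placeholder.one", "10.1234/placeholder.two"]:
    title = check_doi(doi)
    if title is None:
        print(f"{doi}: not found in Crossref; possibly hallucinated")
    else:
        print(f"{doi}: {title!r}; now verify the summary against the paper itself")
```

A resolving DOI only shows that the reference exists; it says nothing about whether the model summarised the work accurately, so reading the original source remains necessary.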
Should AI-generated code be trusted? AI tools can generate code that is syntactically correct but logically flawed, and that introduces security vulnerabilities. Cybersecurity risks of AI-generated code are an active concern in software engineering research.
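As a hypothetical illustration of "syntactically correct but logically flawed", the snippet below contrasts a query pattern that generated code often produces, string interpolation into SQL (an injection vulnerability), with the safe parameterised form. All names and values are made up.

```python
# Hypothetical illustration: a syntactically valid but unsafe query pattern
# versus the parameterised form. All table names and values are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subjects (id INTEGER, name TEXT)")
conn.execute("INSERT INTO subjects VALUES (1, 'Alice')")

user_input = "Alice' OR '1'='1"  # attacker-controlled value

# Unsafe: interpolated input becomes part of the SQL and rewrites its logic.
unsafe = conn.execute(
    f"SELECT * FROM subjects WHERE name = '{user_input}'"
).fetchall()
print("unsafe query returned:", unsafe)  # every row leaks

# Safe: a parameterised query treats the input strictly as data.
safe = conn.execute(
    "SELECT * FROM subjects WHERE name = ?", (user_input,)
).fetchall()
print("safe query returned:", safe)      # no rows match

conn.close()
```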
Should AI be used for peer review? In some fields, studies have found that up to 17% of peer reviews show evidence of AI involvement. The practice is controversial and raises serious concerns about confidentiality and the integrity of the review process.
What about AI detection tools? Studies suggest that approximately 9% of researchers self-report using AI in their writing, while detection-based estimates put the actual figure closer to 36%. AI detectors are unreliable and raise questions about fairness. Paradoxically, research has also found that researchers who do disclose AI use are trusted less by readers — an uncomfortable finding that suggests transparency norms may take time to stabilise.
3.7 The Verification Imperative
A clear guiding principle emerges from these considerations:
If you do not have the ability or the resources to verify the output of a generative AI system — by finding the relevant citations for the synthesised text, by verifying that others’ work is summarised correctly, by detecting plagiarised text or code, by verifying the analysis code — then you should not use the output of that AI system beyond self-learning or brainstorming.
This principle places the burden of verification firmly on the researcher. AI tools do not verify themselves. They do not know what they do not know. The researcher is the last line of quality control, and that role cannot be delegated to the tool.
3.8 The Ethics Paradox and Responsible Use
A provocative but intellectually honest position on current generative AI tools can be stated as follows:
It is impossible to ethically use these tools, because they are not ethical in any respect.
This is not a frivolous claim. As demonstrated in Chapter 1, major proprietary AI systems currently fail all six dimensions of the SSAFE-D responsible AI framework. Their training data is opaque, their environmental costs are substantial, their developers face no meaningful regulatory accountability, and there is substantial evidence of bias and copyright infringement in their development.
If one accepts the strictest ethical standards, the conclusion follows: one should not use or promote these tools.
Yet the real world poses a series of ethical compromises that most people navigate every day. We drive petrol-powered cars knowing their environmental costs. We consume products made in conditions that exploit labour. We use smartphones manufactured under supply chains that harm ecosystems. Our taxes fund institutions that cause harm. This does not make those choices right — but it does reflect the reality that individuals operate within imperfect systems not of their own making.
The constructive response is not paralysis but responsibility: by using AI tools with a clear understanding of their limitations, their costs, and the conditions under which they generate unreliable or harmful outputs, researchers can identify use cases where they add genuine value to their work, to the people they interact with, and to society at large — while refusing to use them in contexts where the risks outweigh the benefits, and advocating for the systemic changes that would make ethical AI possible.
3.9 Discussion Questions
- The ALLEA Code lists hiding AI use as a research integrity violation. Yet research also shows that disclosing AI use reduces readers’ trust in the researcher. How should individuals navigate this tension? Does transparency always serve research integrity?
- Apply the Truth–Trust–Competence–Compliance framework to a specific AI use case from your field. Which of the four dimensions raises the most concern, and why?
- Elsevier distinguishes between spell-checking (no disclosure required) and all other AI use (disclosure required). Is this a sensible line? Can you think of edge cases where the distinction breaks down?
- The “verification imperative” places full responsibility for checking AI outputs on the researcher. Is this a realistic standard given the volume of text AI can generate? What institutional or technical supports would make this more feasible?
- The chapter argues that it is “impossible to ethically use” current AI tools, yet concludes that responsible use is still possible. Is this position coherent? How does it compare with other ethical compromises researchers make in their everyday practice?
3.10 Practical Exercises
3.10.1 Exercise 1 — Writing a disclosure statement
Tool: duck.ai (free, private)
Imagine you used an AI tool to help draft the Discussion section of a paper, then revised it substantially. Ask the AI to help you write a disclosure statement for the manuscript. Evaluate the result: is it specific enough? Does it accurately reflect what you (hypothetically) did? Now consider: the disclosure statement itself was AI-assisted — should that be disclosed too? Revise the statement accordingly.
3.10.2 Exercise 2 — Applying the verification imperative
Tool: lumo.proton.me (free, GDPR-compliant)
Ask Lumo to summarise three recent studies on a topic in your own research area. For each study it mentions, attempt to locate the original source. How many can you verify? How many appear to be hallucinated or misattributed? Reflect on what this exercise tells you about the conditions under which AI-generated literature summaries can and cannot be trusted.
3.11 References
- ALLEA. (2023). The European Code of Conduct for Research Integrity (Revised ed.). ALLEA. allea.org
- Finnish National Board on Research Integrity TENK. (2023). The Finnish Code of Conduct for Research Integrity. tenk.fi
- Glerean, E., & Silva, P. (2025). Generative AI in Research Work — Course slides. Zenodo. doi.org/10.5281/zenodo.14032261 (CC-BY)
- Aalto University. Responsible use of Artificial Intelligence in the research process. aalto.fi
- COPE (Committee on Publication Ethics). (2023). Authorship and AI tools. COPE position statement. publicationethics.org
- ICMJE. (2024). Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals. icmje.org
- Elsevier. The use of AI and AI-assisted technologies in writing for Elsevier. elsevier.com
- Nature. Artificial intelligence (AI) policy for Nature Portfolio journals. nature.com
- Liang, W., et al. (2024). Mapping the increasing use of LLMs in scientific papers. arXiv. arxiv.org/abs/2404.01268
- Altmäe, S., et al. (2023). Artificial intelligence in scientific writing: a friend or a foe? Reproductive BioMedicine Online, 47(1), 3–9. doi.org/10.1016/j.rbmo.2023.04.009