A MoFo Privacy Minute: Managing Cybersecurity Concerns When Fine-Tuning LLMs
This is A MoFo Privacy Minute, where we answer the questions our clients are asking us in sixty seconds or less.
Question: Our organization wants to fine-tune a Large Language Model (LLM) for a specific domain or area of interest. Does this create any additional cybersecurity risks, and how should we approach risk mitigation?
Answer:
LLMs have emerged as key tools for organizations seeking to enhance operational efficiency and bolster innovation. Powered by advanced machine learning algorithms, these models can process and generate content across modalities, including text, images, and video, making them invaluable in a wide range of sectors.
Fine-tuning adapts a pre-trained LLM so that it develops a more nuanced understanding of a particular problem or domain. This is typically achieved by further training the pre-trained model on a smaller, domain-specific dataset, which refines its ability to understand and generate more relevant, targeted output. For example, in the healthcare industry, a fine-tuned LLM could analyze patient data more effectively to assist with specific tasks, such as diagnosing medical conditions or enhancing the quality of care.
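By way of illustration only, the sketch below shows what that further training step can look like using the open-source Hugging Face transformers and datasets libraries. The base model name, the domain_corpus.jsonl file, and the hyperparameters are hypothetical placeholders, not recommendations; real deployments would involve data governance, evaluation, and safety testing well beyond this outline.

```python
# Minimal, illustrative sketch of supervised fine-tuning with Hugging Face
# transformers/datasets. Model name, data file, and hyperparameters are
# placeholders (assumptions), not recommendations.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base_model = "gpt2"  # placeholder for any pre-trained causal LM
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

# Smaller, domain-specific dataset; "domain_corpus.jsonl" (with a "text"
# field) is a hypothetical file, e.g., de-identified clinical notes.
dataset = load_dataset("json", data_files="domain_corpus.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()  # further training refines the model on the target domain
```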
However, research indicates that fine-tuning an LLM can increase its susceptibility to security incidents.
Research conducted by Cisco[1] revealed that fine-tuned LLMs are more likely to be vulnerable to prompt injection attacks,[2] which can compromise the safety guardrails built into base LLMs. Especially in regulated sectors like financial services or healthcare, the consequences of such manipulation could be severe.
Fine-tuned LLMs are calibrated to follow instructions relevant to the domain or area of interest for which they have been adjusted. Because they are optimized for particular types of interactions rather than a broad variety of novel inputs, they are less resilient to unexpected or malicious prompts. Base LLMs, by contrast, are trained on broader datasets and designed to handle a wider range of inputs. For example, an LLM fine-tuned on financial records is more likely to recognize and respond to a threat actor's prompt about financial records, because the fine-tuning has biased the model to assess the prompt as legitimate, whereas the base LLM would be less likely to interpret such a specific prompt and therefore less likely to provide the requested records.
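As a purely illustrative example of the kind of additional input screening a fine-tuned deployment may warrant, the sketch below places a naive pattern-based check in front of the model. The SUSPICIOUS_PATTERNS list, the screen_prompt helper, and the call_finetuned_model callable are hypothetical, and a simple filter like this is not a substitute for layered guardrails, access controls, and monitoring.

```python
# Illustrative-only sketch of a naive input screen in front of a
# fine-tuned model; patterns and function names are hypothetical.
import re
from typing import Callable

SUSPICIOUS_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"reveal (the )?(system prompt|training data)",
    r"disclose .*financial records",
]

def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt resembles a known injection pattern."""
    lowered = prompt.lower()
    return any(re.search(p, lowered) for p in SUSPICIOUS_PATTERNS)

def handle_request(prompt: str,
                   call_finetuned_model: Callable[[str], str]) -> str:
    if screen_prompt(prompt):
        # Block and escalate rather than letting the fine-tuned model decide.
        return "Request blocked pending security review."
    return call_finetuned_model(prompt)
```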
Organizations should consider the following steps to mitigate cybersecurity risk:[3]
Regardless of whether an organization chooses retrieval-augmented generation (RAG) or fine-tuning to tailor an LLM to a specific domain or area of interest, it should continue to invest in cybersecurity governance alongside its innovation efforts; otherwise, it risks creating further legal and regulatory issues.
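For context, RAG keeps domain documents in an external index and supplies relevant passages to an unmodified base model at query time, rather than changing the model's weights. The sketch below is a minimal outline under that assumption; the retrieve and generate callables are hypothetical stand-ins for whatever search index and model interface an organization already uses.

```python
# Minimal, illustrative sketch of retrieval-augmented generation (RAG):
# domain documents stay in an external index and are passed to an
# unmodified base model as context. retrieve() and generate() are
# hypothetical placeholders.
from typing import Callable, List

def build_prompt(question: str, passages: List[str]) -> str:
    context = "\n\n".join(passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

def answer_with_rag(question: str,
                    retrieve: Callable[[str], List[str]],
                    generate: Callable[[str], str]) -> str:
    passages = retrieve(question)   # top-k domain passages from an index
    prompt = build_prompt(question, passages)
    return generate(prompt)         # base LLM; no weight changes involved
```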
[1] Cisco, 2025 Annual Report – The State of AI Security.
[2] Prompt injection attacks occur when threat actors use input prompts to alter the model’s behavior or outputs. This can result in the extraction of sensitive information or the production of harmful outputs (such as malware).
[3] Specific steps to mitigate the risks of deploying AI have also been set out in the Joint High-Level Risk Analysis on AI co-signed by a consortium of international cybersecurity bodies at the Paris AI Summit (February 2025).
[4] Isabel Barberá, AI Privacy Risks & Mitigations – Large Language Models (LLMs), European Data Protection Board Support Pools of Experts Program (March 2025).