OWASP Top 10 for LLM App - 2025
Last updated
Last updated
With the growing adoption of Large Language Models (LLMs) across industries, securing these systems has become a critical concern. The OWASP Top 10 for LLM Applications 2025 highlights the most pressing security risks and provides actionable strategies to mitigate them. This document serves as a foundational guide for developers, data scientists, and security professionals aiming to create robust and safe AI applications.
The 2025 list introduces several updates based on real-world feedback and emerging vulnerabilities, reflecting the evolving landscape of AI deployment. Key additions include expanded discussions on unbounded consumption, vector and embedding weaknesses, and system prompt leakage, each tailored to address unique challenges in modern LLM systems.
The updated list incorporates:
Unbounded Consumption: A broader perspective on resource management risks, extending beyond traditional denial-of-service vulnerabilities.
Vector and Embedding Weaknesses: Guidance on securing Retrieval-Augmented Generation (RAG) workflows and embedding-based techniques.
System Prompt Leakage: Insights into mitigating risks associated with unintended exposure of system prompts in LLM-integrated applications.
Excessive Agency: Expanded coverage on risks stemming from autonomous LLM agent architectures.
This vulnerability occurs when malicious prompts alter the LLM’s behavior or output. Prompt injections may be direct (user-supplied) or indirect (hidden within external content).
Examples:
Direct: Injecting malicious text into a chatbot to bypass safety protocols.
Indirect: Hidden commands embedded in web content causing unintended model outputs.
Multimodal: Embedding instructions within an image that, when processed alongside text, manipulates the model’s behavior.
Mitigation:
Constrain model behavior with explicit guidelines.
Validate outputs against predefined formats.
Conduct adversarial testing regularly.
Segregate and clearly label external content.
Require human review for high-stakes outputs.
LLMs can inadvertently expose private or sensitive data through poor configuration or training on sensitive datasets.
Examples:
Querying an LLM integrated with a knowledge base and receiving confidential business data.
An LLM trained on unfiltered customer support tickets inadvertently revealing personal information.
Mitigation:
Apply strict data sanitization during model training.
Implement robust output filtering mechanisms.
Regularly audit training datasets for sensitive information.
Risks associated with the use of third-party libraries, datasets, or plugins that introduce insecure dependencies.
Examples:
A compromised library integrated into an LLM application allowing unauthorized access.
Malicious modifications in third-party datasets altering model behavior.
Mitigation:
Conduct comprehensive audits of third-party integrations.
Monitor and update dependencies proactively.
Employ hash verification for external data and libraries.
Attacks aimed at corrupting the training or fine-tuning process to embed malicious behaviors.
Examples:
Introducing poisoned samples into a public dataset used for fine-tuning.
Embedding subtle adversarial patterns into training data to manipulate outputs.
Mitigation:
Use secure, vetted data sources.
Monitor model behavior for anomalies post-training.
Conduct regular checks on the integrity of datasets.
Errors in processing or validating LLM outputs can lead to biased or harmful content generation.
Examples:
Generating offensive or biased text in response to ambiguous prompts.
Failing to escape special characters, resulting in security vulnerabilities such as injection attacks.
Mitigation:
Enforce deterministic output validation rules.
Include human review for high-risk operations.
Define strict output formats and validate adherence.
Over-autonomy in LLMs can result in unintended actions, particularly in agent-based applications.
Examples:
An autonomous LLM agent executing financial transactions without proper safeguards.
Excessive permissions allowing an LLM to access and modify sensitive files.
Mitigation:
Implement least-privilege principles.
Require human approval for critical decisions.
Use role-based access control to limit permissions.
Leakage of system-level instructions can expose the LLM to manipulation or reveal sensitive operations.
Examples:
A user probing a chatbot to reveal its system prompts.
Indirect prompt leakage through verbose error messages.
Mitigation:
Segregate system and user prompts.
Conduct penetration tests focusing on prompt isolation.
Minimize prompt verbosity and use placeholders for sensitive terms.
Vulnerabilities in embedding techniques and RAG workflows that allow attackers to manipulate data retrieval and model grounding.
Examples:
Injecting misleading content into a document used by a Retrieval-Augmented Generation system.
Modifying embeddings to subtly alter the semantic meaning of queries.
Mitigation:
Regularly update embedding models and their data sources.
Validate retrieved data for accuracy and relevance.
Apply semantic filters to assess data integrity.
LLMs can generate convincing but false information, leading to reputational or operational risks.
Examples:
Producing factually incorrect summaries for legal or medical contexts.
Generating fabricated sources when prompted for citations.
Mitigation:
Include fact-checking layers in deployment pipelines.
Limit model generation capabilities to verified sources.
Clearly flag uncertain or unverified outputs.
Overuse of computational or financial resources due to unoptimized queries or denial-of-service attacks.
Examples:
Unintentionally triggering expensive computations through complex queries.
Resource exhaustion from excessive API calls by malicious actors.
Mitigation:
Enforce resource usage quotas.
Optimize model queries for efficiency.
Implement rate-limiting and usage monitoring.