AI/ML Pentest
HomeGitHubPortfolioTwitter/XMediumCont@ct
  • Introduction
  • OWASP and LLM
  • OWASP Top 10 for LLM App - 2025
  • Labs
Powered by GitBook
On this page
  • OWASP Top 10 for Large Language Model Applications
  • OWASP Top 10 for LLM Applications 2025
  • LLM01:2025 Prompt Injection
  • LLM02:2025 Sensitive Information Disclosure
  • LLM03:2025 Supply Chain Vulnerabilities
  • LLM04: Data and Model Poisoning
  • LLM05:2025 Improper Output Handling
  • LLM06:2025 Excessive Agency
  • LLM07:2025 System Prompt Leakage
  • LLM08:2025 Vector and Embedding Weaknesses
  • LLM09:2025 Misinformation
  • LLM10:2025 Unbounded Consumption

OWASP Top 10 for LLM App - 2025

PreviousOWASP and LLMNextLabs

Last updated 5 months ago

OWASP Top 10 for Large Language Model Applications

OWASP Top 10 for LLM Applications 2025

With the growing adoption of Large Language Models (LLMs) across industries, securing these systems has become a critical concern. The OWASP Top 10 for LLM Applications 2025 highlights the most pressing security risks and provides actionable strategies to mitigate them. This document serves as a foundational guide for developers, data scientists, and security professionals aiming to create robust and safe AI applications.

The 2025 list introduces several updates based on real-world feedback and emerging vulnerabilities, reflecting the evolving landscape of AI deployment. Key additions include expanded discussions on unbounded consumption, vector and embedding weaknesses, and system prompt leakage, each tailored to address unique challenges in modern LLM systems.

What’s New in the 2025 Edition

The updated list incorporates:

  • Unbounded Consumption: A broader perspective on resource management risks, extending beyond traditional denial-of-service vulnerabilities.

  • Vector and Embedding Weaknesses: Guidance on securing Retrieval-Augmented Generation (RAG) workflows and embedding-based techniques.

  • System Prompt Leakage: Insights into mitigating risks associated with unintended exposure of system prompts in LLM-integrated applications.

  • Excessive Agency: Expanded coverage on risks stemming from autonomous LLM agent architectures.


LLM01:2025 Prompt Injection

This vulnerability occurs when malicious prompts alter the LLM’s behavior or output. Prompt injections may be direct (user-supplied) or indirect (hidden within external content).

Examples:

  • Direct: Injecting malicious text into a chatbot to bypass safety protocols.

  • Indirect: Hidden commands embedded in web content causing unintended model outputs.

  • Multimodal: Embedding instructions within an image that, when processed alongside text, manipulates the model’s behavior.

Mitigation:

  1. Constrain model behavior with explicit guidelines.

  2. Validate outputs against predefined formats.

  3. Conduct adversarial testing regularly.

  4. Segregate and clearly label external content.

  5. Require human review for high-stakes outputs.


LLM02:2025 Sensitive Information Disclosure

LLMs can inadvertently expose private or sensitive data through poor configuration or training on sensitive datasets.

Examples:

  • Querying an LLM integrated with a knowledge base and receiving confidential business data.

  • An LLM trained on unfiltered customer support tickets inadvertently revealing personal information.

Mitigation:

  • Apply strict data sanitization during model training.

  • Implement robust output filtering mechanisms.

  • Regularly audit training datasets for sensitive information.


LLM03:2025 Supply Chain Vulnerabilities

Risks associated with the use of third-party libraries, datasets, or plugins that introduce insecure dependencies.

Examples:

  • A compromised library integrated into an LLM application allowing unauthorized access.

  • Malicious modifications in third-party datasets altering model behavior.

Mitigation:

  • Conduct comprehensive audits of third-party integrations.

  • Monitor and update dependencies proactively.

  • Employ hash verification for external data and libraries.


LLM04: Data and Model Poisoning

Attacks aimed at corrupting the training or fine-tuning process to embed malicious behaviors.

Examples:

  • Introducing poisoned samples into a public dataset used for fine-tuning.

  • Embedding subtle adversarial patterns into training data to manipulate outputs.

Mitigation:

  • Use secure, vetted data sources.

  • Monitor model behavior for anomalies post-training.

  • Conduct regular checks on the integrity of datasets.


LLM05:2025 Improper Output Handling

Errors in processing or validating LLM outputs can lead to biased or harmful content generation.

Examples:

  • Generating offensive or biased text in response to ambiguous prompts.

  • Failing to escape special characters, resulting in security vulnerabilities such as injection attacks.

Mitigation:

  • Enforce deterministic output validation rules.

  • Include human review for high-risk operations.

  • Define strict output formats and validate adherence.


LLM06:2025 Excessive Agency

Over-autonomy in LLMs can result in unintended actions, particularly in agent-based applications.

Examples:

  • An autonomous LLM agent executing financial transactions without proper safeguards.

  • Excessive permissions allowing an LLM to access and modify sensitive files.

Mitigation:

  • Implement least-privilege principles.

  • Require human approval for critical decisions.

  • Use role-based access control to limit permissions.


LLM07:2025 System Prompt Leakage

Leakage of system-level instructions can expose the LLM to manipulation or reveal sensitive operations.

Examples:

  • A user probing a chatbot to reveal its system prompts.

  • Indirect prompt leakage through verbose error messages.

Mitigation:

  • Segregate system and user prompts.

  • Conduct penetration tests focusing on prompt isolation.

  • Minimize prompt verbosity and use placeholders for sensitive terms.


LLM08:2025 Vector and Embedding Weaknesses

Vulnerabilities in embedding techniques and RAG workflows that allow attackers to manipulate data retrieval and model grounding.

Examples:

  • Injecting misleading content into a document used by a Retrieval-Augmented Generation system.

  • Modifying embeddings to subtly alter the semantic meaning of queries.

Mitigation:

  • Regularly update embedding models and their data sources.

  • Validate retrieved data for accuracy and relevance.

  • Apply semantic filters to assess data integrity.


LLM09:2025 Misinformation

LLMs can generate convincing but false information, leading to reputational or operational risks.

Examples:

  • Producing factually incorrect summaries for legal or medical contexts.

  • Generating fabricated sources when prompted for citations.

Mitigation:

  • Include fact-checking layers in deployment pipelines.

  • Limit model generation capabilities to verified sources.

  • Clearly flag uncertain or unverified outputs.


LLM10:2025 Unbounded Consumption

Overuse of computational or financial resources due to unoptimized queries or denial-of-service attacks.

Examples:

  • Unintentionally triggering expensive computations through complex queries.

  • Resource exhaustion from excessive API calls by malicious actors.

Mitigation:

  • Enforce resource usage quotas.

  • Optimize model queries for efficiency.

  • Implement rate-limiting and usage monitoring.

2025 AI Security Solutions Directory and Guide
OWASP Top 10 for Large Language Model Applications
GenAI OWASP
OWASP Top 10 for LLM Applications 2025
LLM-Pentesting-Resources/README.md at main · R3DLB/LLM-Pentesting-ResourcesGitHub
Logo
https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/