
Resource Exhaustion

Resource exhaustion vulnerabilities allow attackers to consume excessive compute, memory, or API tokens, leading to denial of service or runaway costs.

Infinite Loop (CRITICAL)

CVSS 9.0 | CWE-835, CWE-400 | OWASP LLM10

The loop's exit condition depends on LLM output, so there is no deterministic guarantee of termination.

Vulnerable
Loop continues until the LLM says 'done' and may never terminate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
result = ""

# Dangerous: No termination guarantee
while "done" not in result.lower():
  result = llm.invoke(f"Continue task: {result}").content
  print(result)
Secure
Hard limit on iterations with timeout
from langchain_openai import ChatOpenAI
import time

llm = ChatOpenAI()
result = ""
MAX_ITERATIONS = 10
start_time = time.time()
TIMEOUT = 60  # seconds

for i in range(MAX_ITERATIONS):
  if time.time() - start_time > TIMEOUT:
      break
  result = llm.invoke(f"Continue task: {result}").content
  if "done" in result.lower():
      break

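The loop above bounds iterations and total wall-clock time, but a single hung `llm.invoke` call can still stall past the timeout check. A minimal sketch of bounding one call with `concurrent.futures` (here `fake_llm` is a stand-in for a real client call):

```python
import concurrent.futures

CALL_TIMEOUT = 5  # seconds allowed for one call

def call_with_timeout(fn, *args, timeout=CALL_TIMEOUT):
    """Run fn in a worker thread; give up waiting after `timeout` seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return None  # caller treats a stuck call as a failed iteration
    finally:
        # Don't block on a stuck worker; Python can't kill the thread,
        # but the caller is no longer waiting on it.
        pool.shutdown(wait=False)

def fake_llm(prompt):
    # Stand-in for llm.invoke(...).content
    return f"step finished: {prompt}"
```

Each loop iteration can then call `call_with_timeout(fake_llm, "Continue task")` and treat `None` as a failed step to retry or abort.
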
Compliance:

  • EU AI Act: Article 15 (Accuracy & Cybersecurity)
  • NIST AI RMF: MAP 1.3 (System Reliability)

Context Exhaustion (HIGH)

CVSS 7.5 | CWE-400 | OWASP LLM09, LLM10

Unbounded accumulation of conversation history makes each request larger than the last, so token consumption grows quadratically over a conversation and eventually overflows the context window.

Vulnerable
Messages grow unbounded, eventually exceeding context window
messages = []

def chat(user_input):
  messages.append({"role": "user", "content": user_input})
  response = llm.invoke(messages).content
  messages.append({"role": "assistant", "content": response})
  return response

# After 1000 messages, context window is exhausted
Secure
Sliding window keeps last N messages
from collections import deque

MAX_MESSAGES = 20
messages = deque(maxlen=MAX_MESSAGES)

def chat(user_input):
  messages.append({"role": "user", "content": user_input})
  response = llm.invoke(list(messages)).content
  messages.append({"role": "assistant", "content": response})
  return response

# Old messages automatically removed

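A fixed message count still admits huge individual messages. A sketch of trimming by estimated tokens instead; `rough_token_count` is a crude whitespace-splitting stand-in for a real tokenizer such as tiktoken:

```python
MAX_CONTEXT_TOKENS = 1000

def rough_token_count(text):
    # Crude estimate; swap in a real tokenizer (e.g. tiktoken) in production
    return len(text.split())

def trim_to_budget(messages, budget=MAX_CONTEXT_TOKENS):
    """Drop the oldest messages until the estimated total fits the budget."""
    trimmed = list(messages)
    while trimmed and sum(rough_token_count(m["content"]) for m in trimmed) > budget:
        trimmed.pop(0)  # evict oldest first
    return trimmed
```

Calling `trim_to_budget` on the history before each `llm.invoke` bounds cost by content size, not just message count.
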
Token Bombing (CRITICAL)

CVSS 9.0 | CWE-770 | OWASP LLM10

Excessive token consumption through crafted inputs causing resource exhaustion.

Vulnerable
No input length validation
def process_document(content):
  # No validation - attacker sends 100MB document
  return llm.invoke(f"Summarize: {content}")
Secure
Token count validation before processing
import tiktoken

MAX_TOKENS = 4000
enc = tiktoken.get_encoding("cl100k_base")

def process_document(content):
  tokens = enc.encode(content)
  if len(tokens) > MAX_TOKENS:
      raise ValueError(f"Input exceeds {MAX_TOKENS} tokens")
  return llm.invoke(f"Summarize: {content}")

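Input validation caps one request; a per-user quota caps the aggregate. A minimal in-memory sketch (the budget and names are illustrative; a production system would persist counters and reset them daily):

```python
from collections import defaultdict

DAILY_TOKEN_BUDGET = 50_000
usage = defaultdict(int)  # user_id -> tokens consumed in the current day

def charge(user_id, token_count):
    """Reject the request if it would push the user over the daily budget."""
    if usage[user_id] + token_count > DAILY_TOKEN_BUDGET:
        raise PermissionError(f"user {user_id} exceeded daily token budget")
    usage[user_id] += token_count
```

Call `charge(user_id, len(tokens))` after the token-count check and before invoking the model.
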
Missing Rate Limits (HIGH)

CVSS 7.5 | CWE-400

API endpoints lack rate limiting, allowing abuse, denial of service, and runaway API costs.

Vulnerable
No rate limiting on expensive LLM calls
@app.post("/chat")
def chat(request):
  # No rate limit - attacker can spam requests
  return llm.invoke(request.message)
Secure
Rate limiting per user/IP
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # slowapi looks up the limiter on app state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/chat")
@limiter.limit("10/minute")  # per-client: 10 requests per minute
def chat(request: Request, message: str):  # slowapi requires the `request` param
  return llm.invoke(message)

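slowapi covers FastAPI; for frameworks without an off-the-shelf limiter, the token bucket is the standard building block. A minimal sketch:

```python
import time

class TokenBucket:
    """Allow bursts up to `rate` requests, refilling over `per` seconds."""
    def __init__(self, rate, per):
        self.capacity = rate
        self.tokens = float(rate)
        self.refill_per_sec = rate / per
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond with HTTP 429
```

Keep one bucket per user or IP (e.g. in a dict) and check `allow()` before invoking the model.
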
RAG Over-fetching (MEDIUM)

CVSS 5.0 | CWE-400

Retrieval-Augmented Generation fetches excessive documents causing context bloat.

Vulnerable
Fetches all matching documents
def rag_query(question):
  # No explicit limit on how many documents come back or how large they are
  docs = vectorstore.similarity_search(question)
  context = "\n".join([d.page_content for d in docs])
  return llm.invoke(f"Context: {context}\nQuestion: {question}")
Secure
Limited retrieval with relevance threshold
def rag_query(question):
  # Keep only the top 3 results, and drop weak matches
  results = vectorstore.similarity_search_with_relevance_scores(
      question,
      k=3
  )
  docs = [doc for doc, score in results if score >= 0.7]
  context = "\n".join([d.page_content for d in docs])
  return llm.invoke(f"Context: {context}\nQuestion: {question}")

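Even three documents can be individually huge, so capping the assembled context is a cheap second guard. A sketch using a character budget as a rough proxy for tokens:

```python
MAX_CONTEXT_CHARS = 8000

def build_context(doc_texts, limit=MAX_CONTEXT_CHARS):
    """Join document texts in rank order, stopping before the budget overflows."""
    parts, used = [], 0
    for text in doc_texts:
        if used + len(text) > limit:
            break  # drop lower-ranked documents rather than overflow
        parts.append(text)
        used += len(text) + 1  # +1 for the joining newline
    return "\n".join(parts)
```

`rag_query` would pass `[d.page_content for d in docs]` through `build_context` before formatting the prompt.
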
Recursive Tool Calling (HIGH)

CVSS 7.5 | CWE-674

A tool that can trigger itself recursively leads to unbounded resource consumption.

Vulnerable
Tool can call itself infinitely
@tool
def research(topic):
  result = llm.invoke(f"Research: {topic}").content
  # Dangerous: Can trigger itself recursively
  if "need more info" in result:
      return research(result)  # Infinite recursion
  return result
Secure
Depth tracking prevents infinite recursion
@tool
def research(topic, depth=0, max_depth=3):
  if depth >= max_depth:
      return "Max research depth reached"

  result = llm.invoke(f"Research: {topic}").content
  if "need more info" in result:
      # Recurse through the underlying function, not the tool wrapper
      return research.func(result, depth + 1, max_depth)
  return result

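A depth limit protects one tool, but an agent can ping-pong between several tools indefinitely. A sketch of a shared per-run budget; `guarded_tool` and the limit are illustrative names, not a framework API:

```python
class CallBudget:
    """Hard cap on total tool invocations within one agent run."""
    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.used = 0

    def spend(self):
        if self.used >= self.max_calls:
            raise RuntimeError("tool call budget exhausted")
        self.used += 1

budget = CallBudget(max_calls=5)

def guarded_tool(fn):
    """Decorator: every wrapped tool draws from the shared budget."""
    def wrapper(*args, **kwargs):
        budget.spend()
        return fn(*args, **kwargs)
    return wrapper
```

Wrapping every tool with `guarded_tool` bounds total work per request regardless of which tools call which.
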
All resource exhaustion vulnerabilities map to OWASP LLM10: Unbounded Consumption and should be addressed with hard limits, timeouts, and rate limiting.
