Resource Exhaustion
Resource exhaustion vulnerabilities allow attackers to consume excessive compute, memory, or API tokens, leading to denial of service or runaway costs.
Infinite Loop (CRITICAL)
CVSS 9.0 | CWE-835, CWE-400 | OWASP LLM10
Loop condition depends on LLM output without deterministic termination guarantee.
Vulnerable:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI()
result = ""
# Dangerous: no termination guarantee - the loop runs until the model happens to say "done"
while "done" not in result.lower():
    result = llm.invoke(f"Continue task: {result}").content
    print(result)

Secure:
from langchain.chat_models import ChatOpenAI
import time
llm = ChatOpenAI()
result = ""
MAX_ITERATIONS = 10
start_time = time.time()
TIMEOUT = 60  # seconds
for i in range(MAX_ITERATIONS):
    if time.time() - start_time > TIMEOUT:
        break
    result = llm.invoke(f"Continue task: {result}").content
    if "done" in result.lower():
        break
Compliance:
- EU AI Act: Article 15 (Accuracy & Cybersecurity)
- NIST AI RMF: MAP 1.3 (System Reliability)
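The iteration cap and wall-clock check above only run between calls; a single call can still hang inside llm.invoke. A per-request timeout on the client is a complementary control. A minimal sketch, assuming the request_timeout and max_retries parameters exposed by langchain's ChatOpenAI wrapper (parameter values are illustrative):

from langchain.chat_models import ChatOpenAI

# Assumption: request_timeout/max_retries are forwarded to the underlying OpenAI client,
# so one hung request cannot stall the loop for more than ~30 seconds per attempt
llm = ChatOpenAI(request_timeout=30, max_retries=2)

Combined with MAX_ITERATIONS, this bounds the worst-case wall-clock time of the loop even when the timeout check between iterations is never reached.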
Context Exhaustion (HIGH)
CVSS 7.5 | CWE-400 | OWASP LLM09, LLM10
Unbounded accumulation of chat history means the full conversation is resent on every call, so token consumption grows rapidly until the context window is exhausted.
Vulnerable:
messages = []

def chat(user_input):
    messages.append({"role": "user", "content": user_input})
    response = llm.invoke(messages).content
    messages.append({"role": "assistant", "content": response})
    return response
# After 1000 messages, the context window is exhausted

Secure:
from collections import deque

MAX_MESSAGES = 20
messages = deque(maxlen=MAX_MESSAGES)

def chat(user_input):
    messages.append({"role": "user", "content": user_input})
    response = llm.invoke(list(messages)).content
    messages.append({"role": "assistant", "content": response})
    return response
# Old messages are automatically removed
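A fixed message count is a coarse proxy for context size; a handful of very long messages can still overflow the window. A variant that trims by token budget instead is sketched below; the trim_to_budget helper and the 3,000-token budget are illustrative assumptions, using tiktoken's cl100k_base encoding:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
TOKEN_BUDGET = 3000  # illustrative budget, well under the model's context window

def trim_to_budget(history):
    # Keep the most recent messages whose combined size fits the token budget
    kept, total = [], 0
    for msg in reversed(history):
        n = len(enc.encode(msg["content"]))
        if total + n > TOKEN_BUDGET:
            break
        kept.append(msg)
        total += n
    return list(reversed(kept))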
Token Bombing (CRITICAL)
CVSS 9.0 | CWE-770 | OWASP LLM10
Crafted oversized inputs cause excessive token consumption and resource exhaustion.
Vulnerable:
def process_document(content):
    # No validation - an attacker can send a 100MB document
    return llm.invoke(f"Summarize: {content}")

Secure:
import tiktoken

MAX_TOKENS = 4000
enc = tiktoken.get_encoding("cl100k_base")

def process_document(content):
    tokens = enc.encode(content)
    if len(tokens) > MAX_TOKENS:
        raise ValueError(f"Input exceeds {MAX_TOKENS} tokens")
    return llm.invoke(f"Summarize: {content}")
return llm.invoke(f"Summarize: {content}")Missing Rate Limits (HIGH)
CVSS 7.5 | CWE-400
API endpoints lack rate limiting, allowing abuse and denial of service.
@app.post("/chat")
def chat(request):
# No rate limit - attacker can spam requests
return llm.invoke(request.message)from slowapi import Limiter
from slowapi.util import get_remote_address
limiter = Limiter(key_func=get_remote_address)
@app.post("/chat")
@limiter.limit("10/minute")
def chat(request):
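Request-rate limits alone do not bound spend: ten maximum-length prompts per minute is still expensive. A per-user token budget can sit alongside the rate limiter; a rough sketch in which the in-memory usage dict, the 50,000-token daily cap, and the charge_tokens helper are illustrative assumptions (a production system would track usage in Redis or a database with a scheduled reset):

from collections import defaultdict
from fastapi import HTTPException

DAILY_TOKEN_CAP = 50_000      # illustrative per-user budget
usage = defaultdict(int)      # user_id -> tokens consumed today

def charge_tokens(user_id: str, tokens: int):
    # Reject the request before calling the LLM if the user's budget is exhausted
    if usage[user_id] + tokens > DAILY_TOKEN_CAP:
        raise HTTPException(status_code=429, detail="Daily token budget exceeded")
    usage[user_id] += tokens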
RAG Over-fetching (MEDIUM)
CVSS 5.0 | CWE-400
Retrieval-Augmented Generation fetches excessive documents causing context bloat.
Vulnerable:
def rag_query(question):
    # No explicit cap on how many documents are retrieved or how relevant they are
    docs = vectorstore.similarity_search(question)
    context = "\n".join([d.page_content for d in docs])
    return llm.invoke(f"Context: {context}\nQuestion: {question}")

Secure:
def rag_query(question):
    # Limit to the 3 most relevant documents above a relevance threshold
    docs = vectorstore.similarity_search(
        question,
        k=3,
        score_threshold=0.7
    )
    context = "\n".join([d.page_content for d in docs])
    return llm.invoke(f"Context: {context}\nQuestion: {question}")
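Even with k=3, a few very long documents can still bloat the prompt. The retrieved context can additionally be capped by token count before it reaches the model; a sketch in which the build_context helper and the 2,000-token budget are illustrative assumptions:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_BUDGET = 2000  # illustrative cap on retrieved context

def build_context(docs):
    # Add documents in relevance order until the token budget is spent
    parts, total = [], 0
    for d in docs:
        n = len(enc.encode(d.page_content))
        if total + n > CONTEXT_BUDGET:
            break
        parts.append(d.page_content)
        total += n
    return "\n".join(parts)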
return llm.invoke(f"Context: {context}\nQuestion: {question}")Recursive Tool Calling (HIGH)
CVSS 7.5 | CWE-674
A tool that recursively calls itself can consume resources without bound.
Vulnerable:
@tool
def research(topic):
    """Research a topic."""
    result = llm.invoke(f"Research: {topic}").content
    # Dangerous: can trigger itself recursively with no depth limit
    if "need more info" in result:
        return research(result)  # Potentially unbounded recursion
    return result

Secure:
@tool
def research(topic, depth=0, max_depth=3):
    """Research a topic, following up at most max_depth times."""
    if depth >= max_depth:
        return "Max research depth reached"
    result = llm.invoke(f"Research: {topic}").content
    if "need more info" in result:
        return research(result, depth + 1, max_depth)
    return result
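When tools run inside an agent, the per-tool depth cap can be reinforced at the agent level. A sketch assuming langchain's AgentExecutor, whose max_iterations and max_execution_time parameters bound the entire tool-calling loop (the agent variable is a placeholder for any agent constructed elsewhere):

from langchain.agents import AgentExecutor

executor = AgentExecutor(
    agent=agent,             # placeholder: any previously constructed agent
    tools=[research],
    max_iterations=10,       # hard cap on tool calls per run
    max_execution_time=60,   # hard cap on wall-clock seconds per run
)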
All resource exhaustion vulnerabilities map to OWASP LLM10: Unbounded Consumption and should be addressed with hard limits, timeouts, and rate limiting.