
Resource Exhaustion

Resource exhaustion vulnerabilities allow attackers to consume excessive compute, memory, or API tokens, leading to denial of service or runaway costs.

Infinite Loop (CRITICAL)

CVSS 9.0 | CWE-835, CWE-400 | OWASP LLM10

Loop condition depends on LLM output without deterministic termination guarantee.

Vulnerable
Loop continues until LLM says 'done' - may never terminate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
result = ""

# Dangerous: no termination guarantee
while "done" not in result.lower():
  result = llm.invoke(f"Continue task: {result}").content
  print(result)
Secure
Hard limit on iterations with timeout
from langchain_openai import ChatOpenAI
import time

llm = ChatOpenAI()
result = ""
MAX_ITERATIONS = 10
start_time = time.time()
TIMEOUT = 60  # seconds

for i in range(MAX_ITERATIONS):
  if time.time() - start_time > TIMEOUT:
      break
  result = llm.invoke(f"Continue task: {result}").content
  if "done" in result.lower():
      break

Compliance:

  • EU AI Act: Article 15 (Accuracy & Cybersecurity)
  • NIST AI RMF: MAP 1.3 (System Reliability)
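An iteration cap alone still lets each iteration grow more expensive, since the accumulated result is fed back into the prompt. A complementary guard is a cumulative token budget across the whole loop. The sketch below assumes a hypothetical `call_llm` helper that returns the response text together with the number of tokens the call consumed:

```python
# Sketch: iteration cap combined with a cumulative token budget.
# `call_llm` is a hypothetical stand-in for any LLM client that
# reports per-call token usage; the limits are illustrative.

MAX_ITERATIONS = 10
TOKEN_BUDGET = 20_000

def run_task(call_llm, prompt):
    result, used = "", 0
    for _ in range(MAX_ITERATIONS):
        text, tokens = call_llm(f"Continue task: {result or prompt}")
        used += tokens
        result = text
        # Stop on completion OR once the budget is spent,
        # whichever comes first.
        if used > TOKEN_BUDGET or "done" in result.lower():
            break
    return result, used
```

The budget check runs after every call, so one oversized response ends the loop on the next check even if the iteration cap has not been reached.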

Context Exhaustion (HIGH)

CVSS 7.5 | CWE-400 | OWASP LLM09, LLM10

Unbounded accumulation of conversation history steadily inflates per-request token consumption; because each request resends the full history, cumulative cost grows quadratically with conversation length until the context window is exceeded.

Vulnerable
Messages grow unbounded, eventually exceeding context window
messages = []

def chat(user_input):
  messages.append({"role": "user", "content": user_input})
  response = llm.invoke(messages)
  messages.append({"role": "assistant", "content": response.content})
  return response.content

# After 1000 messages, context window is exhausted
Secure
Sliding window keeps last N messages
from collections import deque

MAX_MESSAGES = 20
messages = deque(maxlen=MAX_MESSAGES)

def chat(user_input):
  messages.append({"role": "user", "content": user_input})
  response = llm.invoke(list(messages))
  messages.append({"role": "assistant", "content": response.content})
  return response.content

# Old messages automatically removed
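A fixed message count still admits a handful of very long messages. A token-aware variant trims history by estimated size instead; this sketch uses a rough 4-characters-per-token heuristic (swap in a real tokenizer for accuracy):

```python
# Sketch: trim history by approximate token count rather than
# message count. The 4-chars-per-token estimate is a heuristic.

MAX_CONTEXT_TOKENS = 2000

def estimate_tokens(message):
    # Rough heuristic plus a small per-message overhead
    return len(message["content"]) // 4 + 4

def trim_history(messages, budget=MAX_CONTEXT_TOKENS):
    """Keep the most recent messages that fit within the token budget."""
    kept, total = [], 0
    for msg in reversed(messages):
        total += estimate_tokens(msg)
        if total > budget:
            break
        kept.append(msg)
    return list(reversed(kept))
```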

Token Bombing (CRITICAL)

CVSS 9.0 | CWE-770 | OWASP LLM10

Crafted or oversized inputs force excessive token consumption, driving up cost and latency until quota or budget is exhausted.

Vulnerable
No input length validation
def process_document(content):
  # No validation - attacker sends 100MB document
  return llm.invoke(f"Summarize: {content}")
Secure
Token count validation before processing
import tiktoken

MAX_TOKENS = 4000
enc = tiktoken.get_encoding("cl100k_base")

def process_document(content):
  tokens = enc.encode(content)
  if len(tokens) > MAX_TOKENS:
      raise ValueError(f"Input exceeds {MAX_TOKENS} tokens")
  return llm.invoke(f"Summarize: {content}")
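Counting tokens is itself linear in input size, so a 100 MB payload burns CPU inside the tokenizer before the limit check ever fires. A cheap byte-length pre-check short-circuits first; the limits below are illustrative:

```python
# Sketch: cheap byte-length guard that runs before the (expensive)
# tokenizer. MAX_BYTES is an illustrative limit, not a recommendation.

MAX_BYTES = 1_000_000

def precheck_size(content):
    """Reject pathologically large payloads before tokenizing."""
    size = len(content.encode("utf-8"))
    if size > MAX_BYTES:
        raise ValueError(f"Payload of {size} bytes exceeds {MAX_BYTES} byte limit")
    return size
```

Run this first, then the token-count check above on anything that passes.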

Missing Rate Limits (HIGH)

CVSS 7.5 | CWE-400

API endpoints lack rate limiting allowing abuse and denial of service.

Vulnerable
No rate limiting on expensive LLM calls
@app.post("/chat")
def chat(request):
  # No rate limit - attacker can spam requests
  return llm.invoke(request.message)
Secure
Rate limiting per user/IP
from fastapi import Request
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # slowapi reads the limiter from app state

@app.post("/chat")
@limiter.limit("10/minute")
def chat(request: Request):  # slowapi requires a Request-typed parameter
  return llm.invoke(request.message)
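Request-count limits cap how often a user can call the endpoint, but a single request can still be very expensive. A complementary control is a per-user token budget over a sliding window. This in-memory sketch (single-process only; a shared store such as Redis would be needed in production) illustrates the idea:

```python
import time
from collections import defaultdict

# Sketch: per-user sliding-window budget measured in tokens,
# not requests. Limits are illustrative.
WINDOW_SECONDS = 60
TOKENS_PER_WINDOW = 10_000

_usage = defaultdict(list)  # user_id -> [(timestamp, tokens), ...]

def charge(user_id, tokens, now=None):
    """Record usage; return False if this spend would exceed the budget."""
    now = now if now is not None else time.time()
    # Drop entries that have aged out of the window
    window = [(t, n) for t, n in _usage[user_id] if now - t < WINDOW_SECONDS]
    if sum(n for _, n in window) + tokens > TOKENS_PER_WINDOW:
        _usage[user_id] = window
        return False
    window.append((now, tokens))
    _usage[user_id] = window
    return True
```

Call `charge()` with the estimated token cost before invoking the model, and reject the request when it returns `False`.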

RAG Over-fetching (MEDIUM)

CVSS 5.0 | CWE-400

Retrieval-Augmented Generation fetches excessive documents causing context bloat.

Vulnerable
Fetches far more documents than needed
def rag_query(question):
  # No relevance filter; retrieves far more documents than needed
  docs = vectorstore.similarity_search(question, k=50)
  context = "\n".join([d.page_content for d in docs])
  return llm.invoke(f"Context: {context}\nQuestion: {question}")
Secure
Limited retrieval with relevance filtering
def rag_query(question):
  # Limit to the top 3 hits, then drop low-relevance matches
  results = vectorstore.similarity_search_with_relevance_scores(
      question, k=3
  )
  docs = [doc for doc, score in results if score >= 0.7]
  context = "\n".join([d.page_content for d in docs])
  return llm.invoke(f"Context: {context}\nQuestion: {question}")
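Even a capped document count can blow past the prompt budget when individual documents are long. A further guard assembles the context under an explicit token budget; the sketch below approximates tokens as 4 characters each (substitute a real tokenizer for accuracy):

```python
# Sketch: cap the assembled RAG context by an approximate token
# budget. The budget and chars-per-token ratio are illustrative.

CONTEXT_TOKEN_BUDGET = 1500

def build_context(texts, budget=CONTEXT_TOKEN_BUDGET):
    """Join document texts until the approximate token budget is reached."""
    parts, used = [], 0
    for text in texts:
        cost = len(text) // 4 + 1
        if used + cost > budget:
            break  # drop remaining (lower-ranked) documents
        parts.append(text)
        used += cost
    return "\n".join(parts)
```

Because retrieval results arrive ranked by relevance, truncating from the tail discards the least useful documents first.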

Recursive Tool Calling (HIGH)

CVSS 7.5 | CWE-674

A tool that recursively invokes itself without a depth bound can consume unbounded compute and tokens.

Vulnerable
Tool can call itself infinitely
@tool
def research(topic):
  result = llm.invoke(f"Research: {topic}").content
  # Dangerous: can trigger itself recursively
  if "need more info" in result:
      return research(result)  # Unbounded recursion
  return result
Secure
Depth tracking prevents infinite recursion
@tool
def research(topic, depth=0, max_depth=3):
  if depth >= max_depth:
      return "Max research depth reached"

  result = llm.invoke(f"Research: {topic}").content
  if "need more info" in result:
      return research(result, depth + 1, max_depth)
  return result
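Passing `depth` as a tool argument only works if callers cannot reset it, and an LLM-driven agent supplies tool arguments itself. An alternative is a process-wide call budget that no argument can override; `llm_call` below is a hypothetical stand-in for the model client:

```python
# Sketch: a call budget held outside the tool's signature, so no
# caller-supplied argument can reset it. Limit is illustrative.

class CallBudget:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def spend(self):
        """Consume one call; raise once the budget is exhausted."""
        if self.used >= self.limit:
            raise RuntimeError("Tool call budget exhausted")
        self.used += 1

budget = CallBudget(limit=3)

def research(topic, llm_call):
    budget.spend()  # charged before every call, recursive or not
    result = llm_call(f"Research: {topic}")
    if "need more info" in result:
        return research(result, llm_call)
    return result
```

Because the budget lives outside the function, a prompt-injected "call yourself with depth=0" cannot defeat it.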

All resource exhaustion vulnerabilities map to OWASP LLM10: Unbounded Consumption and should be addressed with hard limits, timeouts, and rate limiting.
