Smolagents

Static analysis for Hugging Face Smolagents applications that detects code execution risks, delegation loops, and sandbox escapes.

Quick Start


inkog scan ./my-smolagents-app

What Inkog Detects

Finding	Severity	Description
Code Execution	CRITICAL	`CodeAgent` without sandboxing
Tool Risk	CRITICAL	`ToolCallingAgent` with shell tools
Delegation Loop	HIGH	`ManagedAgent` circular delegation
No Execution Limit	HIGH	Missing iteration limits
Sandbox Escape	CRITICAL	Bypassing LocalPythonExecutor restrictions

CodeAgent Without Sandbox

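CodeAgent executes the Python code that the model generates. Without sandboxing, that code runs with the full privileges of the host process, so a single prompt injection can lead to arbitrary command execution.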

Vulnerable

Unrestricted code execution on host

from smolagents import CodeAgent, HfApiModel

agent = CodeAgent(
  tools=[],
  model=HfApiModel()
  # No sandbox - executes code directly!
)

# Agent can run: import os; os.system("rm -rf /")
agent.run("Calculate something")

Secure

Sandboxed with module whitelist and limits

from smolagents import CodeAgent, HfApiModel
from smolagents.local_python_executor import LocalPythonExecutor

# Create sandboxed executor
executor = LocalPythonExecutor(
  allowed_modules=["math", "statistics"],  # Whitelist only
  max_execution_time=10,  # 10 second timeout
  max_memory_mb=100  # Memory limit
)

agent = CodeAgent(
  tools=[],
  model=HfApiModel(),
  executor=executor,  # Sandboxed execution
  max_steps=5  # Limit reasoning steps
)

agent.run("Calculate the mean of [1,2,3,4,5]")

Critical Security Risk: CodeAgent without a sandbox can execute arbitrary system commands. Always use LocalPythonExecutor with strict module whitelists in production.

ToolCallingAgent with Dangerous Tools

Tools that expose shell or file access let the model, and anyone who can influence its input, run arbitrary commands or read arbitrary files.

Vulnerable

Arbitrary shell command execution

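from smolagents import HfApiModel, ToolCallingAgent, tool

@tool
def run_command(command: str) -> str:
  """Run a shell command.

  Args:
      command: The shell command line to execute.
  """
  import subprocess
  # shell=True hands the model-controlled string straight to the shell
  return subprocess.run(command, shell=True, capture_output=True).stdout.decode()

model = HfApiModel()

agent = ToolCallingAgent(
  tools=[run_command],
  model=model
)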

Secure

Allowlist with timeout and output limits

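import shlex
import subprocess

from smolagents import HfApiModel, ToolCallingAgent, tool

ALLOWED_COMMANDS = {"ls", "cat", "echo", "date"}

@tool
def safe_command(command: str) -> str:
  """Run only allowed commands.

  Args:
      command: The command line to run; the program must be in ALLOWED_COMMANDS.
  """
  try:
      parts = shlex.split(command)  # Respects quoting, unlike str.split()
  except ValueError:
      return "Error: Could not parse command"
  if not parts or parts[0] not in ALLOWED_COMMANDS:
      return "Error: Command not allowed"

  try:
      result = subprocess.run(
          parts,
          shell=False,  # No shell, so ";" and "|" are not interpreted
          capture_output=True,
          timeout=10
      )
  except subprocess.TimeoutExpired:
      return "Error: Command timed out"
  return result.stdout.decode(errors="replace")[:1000]

model = HfApiModel()

agent = ToolCallingAgent(
  tools=[safe_command],
  model=model,
  max_steps=10
)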
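The allowlist in the Secure example checks only the program name, so `cat` and `ls` can still be pointed at any path on the host. One possible complement, sketched below, is to validate the arguments as well. The names `BASE_DIR`, `is_path_allowed`, and `validate_args` are hypothetical and not part of Smolagents or Inkog; the sketch assumes every non-flag argument is a path and conservatively rejects all option flags.

```python
from pathlib import Path
from typing import List

# Hypothetical workspace directory, used only for illustration.
BASE_DIR = Path("/srv/agent-workspace")


def is_path_allowed(arg: str, base_dir: Path = BASE_DIR) -> bool:
    """Return True only if arg resolves to a location inside base_dir."""
    base = base_dir.resolve()
    # Joining with an absolute arg discards base; resolve() collapses "..".
    candidate = (base / arg).resolve()
    return candidate == base or base in candidate.parents


def validate_args(parts: List[str], base_dir: Path = BASE_DIR) -> bool:
    """Reject option flags and any argument that escapes base_dir."""
    for arg in parts[1:]:
        if arg.startswith("-"):
            return False  # Conservative assumption: no flags at all
        if not is_path_allowed(arg, base_dir):
            return False
    return True
```

A tool could call `validate_args(parts)` after the allowlist check and return an error string when it fails. Whether rejecting every flag is acceptable depends on the commands being exposed.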

ManagedAgent Delegation Loops

Managed agents that can delegate to each other may hand a task back and forth indefinitely unless delegation is capped.

Vulnerable

Unlimited agent delegation

from smolagents import ManagedAgent, ToolCallingAgent

agent_a = ToolCallingAgent(tools=[], model=model)
agent_b = ToolCallingAgent(tools=[], model=model)

# Agents can delegate back and forth
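managed_a = ManagedAgent(agent=agent_a, name="Agent A", description="Handles task A")
managed_b = ManagedAgent(agent=agent_b, name="Agent B", description="Handles task B")

manager = ToolCallingAgent(
  tools=[],
  model=model,
  managed_agents=[managed_a, managed_b]  # Managed agents are not tools
  # No delegation limits
)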

Secure

Call limits per managed agent

from smolagents import ManagedAgent, ToolCallingAgent

class LimitedManagedAgent(ManagedAgent):
  def __init__(self, max_calls=3, **kwargs):
      super().__init__(**kwargs)
      self.call_count = 0
      self.max_calls = max_calls

  def __call__(self, *args, **kwargs):
      if self.call_count >= self.max_calls:
          return "Error: Maximum delegation limit reached"
      self.call_count += 1
      return super().__call__(*args, **kwargs)

agent_a = ToolCallingAgent(tools=[], model=model, max_steps=5)
agent_b = ToolCallingAgent(tools=[], model=model, max_steps=5)

managed_a = LimitedManagedAgent(agent=agent_a, name="Agent A", description="Handles task A", max_calls=3)
managed_b = LimitedManagedAgent(agent=agent_b, name="Agent B", description="Handles task B", max_calls=3)

manager = ToolCallingAgent(
  tools=[],
  model=model,
  managed_agents=[managed_a, managed_b],  # Managed agents are not tools
  max_steps=10
)
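Call counters bound the damage at runtime, but a circular delegation can also be spotted before anything runs. The sketch below illustrates the general idea with a depth-first search over a delegation graph. It is not Inkog's implementation, and the graph format (a dict mapping each agent name to the names it can delegate to) and the function name `find_delegation_cycle` are assumptions made for this example.

```python
from typing import Dict, List, Optional


def find_delegation_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one delegation cycle as a list of agent names, or None."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {name: UNSEEN for name in graph}
    path: List[str] = []  # Agents on the current DFS branch, in order

    def visit(name: str) -> Optional[List[str]]:
        state[name] = IN_PROGRESS
        path.append(name)
        for target in graph.get(name, []):
            target_state = state.get(target, UNSEEN)
            if target_state == IN_PROGRESS:
                # Back edge: the cycle runs from target to the current agent.
                return path[path.index(target):] + [target]
            if target_state == UNSEEN:
                found = visit(target)
                if found:
                    return found
        path.pop()
        state[name] = DONE
        return None

    for name in list(graph):
        if state[name] == UNSEEN:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical wiring: the manager delegates to a and b, which delegate to each other.
cycle = find_delegation_cycle({"manager": ["a", "b"], "a": ["b"], "b": ["a"]})
```

Here `cycle` lists the agents on the loop, starting and ending at the same name. A graph where the sub-agents delegate to nobody returns `None`.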

Execution Without Limits

Agents without an explicit step limit and a wall-clock timeout can run far longer, and cost far more, than intended.

Vulnerable

Unlimited reasoning steps

from smolagents import ToolCallingAgent

agent = ToolCallingAgent(
  tools=tools,
  model=model
  # No max_steps - runs forever
)

agent.run("Solve this complex problem")

Secure

Step limits with timeout

from smolagents import ToolCallingAgent
import asyncio

agent = ToolCallingAgent(
  tools=tools,
  model=model,
  max_steps=10,  # Maximum reasoning steps
  verbosity_level=0  # Quiet in production
)

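# Additional wall-clock timeout. Note: this stops waiting for the result,
# but the worker thread keeps running until agent.run returns.
async def safe_run(agent, task, timeout=120):
  try:
      return await asyncio.wait_for(
          asyncio.to_thread(agent.run, task),
          timeout=timeout
      )
  except asyncio.TimeoutError:
      return "Agent execution timed out"

# Top-level await is not valid in a plain script, so drive the coroutine explicitly
result = asyncio.run(safe_run(agent, "Solve this problem"))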
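A timeout wrapped around a blocking call in a thread stops the caller from waiting, but it does not interrupt the work itself. A loop that checks its own budgets between steps does stop the work. The sketch below is a generic illustration of combining a step budget with a deadline. It does not reproduce Smolagents' internal loop, and `run_with_budget` and the `step` callback are hypothetical names.

```python
import time
from typing import Callable, Optional


def run_with_budget(
    step: Callable[[int], Optional[str]],
    max_steps: int = 10,
    max_seconds: float = 120.0,
) -> str:
    """Call step(i) until it returns an answer or a budget runs out."""
    deadline = time.monotonic() + max_seconds
    for i in range(max_steps):
        # The deadline is checked between steps; a step in progress is not interrupted.
        if time.monotonic() >= deadline:
            return "Stopped: time budget exhausted"
        answer = step(i)
        if answer is not None:
            return answer
    return "Stopped: step budget exhausted"


# Hypothetical step function that produces an answer on its third call.
result = run_with_budget(lambda i: "done" if i == 2 else None, max_steps=5)
```

Because the deadline is only checked between steps, each individual step (a model call or a tool call) still needs its own timeout.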

LocalPythonExecutor Bypass

A sandbox is only as strict as its configuration: allowing modules such as os or subprocess gives the agent back the host access the sandbox was meant to remove.

Vulnerable

Dangerous modules in whitelist

from smolagents.local_python_executor import LocalPythonExecutor

executor = LocalPythonExecutor(
  allowed_modules=["os", "subprocess"]  # Dangerous modules!
)

# Agent can still run: os.system("malicious")

Secure

Minimal safe modules with strict limits

from smolagents.local_python_executor import LocalPythonExecutor

# Minimal safe modules only
SAFE_MODULES = [
  "math", "statistics", "datetime", "json",
  "re", "collections", "itertools"
]

executor = LocalPythonExecutor(
  allowed_modules=SAFE_MODULES,
  allowed_builtins=["len", "str", "int", "float", "list", "dict", "range", "sum"],
  max_execution_time=10,
  max_memory_mb=50,
  max_output_length=1000
)

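# Validate before use. This is a coarse pre-filter only: substring checks are
# easy to bypass (e.g. "import  os" with two spaces), so keep the executor limits.
def is_safe_code(code: str) -> bool:
  dangerous = ["import os", "import subprocess", "exec(", "eval(", "__import__"]
  return not any(d in code for d in dangerous)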
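A substring blocklist like `is_safe_code` misses trivial variations in spacing or import style. Checking the parsed syntax tree is somewhat sturdier. The sketch below uses Python's standard `ast` module to flag imports outside an allowlist and a few risky names. `find_violations` and `BLOCKED_NAMES` are hypothetical names chosen for this example, and the blocked set is illustrative, not exhaustive. This is a pre-filter that complements the executor's restrictions and is not a sandbox on its own.

```python
import ast
from typing import List, Set

SAFE_MODULES = {"math", "statistics", "datetime", "json", "re", "collections", "itertools"}
# Illustrative, non-exhaustive set of names to reject outright.
BLOCKED_NAMES = {"exec", "eval", "compile", "__import__", "open", "getattr"}


def find_violations(code: str, allowed_modules: Set[str] = SAFE_MODULES) -> List[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations: List[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in allowed_modules:
                    violations.append(f"import of '{alias.name}'")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            # Relative imports (level > 0) are rejected as well.
            if node.level > 0 or root not in allowed_modules:
                violations.append(f"import from '{node.module}'")
        elif isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            violations.append(f"use of '{node.id}'")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            violations.append(f"dunder attribute '{node.attr}'")
    return violations
```

For example, `find_violations("import  os")` flags the import even though the string does not contain the substring `"import os"`.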

Best Practices

Always use LocalPythonExecutor with CodeAgent
Whitelist only safe modules (math, json, datetime)
Set max_steps on all agents (recommended: 5-15)
Limit managed agent calls with counters
Add execution timeouts with asyncio
Validate generated code before execution

CLI Examples


# Scan Smolagents project
inkog scan ./my-smolagents-app
 
# Focus on code execution risks
inkog scan . -severity critical
 
# Verbose debugging
inkog scan . -verbose

Related

Code Injection
AutoGen - Similar code execution patterns
OWASP LLM08: Excessive Agency
