AutoGen (AG2)
Static analysis for Microsoft AutoGen applications to detect conversation loops, unsafe code execution, and missing termination conditions.
Quick Start
inkog scan ./my-autogen-appWhat Inkog Detects
| Finding | Severity | Description |
|---|---|---|
| GroupChat Loop | CRITICAL | Agents chatting forever without termination |
| Code Execution Risk | CRITICAL | UserProxyAgent with unrestricted code execution |
| No Auto Reply Limit | HIGH | Missing max_consecutive_auto_reply |
| No Termination | HIGH | human_input_mode="NEVER" without safeguards |
| Nested Chat Depth | HIGH | Unbounded nested conversation depth |
GroupChat Infinite Loops
GroupChat without termination conditions allows agents to converse indefinitely.
Vulnerable
No round limit or termination check
from autogen import GroupChat, GroupChatManager
groupchat = GroupChat(
agents=[agent1, agent2, agent3],
messages=[]
# No max_round, no termination condition
)
manager = GroupChatManager(groupchat=groupchat)
# Agents chat forever
agent1.initiate_chat(manager, message="Start discussion")Secure
Round limit and termination message detection
from autogen import GroupChat, GroupChatManager
def is_termination_msg(msg):
content = msg.get("content", "").lower()
return any(word in content for word in ["TERMINATE", "DONE", "COMPLETE"])
groupchat = GroupChat(
agents=[agent1, agent2, agent3],
messages=[],
max_round=10 # Maximum 10 conversation rounds
)
manager = GroupChatManager(
groupchat=groupchat,
is_termination_msg=is_termination_msg
)
agent1.initiate_chat(
manager,
message="Start discussion",
max_turns=15 # Additional safety limit
)UserProxyAgent Code Execution
UserProxyAgent can execute arbitrary code, creating severe security risks.
Vulnerable
Unrestricted code execution on host
from autogen import UserProxyAgent
user_proxy = UserProxyAgent(
name="user",
human_input_mode="NEVER",
code_execution_config={
"work_dir": ".",
"use_docker": False # Direct execution!
}
)
# Assistant can ask to run ANY code
# UserProxy executes without reviewSecure
Disabled or Docker-sandboxed execution
from autogen import UserProxyAgent
user_proxy = UserProxyAgent(
name="user",
human_input_mode="TERMINATE", # Require approval
code_execution_config=False # Disable code execution
)
# Or use Docker sandbox
user_proxy = UserProxyAgent(
name="user",
human_input_mode="NEVER",
code_execution_config={
"work_dir": "/sandbox",
"use_docker": True, # Sandboxed execution
"timeout": 60, # 60 second timeout
"last_n_messages": 3 # Limit context
}
)Missing Auto-Reply Limits
Without max_consecutive_auto_reply, agents auto-respond indefinitely.
Vulnerable
Unlimited back-and-forth messaging
assistant = AssistantAgent(
name="assistant",
llm_config=llm_config
# No max_consecutive_auto_reply
)
user = UserProxyAgent(
name="user",
human_input_mode="NEVER"
# No max_consecutive_auto_reply
)
# Can ping-pong forever
user.initiate_chat(assistant, message="Hello")Secure
Limited auto-replies per agent
assistant = AssistantAgent(
name="assistant",
llm_config=llm_config,
max_consecutive_auto_reply=5 # Stop after 5 auto-replies
)
user = UserProxyAgent(
name="user",
human_input_mode="NEVER",
max_consecutive_auto_reply=5
)
# Conversation ends after 5 exchanges max
user.initiate_chat(
assistant,
message="Hello",
max_turns=10 # Belt and suspenders
)NEVER Mode Without Safeguards
human_input_mode="NEVER" is dangerous without additional protections.
Vulnerable
Fully autonomous with no safety checks
user = UserProxyAgent(
name="user",
human_input_mode="NEVER", # No human oversight
code_execution_config={"use_docker": False}
)
# Agent runs autonomously with code execution
# No human approval, no limitsSecure
Termination conditions and operation filtering
def custom_human_input(prompt):
"""Auto-approve safe operations, escalate dangerous ones."""
if "delete" in prompt.lower() or "rm " in prompt.lower():
return "STOP" # Halt on dangerous operations
return "" # Auto-continue for safe operations
user = UserProxyAgent(
name="user",
human_input_mode="NEVER",
code_execution_config=False,
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: "DONE" in x.get("content", "")
)
# Or use TERMINATE mode for critical operations
user_for_critical = UserProxyAgent(
name="user_critical",
human_input_mode="TERMINATE" # Always require approval
)Nested Chat Depth
Nested chats can create recursion without depth limits.
Vulnerable
Unbounded recursion through nested chats
# Nested chats without depth tracking
def nested_chat_function(message):
inner_chat = UserProxyAgent(...)
inner_chat.initiate_chat(...) # Can nest indefinitely
agent = AssistantAgent(
name="assistant",
llm_config=llm_config
)
agent.register_nested_chats([nested_chat_function])Secure
Explicit depth tracking and limits
# Track and limit nesting depth
class DepthTracker:
def __init__(self, max_depth=3):
self.current_depth = 0
self.max_depth = max_depth
depth = DepthTracker()
def nested_chat_function(message):
if depth.current_depth >= depth.max_depth:
return {"content": "Max depth reached", "terminate": True}
depth.current_depth += 1
try:
# Execute nested chat
result = inner_chat.initiate_chat(...)
finally:
depth.current_depth -= 1
return result
agent.register_nested_chats([nested_chat_function])Token Limits
Conversations can consume unlimited tokens without controls.
Vulnerable
No control over token consumption
llm_config = {
"model": "gpt-4",
"api_key": api_key
# No token limits
}
assistant = AssistantAgent(
name="assistant",
llm_config=llm_config
)Secure
Response limits and token tracking
llm_config = {
"model": "gpt-4",
"api_key": api_key,
"max_tokens": 1000, # Limit response tokens
"temperature": 0.7
}
# Configure token tracking
from autogen import token_count_utils
def check_token_budget(messages, max_tokens=10000):
total = sum(token_count_utils.count_token(m) for m in messages)
return total < max_tokens
assistant = AssistantAgent(
name="assistant",
llm_config=llm_config,
system_message="Be concise. Limit responses to key points."
)Best Practices
- Set
max_roundon GroupChat (recommended: 10-20) - Define
is_termination_msgto detect conversation end - Set
max_consecutive_auto_replyon all agents (recommended: 5-10) - Disable code execution or use Docker sandboxing
- Avoid
human_input_mode="NEVER"without additional safeguards - Track nested chat depth with explicit counters
CLI Examples
# Scan AutoGen project
inkog scan ./my-autogen-app
# Check for code execution risks
inkog scan . -severity critical
# Verbose output for debugging
inkog scan . -verboseRelated
- CrewAI - Alternative multi-agent framework
- Code Injection
- Resource Exhaustion
Last updated on