
xbot Architecture

Global Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                          User Message Entry                          │
│  Feishu / QQ / NapCat / Web / CLI / Cron / !bang_command           │
└─────────────┬───────────────────────────────────────────────────────┘
              │ InboundMessage (cap=64)
              ▼
┌─────────────────────────────────────────────────────────────────────┐
│                        Agent.processMessage()                        │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐  │
│  │ CommandRegistry│ → │ Pipeline     │ → │ engine.Run()          │  │
│  │ slash/! cmds   │    │ middleware    │    │ LLM ↔ Tool loop      │  │
│  └──────────────┘    └──────────────┘    └──────────────────────┘  │
│         │                    │                     │                 │
│         │              system prompt           context mgmt         │
│         │              + memory injection      ├─ Offload (Layer 1) │
│         │              + tool catalog          ├─ Masking (Layer 2) │
│         │              + user message          ├─ ContextEdit (L3)  │
│         │                                     └─ Compress          │
│         ▼                                                           │
│  OutboundMessage → Dispatcher → Channel.Send()                      │
└─────────────────────────────────────────────────────────────────────┘
              │
     ┌────────┴────────┐
     ▼                 ▼
  SQLite (.xbot/xbot.db)    Vector DB (.xbot/vectordb/)
  sessions/memory/config    Archival memory embeddings

1. Overall Architecture

1.1 Entry Points

xbot has three independent executables:

| Binary | Entry File | Purpose |
| --- | --- | --- |
| xbot (server) | main.go | Main service: LLM Agent engine + multi-channel message gateway |
| xbot-cli | cmd/xbot-cli/main.go | Standalone terminal chat UI (single user, direct filesystem access) |
| xbot-runner | cmd/runner/main.go | Remote sandbox runner: connects to server over WebSocket |

1.2 Server Startup Flow

config.Load() → createLLM() → bus.NewMessageBus()
→ storage.MigrateIfNeeded() → OAuth init
→ tools.InitSandbox() → agent.New(cfg)        ← sandbox initialized before Agent (sync.Once)
→ register tools → IndexGlobalTools()
→ channel.NewDispatcher(msgBus) → register channels
→ start Agent event loop and channels

1.3 Package Map

xbot/
├── main.go                  # Server entry point
├── cmd/
│   ├── xbot-cli/            # Standalone CLI client
│   └── runner/              # Remote sandbox runner
├── config/                  # Configuration system (JSON + env vars)
├── bus/                     # Message bus (InboundMessage / OutboundMessage)
├── channel/                 # Channel layer (abstract + implementations + Dispatcher)
├── agent/                   # Agent core engine
├── llm/                     # LLM abstraction (OpenAI / Anthropic / Proxy)
├── session/                 # Multi-tenant session management
├── memory/                  # Memory system (flat / letta)
├── storage/
│   ├── sqlite/              # SQLite data service layer
│   └── vectordb/            # Vector database (chromem-go)
├── tools/                   # Tool system (including Worktree, config, tui_control)
├── plugin/                  # Plugin system (RPC bridge, runtimes, extension points)
├── internal/
│   ├── runnerclient/        # Runner client (runner-side)
│   └── runnerproto/         # Runner protocol definitions
├── serverapp/               # Server core, RPC handlers
├── oauth/                   # OAuth service
├── event/                   # Event system (webhook receive + trigger routing)
├── cron/                    # Scheduled task scheduler
├── crypto/                  # Encryption utilities
├── prompt/                  # Embedded prompt templates (go:embed)
├── version/                 # Version info
├── agents/                  # [runtime] Agent role definitions (~/.xbot/agents/), not a source package
└── web/                     # Frontend static assets (React 19 + Vite + TailwindCSS 4 + Tiptap)

1.4 Message Flow

User Message
  │
  ▼
Channel Implementation (feishu / qq / napcat / web / cli)
  │ Parse platform message → construct bus.InboundMessage
  │ Inject into bus.Inbound (cap=64 buffer)
  ▼
Agent.processMessage()
  │ 1. Normalize SenderID (singleUser mode → "default")
  │ 2. CommandRegistry.Match() — slash/! command dispatch
  │    ├─ Concurrent()=true → goroutine (does not consume semaphore)
  │    └─ Concurrent()=false → msgCh serial queue (cap=32)
  │ 3. Get or create TenantSession
  │ 4. Fetch message history + eager-save user message
  │ 5. Pipeline.Run() → assemble system prompt + history + user message
  │ 6. engine.Run()
  ▼
engine.Run() — core loop (see §2.2)
  │
  ▼
OutboundMessage → bus.Outbound → Dispatcher.Run()
  │
  ▼
Channel.Send(msg) → platform API

1.5 Channel Abstraction

Interface:

type Channel interface {
    Name() string
    Start() error
    Stop()
    Send(msg bus.OutboundMessage) (string, error)
}

Dispatcher: reads from bus.Outbound, routes by msg.Channel to the corresponding Channel. Supports observer pattern and SendDirect() for synchronous sends.
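
The routing described above can be sketched as follows. This is a minimal illustration, not xbot's actual Dispatcher: the OutboundMessage fields, the Register method, and the memChannel type are assumptions made for the example, and the observer pattern is left out.

```go
package main

import "fmt"

// OutboundMessage is a simplified stand-in for bus.OutboundMessage (fields assumed).
type OutboundMessage struct {
	Channel string // target channel name, e.g. "web" or "cli"
	ChatID  string
	Content string
}

// Channel mirrors the interface above, using the local message type.
type Channel interface {
	Name() string
	Start() error
	Stop()
	Send(msg OutboundMessage) (string, error)
}

// Dispatcher routes outbound messages to the channel named in msg.Channel.
type Dispatcher struct {
	channels map[string]Channel
}

func NewDispatcher() *Dispatcher {
	return &Dispatcher{channels: make(map[string]Channel)}
}

// Register makes a channel routable under its own name (hypothetical helper).
func (d *Dispatcher) Register(ch Channel) { d.channels[ch.Name()] = ch }

// SendDirect delivers one message synchronously and returns the channel's message ID.
func (d *Dispatcher) SendDirect(msg OutboundMessage) (string, error) {
	ch, ok := d.channels[msg.Channel]
	if !ok {
		return "", fmt.Errorf("no channel registered for %q", msg.Channel)
	}
	return ch.Send(msg)
}

// Run drains the outbound queue until it is closed.
func (d *Dispatcher) Run(outbound <-chan OutboundMessage) {
	for msg := range outbound {
		if _, err := d.SendDirect(msg); err != nil {
			fmt.Println("dispatch error:", err)
		}
	}
}

// memChannel is a toy channel that records what it was asked to send.
type memChannel struct {
	name string
	sent []string
}

func (c *memChannel) Name() string { return c.name }
func (c *memChannel) Start() error { return nil }
func (c *memChannel) Stop()        {}
func (c *memChannel) Send(msg OutboundMessage) (string, error) {
	c.sent = append(c.sent, msg.Content)
	return fmt.Sprintf("%s-%d", c.name, len(c.sent)), nil
}

func main() {
	d := NewDispatcher()
	cli := &memChannel{name: "cli"}
	d.Register(cli)

	outbound := make(chan OutboundMessage, 64) // same capacity as the bus buffer
	outbound <- OutboundMessage{Channel: "cli", Content: "hello"}
	outbound <- OutboundMessage{Channel: "fax", Content: "unroutable"}
	close(outbound)
	d.Run(outbound)
	fmt.Println(cli.sent)
}
```

Keeping the lookup in SendDirect and having Run call it means the asynchronous and synchronous paths share one routing rule.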

| Channel | Protocol | Reconnect Strategy |
| --- | --- | --- |
| Feishu | Lark SDK (WebSocket) | SDK managed |
| QQ | QQ Bot WebSocket | Exponential backoff (1s→60s), fast disconnect detection (3 in 5s → 60s cooldown), Intent downgrade, Resume support |
| NapCat | OneBot 11 WebSocket | Exponential backoff (same as QQ), 30s long-connection reset |
| Web | HTTP + WebSocket | Client-side reconnect; server-side WS Ping 30s, ReadDeadline 60s, offline ringBuffer(50) |
| CLI | Terminal | N/A |

1.6 Slash Commands & Bang Commands

CommandRegistry (agent/command.go): linear scan matching in registration order, first match wins.

Bang commands (! prefix): execute shell commands directly inside the sandbox, marked Concurrent()=true (does not consume semaphore). Timeout 120s, output > 4000 chars written to file.
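
A minimal sketch of first-match-wins dispatch, assuming prefix matching and a plain struct for commands. The real registry in agent/command.go may match differently, and it exposes Concurrent() as a method rather than the field used here.

```go
package main

import (
	"fmt"
	"strings"
)

// Command is a hypothetical shape for a registered command.
type Command struct {
	Name       string
	Prefix     string // text the message must start with
	Concurrent bool   // true: run in its own goroutine, bypassing the serial queue
}

// CommandRegistry keeps commands in registration order.
type CommandRegistry struct {
	cmds []Command
}

func (r *CommandRegistry) Register(c Command) { r.cmds = append(r.cmds, c) }

// Match scans linearly and returns the first command whose prefix matches.
func (r *CommandRegistry) Match(text string) (Command, bool) {
	for _, c := range r.cmds {
		if strings.HasPrefix(text, c.Prefix) {
			return c, true
		}
	}
	return Command{}, false
}

func main() {
	r := &CommandRegistry{}
	r.Register(Command{Name: "compress", Prefix: "/compress"})
	r.Register(Command{Name: "new", Prefix: "/new"})
	r.Register(Command{Name: "bang", Prefix: "!", Concurrent: true})

	for _, text := range []string{"/new", "!ls -la", "hello"} {
		if c, ok := r.Match(text); ok {
			fmt.Printf("%q -> %s (concurrent=%v)\n", text, c.Name, c.Concurrent)
		} else {
			fmt.Printf("%q -> no command, continues to the pipeline\n", text)
		}
	}
}
```

Because the first match wins, registration order matters: a broad prefix registered early shadows any more specific command registered after it.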


2. Agent / Loop Architecture

2.1 Agent Struct

Agent (agent/agent.go) is the system core, holding:

  • bus — message bus reference
  • multiSession — multi-tenant session management
  • tools — tool registry
  • contextManager — context manager (interface, hot-swappable at runtime)
  • offloadStore — large result offload storage
  • maskStore — observation mask storage
  • contextEditor — context editor
  • llmFactory — user-custom LLM factory
  • skills/agents — skill/agent directories
  • registryManager — marketplace/sharing
  • settingsSvc — user settings
  • interactiveSubAgents — interactive SubAgent session pool
  • hookChain — tool execution hook chain (shared with SubAgents; replaced by agent/hooks.Manager)
  • bgTaskMgr — background task manager
  • pluginRegistry — plugin registry

2.2 Engine Run Loop

engine.Run() in agent/engine.go is the unified agent loop, shared by the main Agent and SubAgents. Its internal sub-functions live in agent/engine_run.go (runState, compression, LLM calls, etc.), while tool execution and SubAgent spawning live in agent/engine_wire.go.

RunConfig differentiated injection:

  • Main Agent: ToolExecutor=full (session MCP + hooks), ProgressNotifier=sendMessage, ContextManager=global, Memory=enabled, EnableReadWriteSplit=true
  • SubAgent: ToolExecutor=simplified, ProgressNotifier=nil, ContextManager=independent phase1, Memory=optional

Core loop logic (engine.go):

for i := 0; i < maxIter; i++ {
    ① maybeCompress()              ← 75% token threshold + 5 round cooldown
    ② LLM Generate()               ← perAttemptCtx 120s independent timeout
    ③ No tool_calls → exit loop
    ④ Tool execution (read/write split parallel or serial)
    ⑤ MaybeOffload()               ← Layer 1: large result offload
    ⑥ InvalidateStaleReads()       ← detect stale Read offloads
    ⑦ PurgeStaleMessages()         ← replace stale offloads with warnings
    ⑧ DynamicContext injection     ← stale detection results
    ⑨ System Reminder injection    ← strip old reminder each round
    ⑩ Incremental persist: Session.AddMessage()
}

Token count recovery: from in-memory lastPromptTokens (in-process), or from tenant_state table (after restart). Written back to DB + memory after Run completes.

Read/write split parallel (EnableReadWriteSplit):

  • Phase 0: SubAgent concurrent (constrained by SubAgentSem)
  • Phase 1: Read-only tools parallel (Read/Grep/Glob/WebSearch/ChatHistory, maxParallel=8)
  • Phase 2: Write tools serial

Read-only tool set (hardcoded readOnlyTools map; new additions require manual updates).
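
The read/write split can be sketched like this. It is an illustration under assumptions, not the code in agent/engine_wire.go: Phase 0 (SubAgents) is omitted, and ToolCall, runSplit, and the exec callback are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

const maxParallel = 8 // cap on concurrently running read-only tools

// readOnlyTools mirrors the hardcoded set listed above.
var readOnlyTools = map[string]bool{
	"Read": true, "Grep": true, "Glob": true, "WebSearch": true, "ChatHistory": true,
}

// ToolCall is a simplified stand-in for an LLM tool call.
type ToolCall struct {
	ID   string
	Name string
}

// runSplit executes read-only calls in parallel, then write calls serially.
// Results are returned in the original call order.
func runSplit(calls []ToolCall, exec func(ToolCall) string) []string {
	results := make([]string, len(calls))
	sem := make(chan struct{}, maxParallel)
	var wg sync.WaitGroup

	// Phase 1: read-only tools, bounded parallelism.
	for i, c := range calls {
		if !readOnlyTools[c.Name] {
			continue
		}
		wg.Add(1)
		sem <- struct{}{} // blocks once maxParallel calls are in flight
		go func(i int, c ToolCall) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = exec(c) // each goroutine writes a distinct index
		}(i, c)
	}
	wg.Wait()

	// Phase 2: everything else, one at a time, in call order.
	for i, c := range calls {
		if readOnlyTools[c.Name] {
			continue
		}
		results[i] = exec(c)
	}
	return results
}

func main() {
	calls := []ToolCall{{"1", "Read"}, {"2", "Shell"}, {"3", "Grep"}, {"4", "FileCreate"}}
	out := runSplit(calls, func(c ToolCall) string { return c.Name + "#" + c.ID })
	fmt.Println(out)
}
```

A tool missing from the map falls into the serial phase, so forgetting to update the set costs parallelism rather than correctness.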

2.3 Pipeline Middleware Chain

Executed in priority order, assembling system prompt parts into the SystemParts map, finally concatenated by key dictionary order:

| Order | Middleware | Priority | SystemParts Key | Responsibility |
| --- | --- | --- | --- | --- |
| 1 | SystemPromptMiddleware | 0 | 00_base | Render prompt.md (hot reload) |
| 2 | ProjectContextMiddleware | 5 | 04_global_context, 05_project_context | Inject global (~/.xbot/AGENTS.md) + project (.xbot/AGENTS.md) instructions |
| 3 | ChannelPromptMiddleware | 5 | channel-specific (e.g. 05_channel_xxx) | Inject channel-specific system prompt parts |
| 4 | SkillsCatalogMiddleware | 100 | 10_skills | Available skills catalog |
| 5 | AgentsCatalogMiddleware | 110 | 15_agents | Available agents catalog |
| 6 | PermissionControlMiddleware | 115 | 14_perm_control | Read/write/tool permission control |
| 7 | MemoryMiddleware | 120 | 20_memory | Long-term memory (Recall) |
| 8 | SenderInfoMiddleware | 130 | 30_sender | Sender name |
| 9 | LanguageMiddleware | 135 | 32_language | Language instructions |
| 10 | pluginEnricherMiddleware | 150 | plugin_enrichers | Plugin-enriched system prompt content |
| 11 | UserMessageMiddleware | 200 |  | Timestamp + guide text |

CacheHint: system prompt marked "static", converted to cache_control: {type: "ephemeral"} under Anthropic.
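
The final concatenation step of the pipeline can be sketched as below: middleware run order and prompt order are independent, because the SystemParts map is joined by sorted key. The function name and the blank-line separator are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// assembleSystemPrompt joins the parts in dictionary order of their keys.
// The "\n\n" separator is an assumption made for this sketch.
func assembleSystemPrompt(parts map[string]string) string {
	keys := make([]string, 0, len(parts))
	for k := range parts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for i, k := range keys {
		if i > 0 {
			b.WriteString("\n\n")
		}
		b.WriteString(parts[k])
	}
	return b.String()
}

func main() {
	parts := map[string]string{
		"20_memory":       "[memory]",
		"00_base":         "[base prompt]",
		"15_agents":       "[agents catalog]",
		"14_perm_control": "[permissions]",
	}
	fmt.Println(assembleSystemPrompt(parts))
}
```

With these keys, 14_perm_control lands before 15_agents in the prompt even though PermissionControlMiddleware (priority 115) runs after AgentsCatalogMiddleware (priority 110).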

2.4 Multi-Tenant Session Management

MultiTenantSession (session/multitenant.go): isolates sessions by (channel, chatID), internal cache supports 24h TTL expiry, no capacity limit.

TenantSession (session/tenant.go): holds tenantID, MemoryProvider, SessionMCPManager, CWD state.

Tenant isolation:

  • IM users: (channel, chatID) → positive tenantID
  • SubAgent: SHA-256 hash → negative tenantID

2.5 SubAgent Mechanism

One-shot SubAgent: created via bus.InboundMessage{Channel="agent", AllowedTools, SystemPrompt, RoleName}. Optional capabilities: memory, send_message, spawn_agent.

Interactive SubAgent: persistent sessions stored in sync.Map, supporting multi-turn dialog, 30 min inactivity auto-cleanup. Key format: channel:chatID/roleName[:instance].


3. Context Architecture

3.1 Message Types

type ChatMessage struct {
    Role             string     // "system" / "user" / "assistant" / "tool"
    Content          string     // LLM-visible content
    ReasoningContent string     // DeepSeek/OpenAI reasoning models
    ToolCallID/Name/Arguments   // Tool call/result identifiers
    ToolCalls        []ToolCall // Assistant's tool call list
    Detail           string     // Full content (persisted to DB, used by Web/CLI display)
    DisplayOnly      bool       // Display only (not loaded into LLM context)
    CacheHint        string     // "static" = cacheable across requests
}

3.2 Four-Layer Context Management Strategy

The four layers form a defense-in-depth strategy, triggered at different points in engine.Run():

Single tool result too large
    ↓ exceeds 2000 tokens or 10240 bytes
Layer 1: Offload → replace with 📂 [offload:ol_xxxx] + summary
    ↓ auto-marked STALE when file modified

Too many old tool results
    ↓ triggered after each compress (token reaches 60%)
Layer 2: Masking → replace with 📂 [masked:mk_xxxx] + summary
    ↓ grouped by consecutive tool groups, keep recent active groups

LLM-initiated editing
    ↓ any time
Layer 3: ContextEdit → precise delete/truncate/replace

Overall context too large
    ↓ totalTokens >= 75% maxTokens
Layer 4: Compress → LLM structured compression

Layer 1: Offload (Large Result Offload)

  • Trigger: MaybeOffload() — single tool result exceeds threshold
  • Storage: filesystem .xbot/offload_store/{sessionKey}/id.json
  • Summary generation: rule-based summary per tool type (Read/Grep/Shell/Glob/default)
  • Read special: computes ContentHash (SHA-256) for stale detection
  • Stale detection: re-read file after each tool execution, compare hash; mismatch → replace with expiry warning
  • Recall: offload_recall tool, supports pagination (offset/limit, max 16000 runes)
  • Anti-recursion: results from offload_recall and recall_masked are not offloaded again
  • Read offset protection: Read results with offset/limit are not offloaded
  • SubAgent path: SubAgent offload data stored under parent session directory (RootSessionKey)
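
A minimal sketch of the offload decision and the Read stale check, using the thresholds stated above. The function names are hypothetical, the token count is taken as an input rather than computed, and the Read offset/limit exemption is not modeled.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	offloadTokenThreshold = 2000  // tokens
	offloadByteThreshold  = 10240 // bytes
)

// shouldOffload reports whether a tool result is large enough to offload.
// Results of the recall tools are never offloaded again (anti-recursion).
func shouldOffload(toolName string, tokens int, content string) bool {
	if toolName == "offload_recall" || toolName == "recall_masked" {
		return false
	}
	return tokens > offloadTokenThreshold || len(content) > offloadByteThreshold
}

// contentHash returns the hex SHA-256 of the file content captured at offload time.
func contentHash(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// isStale compares the stored hash with the hash of the file as it is now.
func isStale(storedHash string, current []byte) bool {
	return contentHash(current) != storedHash
}

func main() {
	original := []byte("package main\n")
	stored := contentHash(original)

	fmt.Println(shouldOffload("Read", 2500, string(original)))  // large by tokens
	fmt.Println(shouldOffload("offload_recall", 2500, "..."))   // exempt tool
	fmt.Println(isStale(stored, original))                      // unchanged file
	fmt.Println(isStale(stored, []byte("package main // v2\n"))) // modified file
}
```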

Layer 2: Observation Masking

  • Trigger: MaskOldToolResults() — called inside maybeCompress() (token reaches 60%)
  • Grouping strategy: assistant[tool_calls] + consecutive tool messages = 1 group
  • Retention strategy: keepGroups dynamically computed by ratio (3/5/8/12 groups)
  • Active file protection: files touched in last 3 turns are not masked
  • Collapse optimization: pure tool groups (assistant.Content empty) collapsed to a single message pair, reducing message count
  • Storage: in-memory ObservationMaskStore, FIFO eviction, dual limit (200 entries / 2MB chars)
  • Recall: recall_masked tool
  • Post-compress cleanup: CleanOldEntries(cutoff) removes records prior to compression point

Layer 3: Context Edit

  • Tool: context_edit — called by LLM proactively
  • Operations: delete_turn / delete / truncate / replace (regex support)
  • Safety: cannot edit system messages, cannot delete last 3 messages

3.3 Context Compression

Trigger conditions (four paths):

| Path | Condition | Method |
| --- | --- | --- |
| Auto compress | totalTokens >= 75% maxTokens + 5 rounds cooldown | cm.Compress() |
| Input too long | LLM returns InputTooLong error | cm.ManualCompress() |
| Context window exceeded | finish_reason = context_window_exceeded | cm.Compress() |
| Manual /compress | User command | cm.ManualCompress() |
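
The auto-compress condition in the first row can be sketched as a small gate. The type and its fields are hypothetical, and the exact cooldown semantics (rounds counted since the last compression) are an assumption.

```go
package main

import "fmt"

const (
	compressThresholdPct = 75 // percent of maxTokens
	cooldownRounds       = 5  // rounds to wait after a compression
)

// compressGate decides whether the auto-compress path should fire.
type compressGate struct {
	maxTokens     int
	hasCompressed bool
	lastRound     int // round of the most recent compression
}

// shouldCompress applies the threshold first, then the cooldown.
func (g *compressGate) shouldCompress(totalTokens, round int) bool {
	if totalTokens*100 < g.maxTokens*compressThresholdPct {
		return false
	}
	if g.hasCompressed && round-g.lastRound < cooldownRounds {
		return false
	}
	return true
}

// markCompressed records that compression ran in the given round.
func (g *compressGate) markCompressed(round int) {
	g.hasCompressed = true
	g.lastRound = round
}

func main() {
	g := &compressGate{maxTokens: 100000}
	for round, tokens := range []int{50000, 76000, 80000, 90000} {
		if g.shouldCompress(tokens, round) {
			fmt.Printf("round %d: compress at %d tokens\n", round, tokens)
			g.markCompressed(round)
		} else {
			fmt.Printf("round %d: skip at %d tokens\n", round, tokens)
		}
	}
}
```

The cooldown only guards this automatic path; the other three rows of the table bypass it.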

Phase 1 compression flow:

  1. Find tail cut point (last user/assistant message)
  2. Select messages to compress from tail backwards by token budget
  3. LLM call (max 10 multi-turn rounds, may call memory tools to organize memories)
  4. Build results: LLMView + SessionView (excluding system/display_only)
  5. Persistence: Session.Clear() + per-message AddMessage(SessionView)
  6. Cleanup: delete offload files and mask entries prior to compression point

3.4 Message Persistence

SessionService: appends to session_messages table.

| Timing | Method |
| --- | --- |
| User message | eager-save before Run() |
| After each iteration | incremental persist messages[lastPersistedCount:] (skip system + strip reminder) |
| After compression | Session.Clear() + per-message AddMessage(SessionView) |
| After masking | UpdateMessageContent(idx) in-place update |
| After cancel | save interrupted iteration progress |
| After Run() returns | save final assistant reply |

GetHistory: uses user messages as turn boundaries, fetches messages starting from the Nth user message.


4. Storage Architecture

4.1 SQLite

Single SQLite file, WAL mode, MaxOpenConns=1 (SQLite single-writer model), BusyTimeout=5000ms. Schema Version 21.

| Table | Purpose |
| --- | --- |
| tenants | Tenants (channel+chatID → tenantID) |
| session_messages | Session message history |
| tenant_state | Tenant state (last_consolidated, token counts) |
| long_term_memory | Long-term memory (flat mode) |
| event_history | Event history (Letta recall, FTS5 index) |
| user_profiles | User profiles |
| core_memory_blocks | Core memory blocks (Letta persona/human/working_context) |
| archival_memory | Archival memory (vector index BLOB) |
| cron_jobs | Scheduled tasks |
| runners | Runner management |
| shared_registry | Skill/Agent marketplace |
| web_users | Web user authentication |
| user_settings | User settings |
| user_llm_configs | User custom LLM configuration |
| user_token_usage | User token usage statistics |
| schema_version | Schema version |

Data migration: storage.MigrateIfNeeded() tracks version via schema_version table, executing incremental migrations step by step.

DB path: Server: {workDir}/.xbot/xbot.db, CLI: ~/.xbot/xbot.db. Data is not shared between the two modes.

4.2 Memory System

MemoryProvider interface:

type MemoryProvider interface {
    Recall(ctx, query) (string, error)
    Memorize(ctx, MemorizeInput) (MemorizeResult, error)
    Close() error
}

| Mode | Recall | Memorize | Use Case |
| --- | --- | --- | --- |
| flat | Returns long_term_memory full text directly | LLM compresses conversation → writes to long_term_memory | Simple scenarios |
| letta | Returns Core Memory structured blocks | LLM rethink → update Core + write to Archival | Complex long-term memory |

Memorize trigger timing (⚠️ NOT triggered after every conversation):

  • /new command (session switch) — ArchiveAll=true
  • SubAgent exit — ArchiveAll=true
  • During compression, called by LLM via memory tools

Letta Memory:

  • Core Memory: 3 structured blocks (persona / human / working_context), injected into system prompt
  • Archival Memory: embedding vector storage, retrieved on-demand via tools, with deduplication (search similarity > 0.5)
  • Recall Memory: time-range-based session history search (FTS5)

4.3 Vector Database

Uses chromem-go embedded vector database, persisted to .xbot/vectordb/. Supports OpenAI and Ollama embedding providers, with automatic truncation of overlong content.


5. Tool System

5.1 Tool Categories

| Category | Tools |
| --- | --- |
| File Operations | Read, FileCreate, FileReplace, Glob, Grep, Cd, DownloadFile |
| Command Execution | Shell |
| Search | WebSearch, ChatHistory |
| Memory | memory_write, memory_list (flat), core_memory_*, archival_memory_*, recall_memory_search (letta) |
| Context Management | offload_recall, recall_masked, context_edit |
| SubAgent & Collaboration | SubAgent (one-shot + Interactive), CreateChat, SendMessage, Worktree |
| Skill/Agent | Skill, ManageTools, search_tools |
| AI-Native Config | config (AI reads/writes config), tui_control (AI operates TUI) |
| Cards | card_create, card_add_content, card_add_interactive, card_add_container, card_preview, card_send |
| Tasks | todo_write/todo_list (cross-session persistent), Cron, task_status, task_kill, task_read |
| Management | Logs, AskUser, oauth_authorize, EventTrigger |
| Feishu MCP | 20+ tools (Bitable, Knowledge Base, Docs, Drive) |

5.2 Hooks System

The old ToolHook/HookChain has been replaced by Manager in agent/hooks/.

17 lifecycle events: SessionStart, SessionEnd, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure, PostToolBatch, PermissionRequest, PermissionDenied, SubAgentStart, SubAgentStop, AgentStop, AgentError, PreCompact, PostCompact, CronFired, WebhookReceived

4 handler types: command (shell command), http (HTTP POST), mcp_tool (MCP tool call), callback (Go function)

Decision priority (multi-handler conflicts): deny > defer > ask > allow

Configuration: three-layer merge, with later layers taking precedence (~/.xbot/hooks.json < <project>/.xbot/hooks.json < <project>/.xbot/hooks.local.json). The project-level file can be committed to git.

5.3 Tool Activation/Expiration

coreTools vs sessionActivated:

| Attribute | coreTools | sessionActivated |
| --- | --- | --- |
| Appear in tool definitions | Always | When activated and not expired |
| Expiry | Never | Auto-expire after 5 consecutive rounds unused |
| Flat mode | Same as above | Register = RegisterCore, never expires |

  • TickSession(): increment round on each new user message
  • TouchTool(): refresh active time before tool execution
  • DeactivateSession(): cleanup when session evicted from cache
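
The round-based expiry for session-activated tools can be sketched as below. The struct, the Activate and Active helpers, and the exact boundary (a tool expires once 5 rounds have passed since its last use) are assumptions; only TickSession and TouchTool are names from the text.

```go
package main

import (
	"fmt"
	"sort"
)

const expireAfterRounds = 5

// sessionTools tracks the round in which each activated tool was last used.
type sessionTools struct {
	round    int
	lastUsed map[string]int
}

func newSessionTools() *sessionTools {
	return &sessionTools{lastUsed: make(map[string]int)}
}

// Activate makes a tool visible starting from the current round.
func (s *sessionTools) Activate(name string) { s.lastUsed[name] = s.round }

// TouchTool refreshes a tool's last-used round; unknown tools are ignored.
func (s *sessionTools) TouchTool(name string) {
	if _, ok := s.lastUsed[name]; ok {
		s.lastUsed[name] = s.round
	}
}

// TickSession advances the round and drops tools unused for too long.
func (s *sessionTools) TickSession() {
	s.round++
	for name, used := range s.lastUsed {
		if s.round-used >= expireAfterRounds {
			delete(s.lastUsed, name)
		}
	}
}

// Active lists the tools that would still appear in the tool definitions.
func (s *sessionTools) Active() []string {
	names := make([]string, 0, len(s.lastUsed))
	for name := range s.lastUsed {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	s := newSessionTools()
	s.Activate("card_create")
	s.Activate("Cron")
	for i := 0; i < 5; i++ {
		s.TickSession()
		s.TouchTool("Cron") // Cron is used every round, card_create never
	}
	fmt.Println(s.Active())
}
```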

5.4 MCP Integration

Global MCP: GetMCPCatalog() tool name prefix mcp_{server}_{tool}. Session MCP: SessionMCPManager per-user independent, lazy-loaded (connect on first access), unload after 30 min inactivity.

Stub mechanism: unactivated MCP tools return Parameters() = nil (LLM invisible); activated = expose full schema. search_tools can fetch full schema for semantic search.

5.5 Sandbox System

Sandbox interface: Name, Workspace, Exec, ReadFile, WriteFile, Stat, ReadDir, MkdirAll, Close.

SandboxRouter (tools/sandbox_router.go): unified sandbox entry, routes per-user to different backends. Implements Sandbox and SandboxResolver interfaces. Routing rules (per-user, determined by user_settings.active_runner):

  • active_runner == "__docker__" → DockerSandbox (if enabled)
  • active_runner == specific remote runner name → corresponding RemoteSandbox connection (if connected)
  • Fallback: Remote → Docker → None

Supports simultaneously holding Docker and Remote instances (dual-mode), routes independently per user. Users can switch active runner in settings panel.

Routing strategy: Remote Runner → Docker → None priority.

Remote Runner / Multi-Runner: RemoteSandbox supports multiple simultaneous runner connections, each with independent name and token. Runners connect via WebSocket, token auth (subtle.ConstantTimeCompare), runners table manages tokens. Users select runner by name in settings (active_runner). Sync configuration supports syncing skills/agents directories to runner side.

DockerSandbox: per-user independent container, stop (not rm) for reuse on next session, DinD mode path translation.

RemoteSandbox: WebSocket server waits for runner connections, token auth (subtle.ConstantTimeCompare), request-response pattern, supports stdio streaming output and runner-local LLM.


6. Configuration System

6.1 Config Loading Priority

1. .env file (godotenv)              ← lowest
2. config.json (LoadFromFile, merge non-zero fields)
3. Env var overrides (applyEnvOverrides)  ← highest

6.2 XBOT_HOME & Path Resolution

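func XbotHome() string {
    dir := os.Getenv("XBOT_HOME")       // priority 1: env var
    if dir == "" {
        // Home directory lookup; os.UserHomeDir is assumed here, error ignored for brevity.
        home, _ := os.UserHomeDir()
        dir = filepath.Join(home, ".xbot")  // priority 2: ~/.xbot
    }
    os.MkdirAll(dir, 0o755)             // best effort: create the directory if missing
    return dir
}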

Used for: config.json, xbot.db, offload_store/, skills/, agents/.

6.3 Runtime Configuration

  • User settings: user_settings table, modified via Web/Feishu/TUI settings panels
  • LLM subscriptions: user_llm_subscriptions table (Server mode) or config.json subscriptions array (CLI mode) — single source of truth for LLM config. Supports Model Tiers (vanguard/balance/swift) and runtime switching
  • AI-Native config: config tool lets Agent read/write config directly; tui_control tool lets Agent operate the TUI
  • Context mode: hot-swappable at runtime (SetContextMode)
  • MaxConcurrency: adjustable at runtime (settings panel)

7. Error Handling & Fault Tolerance

7.1 LLM Retry Strategy

| Parameter | Default | Env Var |
| --- | --- | --- |
| Max attempts | 5 | LLM_RETRY_ATTEMPTS |
| Initial delay | 1s | LLM_RETRY_DELAY |
| Max delay | 30s | LLM_RETRY_MAX_DELAY |
| Single call timeout | 120s | LLM_RETRY_TIMEOUT |

  • Backoff: exponential backoff + random jitter + 429 extra backoff (2^(min(n,4)) s)
  • perAttemptCtx: each retry creates a fresh 120s timeout context, does not inherit parent ctx deadline, but propagates cancellation signal
  • Retryable: 429 / 5xx / network errors / timeout; non-retryable: context.Canceled / 4xx
  • Input too long: detects InputTooLong, forces compression, bypasses retry framework
  • Retry notification: pushes progress message via RetryNotifyFunc: ⚠️ LLM request failed (rate limit), retrying 2/5 ...

7.2 Per-Tenant LLM Concurrency

| LLM Type | Default Concurrency Cap |
| --- | --- |
| Global LLM | 5 |
| User Custom LLM | 3 |
| SubAgent | 3 |

Controlled by LLMSemaphoreManager per-tenant semaphore. Released immediately after Generate completes (no defer).

7.3 Tool Execution Failure Handling

  • execErr: error converted to tool message "Error: <details>\nPlease fix the issue and try again.", loop continues
  • IsError: hint "Do NOT retry the same command. Analyze the error, fix the root cause."
  • Timeout: Shell default 120s (max 600s); on timeout auto-converts to background task (BgTaskManager.Adopt/Start)
  • OAuth: matches OAuth flow, returns alternative content, enters wait-for-user-interaction state

7.4 Message Bus

Inbound/Outbound both have cap=64 buffered channels. No backpressure handling — senders block when full. Web channel’s sendToClient() uses select + default non-blocking write, degrading to offline ringBuffer(50) when full.
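
The Web channel's non-blocking write can be sketched as below. The client and ringBuffer types are hypothetical stand-ins; only sendToClient and the ring size of 50 come from the text.

```go
package main

import "fmt"

// ringBuffer keeps the most recent messages up to a fixed capacity.
type ringBuffer struct {
	capacity int
	items    []string
}

// push appends a message, dropping the oldest one when full.
func (r *ringBuffer) push(msg string) {
	if len(r.items) == r.capacity {
		copy(r.items, r.items[1:])
		r.items = r.items[:len(r.items)-1]
	}
	r.items = append(r.items, msg)
}

// client models one connected web client.
type client struct {
	send    chan string // drained by the WebSocket writer
	offline *ringBuffer // fallback when the writer cannot keep up
}

// sendToClient never blocks: it queues the message or parks it in the ring buffer.
func (c *client) sendToClient(msg string) bool {
	select {
	case c.send <- msg:
		return true
	default:
		c.offline.push(msg)
		return false
	}
}

func main() {
	c := &client{
		send:    make(chan string, 2),
		offline: &ringBuffer{capacity: 50},
	}
	for i := 1; i <= 4; i++ {
		delivered := c.sendToClient(fmt.Sprintf("msg %d", i))
		fmt.Printf("msg %d queued=%v\n", i, delivered)
	}
	fmt.Println("offline backlog:", c.offline.items)
}
```

A plain channel send, as used on the bus itself, would block the sender here instead of falling through to the default branch.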


8. Security Model

8.1 Encrypted Storage

Tokens are stored as plaintext by default (the DB is colocated with the server process). If XBOT_ENCRYPTION_KEY is set, they are encrypted with AES-256-GCM; this remains backward compatible with old data.

Encrypted data: OAuth access_token/refresh_token, user LLM api_key.

8.2 Authentication

| Mechanism | Implementation |
| --- | --- |
| Web login | bcrypt hash, in-memory sessions (30 days), HttpOnly + SameSite=Lax |
| Runner token | 256-bit random, subtle.ConstantTimeCompare, dual verification (tokenStore + authToken) |
| Admin | Single ADMIN_CHAT_ID (fallback to STARTUP_NOTIFY_CHAT_ID), Logs tool permission check |
| AllowFrom | Exact match whitelist (comma-separated), empty = allow all |
| OAuth CSRF | 16-byte random state token |

8.3 Input Validation

  • PathGuard (internal/runnerclient/pathguard.go): filepath.EvalSymlinks resolve symlinks then check prefix
  • Docker container names: regex validation ^[a-z0-9][a-z0-9_.-]{0,127}$
  • Process groups: Setpgid=true, kill entire process group on timeout

9. Performance & Rate Limiting

9.1 Five-Layer Throttling

Layer 1: Global concurrency semaphore (MaxConcurrency=3)
Layer 2: Per-Chat queue (cap=32, drop when full)
Layer 2.5: Per-user semaphore (custom LLM users, cap=1)
Layer 3: LLM Per-Tenant semaphore (Global=5, Personal=3, SubAgent=3)
Layer 4: LLM retry + single timeout (120s × 5 attempts = up to 600s, excluding backoff delays)
Layer 5: Tool timeout (120s) + background task (24h safety ceiling)

9.2 Timeout Reference

| Component | Timeout |
| --- | --- |
| Shell tool | default 120s, max 600s |
| Docker commands | 30s (slow ops 120s) |
| Remote exec | 60s + 5s |
| Remote WebSocket | Ping 30s, Pong 60s |
| Web HTTP | Read 10s, Write 60s, Idle 120s |
| Web WebSocket | Ping 30s, ReadDeadline 60s |
| Background tasks | 24h safety ceiling, 50KB output truncation |
| LLM single call | 120s |
| LLM single call (Anthropic HTTP Client) | 300s |
