AI Agent — Configuration

This page is the reference for every knob the agent exposes. Pair it with Getting Started for a hands-on walkthrough.

The agent reads its configuration from the same config.yaml Versus uses for the rest of its features. Everything lives under the top-level agent: key. The list of log sources lives in a separate file (default agent_sources.yaml) so it can be managed independently.

Reminder. The agent is off by default (agent.enable: false). Nothing about the agent runs until that flag flips.


Top-level keys

agent:
  enable: false
  mode: training
  poll_interval: 30s
  lookback: 5m
  batch_max: 1000
  signal_max_bytes: 8192
  gateway_secret: ${AGENT_GATEWAY_SECRET}
  data_dir: ./data
  sources_path: ./agent_sources.yaml
  redaction:   { … }
  catalog:     { … }
  miner:       { … }
  regex:       { … }

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enable | bool | false | Master switch. Env: AGENT_ENABLE. |
| mode | string | training | One of training, shadow, or detect. Env: AGENT_MODE. |
| poll_interval | duration | 30s | How often each source is pulled. Lower = more responsive, higher = less load on your log backend. |
| lookback | duration | 5m | Initial backfill window on first start (when there's no cursor yet). |
| batch_max | int | 1000 | Safety cap on signals processed per tick per source. |
| signal_max_bytes | int | 8192 | Truncates a single signal's Raw payload above this size. |
| gateway_secret | string | (empty) | Shared secret required on X-Gateway-Secret header for /api/agent/*. Empty disables the admin endpoints entirely, which means the agent cannot start. Env: AGENT_GATEWAY_SECRET. |
| data_dir | path | ./data | Where the agent persists its catalog and source cursors. |
| sources_path | path | (empty) | External YAML file containing the sources: list. Resolved relative to the main config. Env: AGENT_SOURCES_PATH. |

Modes

| Mode | What it does | When to use |
| --- | --- | --- |
| training | Observes only. New patterns are added to the catalog; nothing else. | First few days. Until the catalog stabilizes. |
| shadow | Same as training, but logs agent[shadow]: would alert … for any signal it would have alerted on. | A release cycle of review before going live. |
| detect | Treats unknown signals as anomalies. AI summarization and incident emission ship in a follow-up milestone; today this mode logs the verdict only. | Production, after you trust the catalog. |
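
The table above can be read as a small decision function. The sketch below is only an illustration of that reading, not the agent's code: the function name and the action labels are hypothetical, and it assumes known patterns trigger nothing in any mode. How detect mode updates the catalog is not described on this page, so the sketch leaves that out.

```python
MODES = ("training", "shadow", "detect")

def actions_for(mode, pattern_is_known):
    """Return illustrative action labels for one signal (labels are hypothetical)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if pattern_is_known:
        # Assumption: a known pattern never alerts, whatever the mode.
        return []
    if mode == "training":
        return ["add_to_catalog"]
    if mode == "shadow":
        # Same as training, plus the "would alert" log line.
        return ["add_to_catalog", "log_would_alert"]
    # detect: the verdict is logged only (no incident emission yet).
    return ["log_anomaly_verdict"]
```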

Environment overrides

| Env var | Maps to |
| --- | --- |
| AGENT_ENABLE | agent.enable |
| AGENT_MODE | agent.mode |
| AGENT_GATEWAY_SECRET | agent.gateway_secret |
| AGENT_SOURCES_PATH | agent.sources_path |
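
To make the mapping concrete, here is a minimal sketch of how such overrides could be layered on top of the values read from config.yaml. It is not Versus's implementation. The function name is hypothetical, and both the accepted boolean spellings and the rule that a set variable always wins over the file value are assumptions.

```python
import os

# Env var -> key under `agent:` (taken from the table above).
ENV_OVERRIDES = {
    "AGENT_ENABLE": "enable",
    "AGENT_MODE": "mode",
    "AGENT_GATEWAY_SECRET": "gateway_secret",
    "AGENT_SOURCES_PATH": "sources_path",
}

def apply_env_overrides(agent_cfg, env=None):
    """Return a copy of agent_cfg with any set env vars applied on top."""
    env = os.environ if env is None else env
    merged = dict(agent_cfg)
    for var, key in ENV_OVERRIDES.items():
        if var not in env:
            continue
        value = env[var]
        if key == "enable":
            # Assumption: these spellings count as true; everything else is false.
            value = value.strip().lower() in ("1", "true", "yes", "on")
        merged[key] = value
    return merged
```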

redaction

Pattern-based scrubbing of secrets and PII before any other component sees them. Always enable this in production. See Redaction for the full default rule list and how to extend it.

redaction:
  enable: true
  redact_ips: false
  extra_patterns:
    - "(?i)password=\\S+"
    - "Authorization:\\s*Bearer\\s+\\S+"

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enable | bool | true (when agent.enable: true) | Master switch for redaction. |
| redact_ips | bool | false | Opt-in IPv4/IPv6 redaction. Off by default because IPs are usually useful context. |
| extra_patterns | string list | [] | Additional Go regexes. Invalid patterns are skipped (logged at startup), so one typo can't disable redaction. |
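
The skip-invalid-patterns behavior can be sketched as follows. This is an illustration in Python rather than the agent's Go code: Python's re syntax is close to, but not identical to, Go's regexp syntax, and the function names and the [REDACTED] placeholder are assumptions.

```python
import re

def compile_patterns(patterns):
    """Compile each pattern; collect invalid ones instead of failing."""
    compiled, skipped = [], []
    for pattern in patterns:
        try:
            compiled.append(re.compile(pattern))
        except re.error as exc:
            # A real agent would log this at startup and carry on.
            skipped.append((pattern, str(exc)))
    return compiled, skipped

def redact(text, compiled, placeholder="[REDACTED]"):
    """Replace every match of every compiled pattern with the placeholder."""
    for rx in compiled:
        text = rx.sub(placeholder, text)
    return text

compiled, skipped = compile_patterns([
    r"(?i)password=\S+",
    r"Authorization:\s*Bearer\s+\S+",
    r"(unclosed",  # deliberately broken: skipped, the other two still apply
])
```

With these patterns, redact("login PASSWORD=hunter2 ok", compiled) yields "login [REDACTED] ok", and the broken third pattern ends up in skipped without affecting the first two.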

catalog

Long-term storage for learned patterns. See Catalog for the schema and admin workflows.

catalog:
  mode: file
  persist_interval: 30s
  auto_promote_after: 100

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | file | Storage backend. Only file is supported today (planned: redis, database). |
| persist_interval | duration | 30s | How often the in-memory catalog is flushed to data_dir/patterns.json. |
| auto_promote_after | int | 100 | A pattern seen this many times in detect mode is treated as known (won't alert). 0 disables the promotion. |

The on-disk filename is fixed (patterns.json); the only configurable part is data_dir. The file is written atomically (tmp + rename) and five rotated backups are kept (patterns.json.1 through patterns.json.5).
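
For readers unfamiliar with the tmp + rename technique, the sketch below shows one way to combine it with numbered backups. It is an assumption-laden illustration, not the agent's persistence code: the function name, the JSON payload shape, and the exact rotation order (.1 is the newest backup) are all hypothetical.

```python
import json
import os
import shutil
import tempfile

def persist_catalog(data_dir, patterns, backups=5):
    """Rotate numbered backups, then atomically replace patterns.json."""
    os.makedirs(data_dir, exist_ok=True)
    target = os.path.join(data_dir, "patterns.json")

    # Shift existing backups up by one: .4 -> .5, ..., .1 -> .2 (oldest is overwritten).
    for i in range(backups - 1, 0, -1):
        older = f"{target}.{i}"
        if os.path.exists(older):
            os.replace(older, f"{target}.{i + 1}")
    # Copy (not move) the live file to .1 so patterns.json never disappears.
    if os.path.exists(target):
        shutil.copy2(target, f"{target}.1")

    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(patterns, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The temp file lives in data_dir on purpose: a rename is only atomic when source and destination are on the same filesystem.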


miner

Drain-style log clusterer. The defaults work for most setups; tune only if you see related lines failing to merge into one template (lower similarity_threshold) or unrelated lines collapsing together (raise it). See Miner.

miner:
  similarity_threshold: 0.4
  tree_depth: 4
  max_children: 100

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| similarity_threshold | float (0–1) | 0.4 | Token-overlap ratio required to consider two messages part of the same template. |
| tree_depth | int | 4 | Depth of the prefix tree used to bucket templates by length and leading tokens. |
| max_children | int | 100 | Per-node fan-out cap to keep the tree bounded. |
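
To build intuition for similarity_threshold, here is a simplified sketch of a Drain-style token-overlap score. It is not the agent's miner: the function names and the <*> wildcard marker are assumptions, and real Drain implementations differ in how they treat wildcards and the prefix tree.

```python
def similarity(template, message):
    """Fraction of positions where template and message share the same token."""
    a, b = template.split(), message.split()
    if not a or len(a) != len(b):
        # Different token counts land in different length buckets.
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a)

def merge(template, message, wildcard="<*>"):
    """Keep matching tokens; replace differing positions with a wildcard."""
    pairs = zip(template.split(), message.split())
    return " ".join(x if x == y else wildcard for x, y in pairs)

score = similarity("connection to 10.0.0.1 failed", "connection to 10.0.0.2 failed")
merged = merge("connection to 10.0.0.1 failed", "connection to 10.0.0.2 failed")
```

Here three of four tokens match, so score is 0.75, which clears the default 0.4 threshold and the two lines merge into "connection to <*> failed". Raising the threshold above 0.75 would keep them as separate templates.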

regex

Pre-filter and tagger. Only signals whose message matches at least one rule (named or default) are forwarded to the miner — everything else is dropped before clustering. See Regex for cookbook recipes.

regex:
  default_pattern: "(?i).*error.*"
  rules:
    - name: oom-killer
      pattern: "Out of memory: Killed process"
      severity: critical
    - name: panic
      pattern: "(?i)panic:"
      severity: critical
    - name: 5xx-burst
      pattern: "HTTP/[0-9.]+\\s+5\\d\\d"
      severity: high

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| default_pattern | regex | "" | Catch-all tried after every named rule misses. Empty = nothing matches by default → strict mode. ".*" = learn from every line. |
| rules | list | [] | Named rules. First match wins. Each rule has name, pattern, severity. |

Common recipes:

| Goal | Setting |
| --- | --- |
| Learn everything (training, broad scope) | default_pattern: ".*" |
| Only learn explicit rule matches (strict) | default_pattern: "" plus full rules: list |
| Learn only error-ish lines (default) | default_pattern: "(?i).*error.*" |
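
The matching order described in this section (named rules first, first match wins, then the default pattern, otherwise drop) can be sketched like this. It is an illustration only: it uses Python's re with unanchored search, which is an assumption about how the agent matches, and the "default" tag returned for catch-all matches is a hypothetical name.

```python
import re

# Rules copied from the example config above, in order.
RULES = [
    ("oom-killer", re.compile(r"Out of memory: Killed process"), "critical"),
    ("panic", re.compile(r"(?i)panic:"), "critical"),
    ("5xx-burst", re.compile(r"HTTP/[0-9.]+\s+5\d\d"), "high"),
]

def classify(message, rules=RULES, default_pattern="(?i).*error.*"):
    """Return (rule_name, severity), or None when the signal is dropped."""
    for name, rx, severity in rules:
        if rx.search(message):
            return name, severity  # first match wins
    # An empty default_pattern means strict mode: nothing else passes.
    if default_pattern and re.search(default_pattern, message):
        return "default", None
    return None
```

For example, classify("user logged in") returns None with the error-ish default, but returns ("default", None) when called with default_pattern=".*".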

Signal sources

The list of log sources lives in a separate file referenced by agent.sources_path. Versus reads it at startup, expands ${ENV_VAR} references inside it, and replaces agent.sources in memory. Keeping sources separate makes it easy to swap fixtures (local file ↔ ES) and manage per-environment lists without touching the rest of the config.

# config/agent_sources.yaml
sources:
  - name: my-app
    type: file
    enable: true
    file:
      path: /var/log/my-app/app.log
      format: text
      from_beginning: false   # tail-like behavior

  - name: prod-app
    type: elasticsearch
    enable: false
    elasticsearch:
      addresses:
        - https://es.prod.example:9200
      username: ${ES_USERNAME}
      password: ${ES_PASSWORD}
      index: "logs-app-*"
      time_field: "@timestamp"
      query: 'log.level:(error OR warn)'
      message_field: message
      page_size: 500
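
The ${ENV_VAR} expansion applied to this file can be approximated with a few lines. This is a sketch under assumptions, not Versus's loader: the function name is hypothetical, and replacing an unset variable with an empty string is a guess at the behavior.

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def expand_env(text, env=None):
    """Replace each ${NAME} with its value from env."""
    env = os.environ if env is None else env
    # Assumption: unset variables expand to an empty string.
    return _VAR.sub(lambda m: env.get(m.group(1), ""), text)

line = expand_env("username: ${ES_USERNAME}", {"ES_USERNAME": "svc-agent"})
```

Whatever the real behavior for unset variables is, it is worth checking that ES_USERNAME and ES_PASSWORD are exported before enabling an elasticsearch source.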

Common keys for every source:

| Key | Type | Description |
| --- | --- | --- |
| name | string | Unique identifier. Used in cursor keys and admin views. |
| type | string | file or elasticsearch. |
| enable | bool | Per-source switch. |

file source

Cheapest way to test the agent end-to-end. One source = one file (no globs). Tracks position via a sidecar cursor file, so it survives restarts and handles log rotation (a shrinking file resets to offset 0).

file:
  path: /app/logs/my-app.log
  format: text                # "text" or "json"
  from_beginning: true        # replay-like; false tails from EOF
  cursor_path: ""             # default: <data_dir>/cursors/file-<name>.cursor
  max_line_bytes: 65536
  timestamp_layout: ""        # Go time layout; empty = auto
  # JSON-mode only:
  message_field: message
  timestamp_field: "@timestamp"
  severity_field: level
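
The cursor and rotation behavior described above boils down to "remember a byte offset, and start over if the file got shorter". A minimal sketch, with a hypothetical function name and the assumption that only newline-terminated lines are consumed on each poll:

```python
import os

def read_new_lines(path, offset):
    """Return (complete lines after offset, new offset)."""
    if os.path.getsize(path) < offset:
        # File shrank: treat it as rotated or truncated and restart at 0.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline; a partial line waits for the next poll.
    end = chunk.rfind(b"\n") + 1
    text = chunk[:end].decode("utf-8", errors="replace")
    lines = text.split("\n")[:-1]
    return lines, offset + end
```

Persisting the returned offset to the sidecar cursor file after each poll is what lets the source resume after a restart. Starting from offset 0 corresponds to from_beginning: true, and starting from the current file size corresponds to tailing from EOF.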

elasticsearch source

Reads through the _search API with a range filter on time_field. Uses sort + search_after for stable pagination. Authenticates with HTTP basic auth or an API key.

elasticsearch:
  addresses:                  # any number; first that responds wins
    - https://es.prod.example:9200
  username: ${ES_USERNAME}
  password: ${ES_PASSWORD}
  api_key: ""                 # alternative to user/pass
  insecure_skip_verify: false
  index: "logs-app-*"         # supports wildcards
  time_field: "@timestamp"
  query: 'log.level:(error OR warn)'   # Lucene-style query string; "*" = match all
  message_field: message
  severity_field: log.level
  extra_fields:
    - service.name
    - host.name
    - error.stack_trace
  page_size: 500

Tip on lookback. The agent's first poll uses since = now - lookback. ES queries with very large historical windows may hit a lot of data on the first tick — start with the default 5m and only increase if you need to backfill.
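
To show how the settings above fit together, here is a sketch of a _search request body built from them. The Elasticsearch query DSL pieces (bool filter, range, query_string, sort, search_after) are standard, but the exact body the agent sends is not documented here, so treat the function name, the use of gt rather than gte, and the single-field sort as assumptions.

```python
from datetime import datetime, timedelta, timezone

def build_search_body(time_field, query, since, page_size, search_after=None):
    """Build a _search body: time range + query string, sorted for search_after."""
    body = {
        "size": page_size,
        "query": {
            "bool": {
                "filter": [
                    {"range": {time_field: {"gt": since.isoformat()}}},
                    {"query_string": {"query": query}},
                ]
            }
        },
        # Assumption: ascending by time only; a tiebreaker field may also be used.
        "sort": [{time_field: {"order": "asc"}}],
    }
    if search_after is not None:
        # Sort values of the last hit from the previous page.
        body["search_after"] = search_after
    return body

# First poll: since = now - lookback (5m by default).
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
first_page = build_search_body(
    "@timestamp", "log.level:(error OR warn)", now - timedelta(minutes=5), 500
)
```

Subsequent pages reuse the same body and pass the last hit's sort values as search_after.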


Admin endpoints

All /api/agent/* endpoints require the X-Gateway-Secret header to match agent.gateway_secret. With no secret configured the endpoints are not registered and the agent refuses to start.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/agent/status | Catalog size, dirty flag, persist-interval, mode. |
| GET | /api/agent/patterns | All patterns, sorted by sighting count desc. |
| GET | /api/agent/patterns/:id | One pattern. |
| POST | /api/agent/patterns/:id | Update severity, label, and/or tags. |
| DELETE | /api/agent/patterns/:id | Remove a pattern from the catalog. |
| POST | /api/agent/flush | Force-flush the in-memory catalog to disk. |

Example:

curl -H "X-Gateway-Secret: $AGENT_GATEWAY_SECRET" \
  -H 'Content-Type: application/json' \
  -d '{"severity":"known","label":"deploy-rollout","tags":["benign"]}' \
  http://localhost:3000/api/agent/patterns/p-abc123
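
The same request can be prepared from a script using only Python's standard library. This is a sketch mirroring the curl call above: the helper name is hypothetical, and the host, port, and pattern id are simply the example values from that command.

```python
import json
import os
import urllib.request

def build_pattern_update(base_url, pattern_id, secret, update):
    """Prepare a POST to /api/agent/patterns/:id with the gateway secret header."""
    return urllib.request.Request(
        f"{base_url}/api/agent/patterns/{pattern_id}",
        data=json.dumps(update).encode("utf-8"),
        headers={
            "X-Gateway-Secret": secret,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_pattern_update(
    "http://localhost:3000",
    "p-abc123",
    os.environ.get("AGENT_GATEWAY_SECRET", ""),
    {"severity": "known", "label": "deploy-rollout", "tags": ["benign"]},
)
# To actually send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```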

Where to next