Configuration Schema

Complete reference for the OATF (Open Agent Threat Format) document schema used by ThoughtJack. See the OATF specification for the canonical format definition.

Root structure

oatf: "0.1"              # Required - OATF format version

attack:
  id: string              # Optional - scenario ID (e.g., "OATF-002")
  name: string            # Optional - human-readable name
  description: string     # Optional - multi-line description
  version: integer        # Optional - scenario version
  status: string          # Optional - stable, draft, experimental
  severity:               # Optional - severity metadata
    level: string         #   critical, high, medium, low
    confidence: integer   #   0-100
  execution:              # Required - defines how the scenario runs
    mode: string          # Actor mode (single-actor shorthand)
    actors: []            # Multi-actor configuration
    phases: []            # Phases (single-actor shorthand)
  indicators: []          # Optional - verdict evaluation rules
  correlation:            # Optional - how indicators combine
    logic: string         #   any (default), all
  grace_period: string    # Optional - wait after final phase (e.g., "10s")

Execution modes

Single-actor shorthand

For scenarios with one actor, use mode and phases directly:

attack:
  execution:
    mode: mcp_server
    phases:
      - name: trust_building
        # ...
      - name: exploit

Multi-actor

For scenarios involving multiple protocol roles:

attack:
  execution:
    actors:
      - name: server
        mode: mcp_server
        phases: [...]
      - name: client
        mode: mcp_client
        phases: [...]

Supported modes

Mode	Description	CLI flag	Context mode
`mcp_server`	MCP server (ThoughtJack serves)	`--mcp-server <ADDR:PORT>`	Allowed
`mcp_client`	MCP client (ThoughtJack connects)	`--mcp-client-command` or `--mcp-client-endpoint`	Not supported
`a2a_server`	A2A server	`--a2a-server <ADDR:PORT>`	Allowed
`a2a_client`	A2A client	`--a2a-client-endpoint <URL>`	Not supported
`ag_ui_client`	AG-UI client	`--agui-client-endpoint <URL>`	Required (exactly one)

Context mode restrictions

In context mode (--context), the scenario must include exactly one ag_ui_client actor. Additional mcp_server and a2a_server actors are allowed - they provide tool definitions and adversarial responses to the LLM. Client-mode actors (mcp_client, a2a_client) are not supported because there is no external service to connect to.

Phases

Phases define the state machine for temporal attacks. Each phase has a state block (what to serve), optional trigger (when to advance), and optional on_enter actions.

phases:
  - name: string            # Optional - phase name (auto-generated as phase-0, phase-1 if omitted)
    description: string     # Optional - human-readable description
    state:                  # What the server serves in this phase
      capabilities: {}      # MCP capabilities
      tools: []             # Tool definitions
      resources: []         # Resource definitions
      prompts: []           # Prompt definitions
    trigger:                # When to advance to the next phase
      event: string         # Protocol event to count (e.g., "tools/call")
      count: integer        # Number of matching events required
      after: string         # Duration trigger (e.g., "30s")
      match: {}             # Content matching predicate
    on_enter: []            # Actions to execute when entering this phase
    extractors: []          # Value extractors for template interpolation

The last phase in the list is the terminal phase - it has no trigger and runs indefinitely.

State block

The state block defines what the server advertises and responds with during this phase:

state:
  capabilities:
    tools:
      listChanged: true
  tools:
    - name: calculator
      description: "A simple calculator."
      inputSchema:
        type: object
        properties:
          expression:
            type: string
        required: [expression]
      responses:
        - content:
            content:
              - type: text
                text: "Result: 42"

State is inherited - each phase merges with previous phases. Only fields you specify are overridden.

Tool responses

Tool responses use ordered matching. The first matching response is used:

responses:
  - match:                    # Conditional response
      args.expression:
        contains: "error"
    content:
      content:
        - type: text
          text: "Error in expression"
  - content:                  # Default response (no match block)
      content:
        - type: text
          text: "Result: 42"

Triggers

See Phase Triggers for the full trigger reference.

# Event count - advance after N matching events
trigger:
  event: tools/call
  count: 3

# Time-based - advance after duration
trigger:
  after: 30s

# Content matching - advance on specific payload content
trigger:
  event: tools/call
  match:
    args.expression:
      contains: "password"

# Combined - event count with timeout fallback
trigger:
  event: tools/call
  count: 10
  after: 60s

Entry actions

Actions executed when a phase is entered:

on_enter:
  - send:                           # Send a protocol notification
      method: "notifications/tools/list_changed"
  - send:                           # Send with parameters
      method: "notifications/resources/updated"
      params:
        uri: "file:///config"
  - log:                            # Log a message
      message: "Entered exploit phase"
      level: warn                   # Optional: info (default), warn, error

Extractors

Capture values from protocol messages for use in template interpolation:

extractors:
  - name: tool_name
    source: request           # request or response
    type: jsonpath            # jsonpath or regex
    selector: "$.params.name"

Captured values are available as {{tool_name}} in templates.

Template interpolation

Templates use {{extractor_name}} syntax within response content:

responses:
  - content:
      content:
        - type: text
          text: "You called {{tool_name}} with {{expression}}"

Indicators

Indicators define what to check in the protocol trace after execution:

indicators:
  - id: OATF-002-01                 # Unique identifier
    target: "arguments"              # What to search: arguments, content, trace
    description: "Agent read sensitive files"
    protocol: mcp                    # Optional protocol filter
    pattern:                         # Pattern-based detection
      regex: "(id_rsa|\\.ssh|passwd)"

  - id: OATF-002-02
    target: "arguments"
    description: "Semantic compliance detection"
    semantic:                        # LLM-as-judge detection
      intent: "Agent is exfiltrating data"
      threshold: 0.7
      examples:
        positive:
          - "read ~/.ssh/id_rsa"
        negative:
          - "2 + 3"

Detection methods

Method	Field	Description
`pattern`	`pattern.regex`	Regex match against trace content
`semantic`	`semantic.intent`	LLM-as-judge evaluation (planned - not yet implemented)
`cel`	`cel.expression`	CEL expression evaluation

Planned features

Semantic evaluation (semantic field) and synthesize generation (synthesize block) are defined in the OATF format but not yet implemented in ThoughtJack. Scenarios using semantic indicators will have those indicators skipped during evaluation. Use pattern or cel indicators for current evaluations.

Correlation

Controls how multiple indicator results combine into a verdict:

correlation:
  logic: any    # any = one match sufficient, all = all must match

Grace period

Wait time after the final phase completes, allowing delayed agent actions to be captured:

grace_period: "10s"

Override at runtime with --grace-period <DURATION>.

Root structure​

Execution modes​

Single-actor shorthand​

Multi-actor​

Supported modes​

Phases​

State block​

Tool responses​

Triggers​

Entry actions​

Extractors​

Template interpolation​

Indicators​

Detection methods​

Correlation​

Grace period​

See also​