CLI Reference
Complete reference for the thoughtjack command-line interface. See TJ-SPEC-007 for the formal specification.
Global flags
| Flag | Short | Description |
|---|---|---|
| --verbose | -v | Increase verbosity (repeat for more: -v info, -vv debug, -vvv trace) |
| --quiet | -q | Suppress all non-error output |
| --color <when> | | Color output: auto, always, never. Env: THOUGHTJACK_COLOR |
| --log-format <format> | | Log format: human (default), json. Env: THOUGHTJACK_LOG_FORMAT |
Commands
run
Execute an OATF scenario against a target agent.
thoughtjack run <oatf.yaml>
| Flag | Env Variable | Description |
|---|---|---|
| <SCENARIO> | THOUGHTJACK_SCENARIO | Path to OATF scenario YAML document (positional) |
| --mcp-server <ADDR:PORT> | - | MCP server HTTP listen address (omit for stdio) |
| --mcp-client-command <CMD> | - | Spawn MCP client by running a command |
| --mcp-client-args <ARGS> | - | Extra arguments for --mcp-client-command |
| --mcp-client-endpoint <URL> | - | Connect MCP client to an HTTP endpoint |
| --agui-client-endpoint <URL> | - | Connect AG-UI client to an endpoint |
| --a2a-server <ADDR:PORT> | - | A2A server listen address [default: 127.0.0.1:9090] |
| --a2a-client-endpoint <URL> | - | A2A client target endpoint |
| --grace-period <DURATION> | - | Override document grace period |
| --max-session <DURATION> | - | Safety timeout for entire session [default: 5m] |
| --readiness-timeout <DURATION> | - | Timeout for server readiness gate [default: 30s] |
| -o, --output <PATH> | - | Write JSON verdict to file (use - for stdout) |
| --header <KEY:VALUE> | - | HTTP headers for client transports (repeatable) |
| --no-semantic | - | Disable semantic (LLM-as-judge) indicator evaluation (not yet implemented) |
| --raw-synthesize | - | Bypass synthesize output validation |
| --progress <LEVEL> | THOUGHTJACK_PROGRESS | Progress output: off, on, auto (default: auto; on for TTY, off otherwise) |
| --metrics-port <PORT> | THOUGHTJACK_METRICS_PORT | Enable Prometheus metrics endpoint |
| --events-file <PATH> | THOUGHTJACK_EVENTS_FILE | Write structured events to JSONL file |
| --export-trace <PATH> | THOUGHTJACK_EXPORT_TRACE | Write full protocol trace to JSONL file (use - for stdout) |
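Both --events-file and --export-trace write JSON Lines output (one JSON object per line), which is straightforward to post-process. A minimal sketch in Python, noting that the "type" field used for tallying is an assumption for illustration, not a documented event schema:

```python
import json
from typing import Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_type(lines: Iterable[str]) -> dict:
    """Tally events by a hypothetical "type" field."""
    counts: dict = {}
    for event in read_jsonl(lines):
        key = event.get("type", "unknown")
        counts[key] = counts.get(key, 0) + 1
    return counts
```

The same reader works for either file; only the per-object fields you inspect would differ between the events and trace outputs.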
Context-mode flags
Enable context mode to call an LLM API directly instead of running real protocol infrastructure.
| Flag | Env Variable | Description |
|---|---|---|
| --context | - | Enable context mode |
| --context-model <MODEL> | THOUGHTJACK_CONTEXT_MODEL | LLM model identifier (required with --context) |
| --context-api-key <KEY> | THOUGHTJACK_CONTEXT_API_KEY | API key for LLM provider |
| --context-base-url <URL> | THOUGHTJACK_CONTEXT_BASE_URL | Override provider's default base URL |
| --context-provider <TYPE> | THOUGHTJACK_CONTEXT_PROVIDER | Provider type: openai (default), anthropic |
| --context-temperature <FLOAT> | THOUGHTJACK_CONTEXT_TEMPERATURE | Sampling temperature [default: 0.0] |
| --context-max-tokens <TOKENS> | THOUGHTJACK_CONTEXT_MAX_TOKENS | Max tokens per LLM response [default: 4096] |
| --context-system-prompt <PROMPT> | THOUGHTJACK_CONTEXT_SYSTEM_PROMPT | System prompt for simulated agent |
| --context-timeout <SECONDS> | THOUGHTJACK_CONTEXT_TIMEOUT | Per-request LLM timeout [default: 120] |
| --max-turns <N> | - | Maximum conversation turns in context mode [default: 20] |
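Most context-mode settings can come from a flag, an environment variable, or a built-in default. The common CLI convention, assumed here rather than taken from the spec, is that an explicit flag beats the environment variable, which beats the default. A hedged sketch:

```python
import os


def resolve(flag_value, env_var, default, env=None, cast=str):
    """Resolve one setting: explicit flag wins, then the environment
    variable, then the built-in default (assumed precedence order)."""
    env = os.environ if env is None else env
    if flag_value is not None:
        return cast(flag_value)
    if env_var in env and env[env_var] != "":
        return cast(env[env_var])
    return default


# e.g. --context-temperature / THOUGHTJACK_CONTEXT_TEMPERATURE / 0.0
temperature = resolve(None, "THOUGHTJACK_CONTEXT_TEMPERATURE", 0.0,
                      env={"THOUGHTJACK_CONTEXT_TEMPERATURE": "0.7"},
                      cast=float)
```

With no flag and the variable set to "0.7", the resolved temperature is 0.7; with neither set, the default 0.0 applies.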
validate
Validate an OATF document without running it.
thoughtjack validate <path> [--normalize]
| Flag | Description |
|---|---|
| <path> | OATF document file path (positional, required) |
| --normalize | Print the pre-processed (normalized) document YAML |
scenarios list
List all built-in attack scenarios.
thoughtjack scenarios list [--category <cat>] [--tag <tag>] [--format human|json]
| Flag | Description |
|---|---|
| --category <cat> | Filter by category |
| --tag <tag> | Filter by tag |
| --format <format> | Output format: human (default), json |
scenarios show
Show the YAML configuration for a built-in scenario.
thoughtjack scenarios show <name>
| Flag | Description |
|---|---|
| <name> | Scenario name (positional, required). Supports fuzzy matching. |
scenarios run
Run a built-in scenario by name.
thoughtjack scenarios run <name> [run flags...]
Accepts all run flags after the scenario name (no positional scenario path). Uses the built-in scenario YAML.
version
Display version and build information.
thoughtjack version [--format human|json]
Environment variables
| Variable | Default | Description |
|---|---|---|
| THOUGHTJACK_SCENARIO | - | Default OATF scenario file path |
| THOUGHTJACK_METRICS_PORT | - | Prometheus metrics port |
| THOUGHTJACK_EVENTS_FILE | - | Structured event output file |
| THOUGHTJACK_PROGRESS | auto | Progress output (off, on, auto) |
| THOUGHTJACK_COLOR | auto | Color output control |
| THOUGHTJACK_LOG_LEVEL | - | Override log level (trace, debug, info, warn, error) |
| NO_COLOR | - | Disable color output (any value) |
| THOUGHTJACK_LOG_FORMAT | human | Log format (human, json) |
| THOUGHTJACK_EXPORT_TRACE | - | Default export trace file path |
| THOUGHTJACK_CONTEXT_API_KEY | - | API key for context-mode LLM provider |
| THOUGHTJACK_CONTEXT_BASE_URL | - | Override LLM provider base URL |
| THOUGHTJACK_CONTEXT_MODEL | - | LLM model identifier for context mode |
| THOUGHTJACK_CONTEXT_PROVIDER | openai | LLM provider type (openai, anthropic) |
| THOUGHTJACK_CONTEXT_SYSTEM_PROMPT | - | System prompt for context mode |
| THOUGHTJACK_CONTEXT_TEMPERATURE | 0.0 | Sampling temperature for context mode |
| THOUGHTJACK_CONTEXT_MAX_TOKENS | 4096 | Max tokens per LLM response |
| THOUGHTJACK_CONTEXT_TIMEOUT | 120 | Per-request LLM timeout in seconds |
Exit codes
Exit codes encode both the verdict result and the attack severity tier. When an indicator with a tier field matches, the exit code reflects the highest tier. See Execution Modes for details.
| Code | Name | Description |
|---|---|---|
| 0 | not_exploited | Agent was not exploited - pass |
| 1 | exploited | Exploited (no tier, or Ingested tier) |
| 2 | exploited_local_action | Exploited with LocalAction tier |
| 3 | exploited_boundary_breach | Exploited with BoundaryBreach tier |
| 4 | partial | Partial exploitation |
| 5 | error | Evaluation error |
| 10 | Runtime error | Infrastructure or engine failure |
| 64 | Usage error | Invalid CLI arguments |
| 130 | Interrupted | SIGINT received (Ctrl+C) |
| 143 | Terminated | SIGTERM received |
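In CI, these codes can be turned into a pass/fail policy. A sketch in Python; note that the tier threshold and the decision to treat all codes of 10 and above as failures are policy choices for this example, not behavior of the CLI itself:

```python
# Verdict exit codes from the table above, in severity order.
VERDICT_CODES = {
    0: "not_exploited",
    1: "exploited",
    2: "exploited_local_action",
    3: "exploited_boundary_breach",
    4: "partial",
    5: "error",
}


def should_fail_build(code: int, max_tier_code: int = 0) -> bool:
    """Fail the pipeline when the verdict exceeds the allowed tier code,
    or on any runtime/usage/signal code (10, 64, 130, 143)."""
    if code in VERDICT_CODES:
        return code > max_tier_code
    return True  # infrastructure or usage failures are never a pass
```

For example, a pipeline that tolerates Ingested-tier findings but nothing worse would call `should_fail_build(code, max_tier_code=1)`.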
See also
- Getting Started - install and run your first scenario
- Integrate with CI/CD - use exit codes in automated pipelines
- OATF Document Schema - the YAML format the CLI consumes
- Execution Modes - traffic mode vs context mode
- Configure Context Mode Providers - OpenAI, Anthropic, Azure, local models