Integrate with CI/CD

ThoughtJack can run in CI/CD pipelines to validate configurations and test agent implementations against attack scenarios.

Validate configs in CI

Add config validation to your pipeline to catch errors before deployment:

.github/workflows/validate.yml
name: Validate ThoughtJack Configs
on: [push, pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install ThoughtJack
        run: |
          curl --proto '=https' --tlsv1.2 -LsSf \
            https://github.com/thoughtgate/thoughtjack/releases/latest/download/thoughtjack-installer.sh | sh

      - name: Validate all configs
        run: |
          for f in scenarios/*.yaml; do
            thoughtjack validate "$f"
          done

Machine-readable output

Use --format json for structured output from listing commands:

# JSON scenario listing
thoughtjack scenarios list --format json
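A structured listing can be filtered programmatically before feeding scenario IDs into a CI matrix. A minimal sketch: the JSON shape below (an array of objects with `id`, `name`, and `tags` keys) is an assumption for illustration; adjust the keys to whatever your installed version actually emits.

```python
import json

# Hypothetical sample of `thoughtjack scenarios list --format json` output;
# the field names here are assumptions, not the documented schema.
listing = json.loads("""
[
  {"id": "oatf-001", "name": "Tool poisoning", "tags": ["mcp"]},
  {"id": "oatf-002", "name": "Rug pull", "tags": ["mcp", "trust"]}
]
""")

# Select scenario IDs by tag, e.g. to build a CI job matrix.
mcp_ids = [s["id"] for s in listing if "mcp" in s["tags"]]
print(mcp_ids)
```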

JSON log format

When running scenarios in CI, use JSON logs for machine parsing:

thoughtjack run my_config.yaml --log-format json -vv

Each log line is a JSON object:

{"timestamp":"2025-01-15T10:30:00Z","level":"INFO","target":"thoughtjack::engine","message":"Phase transition","phase":"exploit"}
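Because each line is a standalone JSON object, logs can be filtered in CI without fragile regexes. A short sketch using the field names from the sample line above:

```python
import json

# One line of ThoughtJack's JSON log output (from the example above).
log_line = ('{"timestamp":"2025-01-15T10:30:00Z","level":"INFO",'
            '"target":"thoughtjack::engine","message":"Phase transition",'
            '"phase":"exploit"}')

record = json.loads(log_line)

# Keep only phase-transition records, e.g. to summarize a run in CI.
if record["message"] == "Phase transition":
    print(f'{record["timestamp"]} entered phase {record["phase"]}')
```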

Event file output

Write structured events to a JSONL file for post-run analysis:

thoughtjack run my_config.yaml \
  --events-file events.jsonl \
  --log-format json

After the run, parse events.jsonl:

# Count phase transitions (-c prints one compact object per line,
# so wc -l counts events rather than pretty-printed lines)
jq -c 'select(.type == "PhaseEntered")' events.jsonl | wc -l

# Get verdict
jq 'select(.type == "VerdictComputed")' events.jsonl
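If jq is not available on your runner, the same post-run analysis can be done with a few lines of Python. The event `type` values mirror the jq filters above; the `verdict` payload field in the sample data is an assumption for illustration.

```python
import json

# Sample events in the JSONL shape used by the jq examples above;
# the "verdict" field inside VerdictComputed is a hypothetical payload.
sample = """\
{"type":"PhaseEntered","phase":"setup"}
{"type":"PhaseEntered","phase":"exploit"}
{"type":"VerdictComputed","verdict":"not_exploited"}
"""

events = [json.loads(line) for line in sample.splitlines()]

# Count phase transitions and pull out the computed verdict.
phase_count = sum(1 for e in events if e["type"] == "PhaseEntered")
verdict_event = next(e for e in events if e["type"] == "VerdictComputed")

print(phase_count)
print(verdict_event["verdict"])
```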

Exit codes

ThoughtJack uses verdict-based exit codes that CI pipelines can check. Exit codes 1–3 indicate exploitation at increasing severity tiers:

Code  Name                        CI Interpretation
----  --------------------------  ---------------------------
0     not_exploited               Pass
1     exploited                   Fail (no tier, or Ingested)
2     exploited_local_action      Fail (LocalAction tier)
3     exploited_boundary_breach   Fail (BoundaryBreach tier)
4     partial                     Warning
5     error                       Unstable
10    Runtime error               Infra failure
64    Usage error                 Invalid args
130   Interrupted                 SIGINT
143   Terminated                  SIGTERM

# Context mode (self-contained, no agent needed):
- name: Run scenario
  run: |
    thoughtjack run test.yaml \
      --context --context-provider openai --context-model gpt-4o \
      --context-api-key ${{ secrets.OPENAI_API_KEY }} \
      -o verdict.json
  continue-on-error: false

# Traffic mode requires a running agent as a service in the CI job.
# See the "Test Agent Frameworks" guide for setup.
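The exit-code table can be turned into an explicit gate in a CI wrapper script. A minimal sketch of that mapping (the pass/fail policy beyond the documented codes is your choice):

```python
def ci_status(exit_code: int) -> str:
    """Map a ThoughtJack exit code to a CI outcome per the table above."""
    if exit_code == 0:
        return "pass"
    if exit_code in (1, 2, 3):
        return "fail"           # exploited, increasing severity tiers
    if exit_code == 4:
        return "warning"        # partial
    if exit_code == 5:
        return "unstable"       # verdict error
    if exit_code == 10:
        return "infra-failure"  # runtime error
    if exit_code == 64:
        return "usage-error"    # invalid args
    if exit_code in (130, 143):
        return "interrupted"    # SIGINT / SIGTERM
    return "unknown"

print(ci_status(3))  # fail
```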

Context mode in CI

Context mode is self-contained: no agent infrastructure is needed. Set API credentials via environment variables:

.github/workflows/context-test.yml
- name: Run context-mode scenario
  env:
    THOUGHTJACK_CONTEXT_API_KEY: ${{ secrets.OPENAI_API_KEY }}
    THOUGHTJACK_CONTEXT_MODEL: gpt-4o
  run: |
    thoughtjack scenarios run oatf-001 \
      --context \
      -o verdict.json \
      --export-trace trace.jsonl

Use --max-turns to control LLM API cost per run. Use --context-timeout to set a per-request deadline that matches your CI timeout budget.

Prometheus metrics

For long-running test sessions, expose metrics:

thoughtjack run my_config.yaml \
  --mcp-server 127.0.0.1:8080 \
  --metrics-port 9090

Scrape http://localhost:9090/metrics for request counts, phase transitions, delivery durations, and error rates.
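The scraped payload uses the standard Prometheus text exposition format, which is easy to post-process without a full Prometheus deployment. A sketch of a minimal parser: the metric names in the sample are hypothetical placeholders, but the `name{labels} value` line format is the standard one.

```python
# Sample /metrics payload; the metric names are hypothetical,
# the text exposition format itself is standard Prometheus.
sample_metrics = """\
# HELP requests_total Total MCP requests served.
# TYPE requests_total counter
requests_total{method="tools/call"} 42
requests_total{method="tools/list"} 7
"""

def parse_metrics(text: str) -> dict:
    """Parse Prometheus text exposition into {metric_with_labels: value}."""
    values = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE metadata
        name, _, value = line.rpartition(" ")
        values[name] = float(value)
    return values

metrics = parse_metrics(sample_metrics)
print(metrics['requests_total{method="tools/call"}'])  # 42.0
```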