# Integrate with CI/CD
ThoughtJack can run in CI/CD pipelines to validate configurations and test agent implementations against attack scenarios.
## Validate configs in CI

Add config validation to your pipeline to catch errors before deployment:
```yaml
name: Validate ThoughtJack Configs
on: [push, pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install ThoughtJack
        run: |
          curl --proto '=https' --tlsv1.2 -LsSf \
            https://github.com/thoughtgate/thoughtjack/releases/latest/download/thoughtjack-installer.sh | sh
      - name: Validate all configs
        run: |
          for f in scenarios/*.yaml; do
            thoughtjack validate "$f"
          done
```
## Machine-readable output

Use `--format json` for structured output from listing commands:

```shell
# JSON scenario listing
thoughtjack scenarios list --format json
```
### JSON log format

When running scenarios in CI, use JSON logs for machine parsing:

```shell
thoughtjack run my_config.yaml --log-format json -vv
```
Each log line is a JSON object:

```json
{"timestamp":"2025-01-15T10:30:00Z","level":"INFO","target":"thoughtjack::engine","message":"Phase transition","phase":"exploit"}
```
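Lines in this shape can be filtered directly with `jq`. A minimal sketch, using a sample line in place of real run output (the field names are taken from the example above):

```shell
# Filter JSON log lines for phase transitions; a sample line stands in
# for real `thoughtjack run ... --log-format json` output.
log='{"timestamp":"2025-01-15T10:30:00Z","level":"INFO","target":"thoughtjack::engine","message":"Phase transition","phase":"exploit"}'
transition=$(echo "$log" \
  | jq -r 'select(.message == "Phase transition") | "\(.timestamp) phase=\(.phase)"')
echo "$transition"
```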
### Event file output

Write structured events to a JSONL file for post-run analysis:

```shell
thoughtjack run my_config.yaml \
  --events-file events.jsonl \
  --log-format json
```
After the run, parse `events.jsonl`:

```shell
# Count phase transitions (-c keeps each matched object on one line,
# so wc -l counts events rather than pretty-printed lines)
jq -c 'select(.type == "PhaseEntered")' events.jsonl | wc -l

# Get the verdict
jq 'select(.type == "VerdictComputed")' events.jsonl
```
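The same approach extends to a per-type summary of the whole file. A sketch, with sample events standing in for a real `events.jsonl` (the `phase` and `verdict` payload fields are assumptions for illustration; the `type` values come from the examples above):

```shell
# Count events.jsonl entries by event type.
summary=$(printf '%s\n' \
  '{"type":"PhaseEntered","phase":"buildup"}' \
  '{"type":"PhaseEntered","phase":"exploit"}' \
  '{"type":"VerdictComputed","verdict":"exploited"}' \
  | jq -cs 'group_by(.type) | map({type: .[0].type, count: length})')
echo "$summary"
```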
## Exit codes

ThoughtJack uses verdict-based exit codes that CI pipelines can check. Exit codes 1–3 indicate exploitation at increasing severity tiers:
| Code | Name | CI Interpretation |
|---|---|---|
| 0 | not_exploited | Pass |
| 1 | exploited | Fail (no tier, or Ingested) |
| 2 | exploited_local_action | Fail (LocalAction tier) |
| 3 | exploited_boundary_breach | Fail (BoundaryBreach tier) |
| 4 | partial | Warning |
| 5 | error | Unstable |
| 10 | Runtime error | Infra failure |
| 64 | Usage error | Invalid args |
| 130 | Interrupted | SIGINT |
| 143 | Terminated | SIGTERM |
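The table above can be turned into a CI gate with an ordinary `case` statement. A sketch only; the outcome labels and pass/fail policy are illustrative, not part of ThoughtJack:

```shell
# Map a ThoughtJack exit code to a CI outcome, per the table above.
ci_outcome() {
  case "$1" in
    0)       echo "pass" ;;
    1|2|3)   echo "fail" ;;        # exploited, at increasing severity tiers
    4)       echo "warning" ;;     # partial
    5)       echo "unstable" ;;    # error verdict
    10|64)   echo "infra-error" ;; # runtime or usage error
    130|143) echo "interrupted" ;; # SIGINT / SIGTERM
    *)       echo "unknown" ;;
  esac
}

ci_outcome 0   # pass
ci_outcome 3   # fail
```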
For example, a CI step whose success depends directly on the verdict exit code:

```yaml
# Context mode (self-contained, no agent needed):
- name: Run scenario
  run: |
    thoughtjack run test.yaml \
      --context --context-provider openai --context-model gpt-4o \
      --context-api-key ${{ secrets.OPENAI_API_KEY }} \
      -o verdict.json
  continue-on-error: false

# Traffic mode requires a running agent as a service in the CI job.
# See the "Test Agent Frameworks" guide for setup.
```
## Context mode in CI

Context mode is self-contained: no agent infrastructure is needed. Set API credentials via environment variables:
```yaml
- name: Run context-mode scenario
  env:
    THOUGHTJACK_CONTEXT_API_KEY: ${{ secrets.OPENAI_API_KEY }}
    THOUGHTJACK_CONTEXT_MODEL: gpt-4o
  run: |
    thoughtjack scenarios run oatf-001 \
      --context \
      -o verdict.json \
      --export-trace trace.jsonl
```
Use `--max-turns` to control LLM API cost per run. Use `--context-timeout` to set a per-request deadline that matches your CI timeout budget.
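A sketch of a cost-bounded CI step using both flags; the numeric values are illustrative, and whether `--context-timeout` takes plain seconds or a duration string is an assumption to verify against `thoughtjack --help`:

```yaml
- name: Run context-mode scenario (cost-bounded)
  env:
    THOUGHTJACK_CONTEXT_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  run: |
    thoughtjack scenarios run oatf-001 \
      --context \
      --max-turns 10 \
      --context-timeout 60 \
      -o verdict.json
```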
## Prometheus metrics

For long-running test sessions, expose metrics:
```shell
thoughtjack run my_config.yaml \
  --mcp-server 127.0.0.1:8080 \
  --metrics-port 9090
```
Scrape `http://localhost:9090/metrics` for request counts, phase transitions, delivery durations, and error rates.
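If Prometheus itself does the scraping, a minimal scrape job pointed at the port above might look like this (the job name and interval are illustrative, not ThoughtJack defaults):

```yaml
# prometheus.yml fragment
scrape_configs:
  - job_name: thoughtjack
    scrape_interval: 15s
    static_configs:
      - targets: ['127.0.0.1:9090']
```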