Configuration
Problem YAML reference, global settings, budget controls, and optimization targets — everything you need to control how NightShift approaches your problem.
Problem YAML
The problem YAML file is the primary way to configure a run. Pass it to the solve command: nightshift solve problem.yaml.

    # problem.yaml: full reference
    title: SWOT analysis for EV conversion startup   # required
    description: |                                    # required
      Research the market for converting combustion cars to electric.
      Produce a comprehensive SWOT analysis with real data and citations.
      Include competitor landscape and regulatory environment.
    acceptance:              # required: what "done" means
      - Real market data with citations
      - Professional document with executive summary
      - At least 5 identified opportunities and 3 threats
      - Competitor analysis covering top 3 players
    optimize_for: quality    # quality | speed | cost
    max_budget_usd: 3.0      # hard cap (default: 1.0)
    max_attempts: 3          # max retry iterations (default: 3)
    timeout_minutes: 30      # hard time limit (default: 60)
    context:                 # optional: extra context for agents
      - This is for a Series A pitch deck, needs investor-grade quality
      - Focus on EU market, especially Germany and Netherlands
    kb_inject:               # optional: manually seed KB for this run
      # Quoted because a plain YAML scalar cannot contain ": "
      - "EV market in EU grew 35% in 2023 (source: ACEA)"
      - "Key regulatory milestone: ICE ban in EU by 2035"
    dry_run: false           # true = plan and simulate only, no agents executed
    background: false        # same as --bg flag

Field reference
| Field | Requirement | Description |
|---|---|---|
| title | required | Short description of the problem. Used for KB indexing and run identification. |
| description | required | Full problem statement. Be specific — the Coordinator reads this to plan the team. Vague descriptions produce vague plans. |
| acceptance | required | List of criteria that define "done." The Evaluator checks each criterion. If any are unmet, can_improve is set and the system may iterate. |
| optimize_for | optional | quality (default) — prioritize output quality. speed — minimize time. cost — minimize token spend. |
| max_budget_usd | optional | Hard budget cap in USD. Includes all LLM API calls across all agents. Default: 1.0. When reached, system outputs best current result. |
| max_attempts | optional | Maximum improvement iterations. Each attempt is a full pipeline execution. Default: 3. |
| timeout_minutes | optional | Hard wall-clock time limit. Default: 60 minutes. System outputs best result when reached. |
| context | optional | List of strings added to Coordinator's context. Use for constraints, audience, format requirements not captured in description. |
| kb_inject | optional | Pre-seed KB with known facts for this run. Useful when you have domain knowledge the system doesn't have yet. |
| dry_run | optional | If true, Coordinator plans and simulates the pipeline without executing agents. Use to preview the team before spending budget. |
| background | optional | Run in background, same as --bg flag. Status available via nightshift status. |
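The required/optional split and the defaults in the table can be sketched as a small validation step. This is a minimal illustration in Python, not NightShift's actual loader: the function name `normalize_problem` and its error messages are hypothetical, the defaults for `context`, `kb_inject`, `dry_run`, and `background` are assumed, and the input is a plain dict standing in for the parsed YAML.

```python
import copy

# Hypothetical sketch: validate a parsed problem.yaml (as a dict) and fill in
# the defaults listed in the field reference. Not NightShift's real loader.

REQUIRED = ("title", "description", "acceptance")

DEFAULTS = {
    "optimize_for": "quality",
    "max_budget_usd": 1.0,
    "max_attempts": 3,
    "timeout_minutes": 60,
    "context": [],        # assumed default: no extra context
    "kb_inject": [],      # assumed default: nothing pre-seeded
    "dry_run": False,     # assumed default
    "background": False,  # assumed default
}

VALID_TARGETS = {"quality", "speed", "cost"}


def normalize_problem(problem: dict) -> dict:
    """Return a copy of `problem` with defaults applied, or raise ValueError."""
    missing = [key for key in REQUIRED if not problem.get(key)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")

    if not isinstance(problem["acceptance"], list):
        raise ValueError("acceptance must be a list of criteria")

    # Values from the problem file win over the defaults.
    merged = {**copy.deepcopy(DEFAULTS), **problem}
    if merged["optimize_for"] not in VALID_TARGETS:
        raise ValueError(f"optimize_for must be one of {sorted(VALID_TARGETS)}")
    return merged


if __name__ == "__main__":
    problem = {
        "title": "SWOT analysis for EV conversion startup",
        "description": "Research the market for converting combustion cars to electric.",
        "acceptance": ["Real market data with citations"],
        "max_budget_usd": 3.0,
    }
    print(normalize_problem(problem)["timeout_minutes"])  # 60
```

Failing fast on a missing `acceptance` list matters most here, since the rest of this page treats the criteria as the definition of "done".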
optimize_for in depth
This setting shapes the Coordinator's planning strategy and the Investor's default risk posture.
quality (default)

    optimize_for: quality
    # Coordinator behavior:
    #   - More agents in pipeline
    #   - Validation steps included
    #   - Sub-evaluators spawned
    #   - More iterations allowed
    # Investor posture: willing to spend more for better results

speed

    optimize_for: speed
    # Coordinator behavior:
    #   - Minimal pipeline
    #   - Single-pass preferred
    #   - Skip validation steps
    #   - Use cached patterns
    # Investor posture: deliver quickly, accept lower quality

cost

    optimize_for: cost
    # Coordinator behavior:
    #   - Prefer Haiku over Sonnet
    #   - Smaller context windows
    #   - Minimal agents
    #   - No sub-evaluators
    # Investor posture: deliver from what we have, stop early

quality + budget cap

    optimize_for: quality
    max_budget_usd: 5.0
    # Best of both: system pursues quality but won't exceed $5.
    # Recommended for most tasks.
    # Budget is a safety net, not the primary constraint.

Global Configuration
Global settings live in ~/.nightshift/config.yaml. These apply to all runs unless overridden by problem YAML.

    # ~/.nightshift/config.yaml

    # Default model tiers
    models:
      coordinator: claude-sonnet-4-5   # planner (needs to be smart)
      worker: claude-sonnet-4-5        # execution agents
      evaluator: claude-sonnet-4-5     # judge
      librarian: claude-haiku-4-5      # KB consolidation (cheap)
      investor: claude-haiku-4-5       # pressure signals (cheap)
      auditor: claude-haiku-4-5        # anomaly detection (cheap)

    # Default budget
    default_budget_usd: 1.0

    # KB settings
    kb:
      embed_model: modernbert-base     # or modernbert-large
      max_results: 5                   # KB query result count
      consolidate_after_run: true
      global_kb_path: ~/.nightshift/kb/

    # Concurrency
    max_parallel_agents: 3             # agents that can run simultaneously

    # Monitoring
    status_update_interval_sec: 5
    events_max_file_size_mb: 10        # rotate events.jsonl after this

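The override rule above (global settings apply unless the problem YAML sets its own value) can be sketched as a layered lookup. The Python below is an illustration only: `effective_budget` is a hypothetical helper, and it assumes that `default_budget_usd` in the global config is the fallback for `max_budget_usd` in the problem YAML, which the key names suggest but this page does not state outright.

```python
# Hypothetical sketch of "global config applies unless problem YAML overrides".
# Both inputs are plain dicts standing in for the parsed YAML files.

BUILTIN_DEFAULT_BUDGET_USD = 1.0  # documented default


def effective_budget(global_config: dict, problem: dict) -> float:
    """Resolve the budget cap: problem YAML > global config > built-in default."""
    if "max_budget_usd" in problem:
        return float(problem["max_budget_usd"])
    # Assumption: default_budget_usd is the global fallback for max_budget_usd.
    return float(global_config.get("default_budget_usd", BUILTIN_DEFAULT_BUDGET_USD))


if __name__ == "__main__":
    global_config = {"default_budget_usd": 2.0}
    print(effective_budget(global_config, {"max_budget_usd": 3.0}))  # 3.0
    print(effective_budget(global_config, {}))                       # 2.0
    print(effective_budget({}, {}))                                  # 1.0
```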
Writing good acceptance criteria
Acceptance criteria are the most impactful part of your problem YAML. Vague criteria produce unreliable evaluation. Specific criteria produce reliable evaluation.
Avoid vague criteria: "Good quality analysis" is not a criterion. The Evaluator cannot objectively assess it. This leads to inconsistent scores and unreliable learning.
Good vs. bad criteria examples
| Bad | Good |
|---|---|
| Well-written document | Document has executive summary, body sections, and conclusion. No paragraph exceeds 150 words. |
| Fix the bug | All 47 existing tests pass. The divide-by-zero error is not reproducible with inputs from test_edge_cases.py. |
| Research the market | At least 5 data points with sources. Data is from 2022 or newer. At least 3 sources are primary (not summaries). |
| Comprehensive analysis | SWOT matrix has all 4 sections (strengths, weaknesses, opportunities, threats). Each section has at least 3 items. Each item has a one-line justification. |
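One way to see why the "Good" column works: each criterion can be turned into a mechanical check. The sketch below tests the "No paragraph exceeds 150 words" criterion in Python. It only illustrates what makes a criterion objectively assessable; it is not how the Evaluator works, and the function name and the blank-line paragraph convention are assumptions.

```python
import re


def paragraphs_over_limit(text: str, max_words: int = 150) -> list[int]:
    """Return 1-based indexes of paragraphs that exceed `max_words`.

    Paragraphs are assumed to be separated by one or more blank lines.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    return [
        index
        for index, paragraph in enumerate(paragraphs, start=1)
        if len(paragraph.split()) > max_words
    ]


if __name__ == "__main__":
    document = "Short executive summary.\n\n" + "word " * 151 + "\n\nShort conclusion."
    print(paragraphs_over_limit(document))  # [2]
```

No comparable check can be written for "Well-written document", which is the practical difference between the two columns.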
Budget controls
NightShift has three cost control mechanisms that work together:
- max_budget_usd — hard cap per run. When reached, the system gracefully stops and outputs the best current result. Never goes over.
- optimize_for: cost — changes the Coordinator's planning posture to minimize tokens at each step.
- max_attempts — caps improvement iterations. Each failed attempt costs tokens; fewer iterations = lower maximum cost.
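How these controls interact can be sketched as a loop that stops on whichever limit is hit first. This is a simplified Python model, not NightShift's scheduler: `run_attempt` is a hypothetical callback, and refusing to start an attempt whose cost ceiling no longer fits in the remaining budget is an assumption about how a hard cap could be enforced.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    score: float      # evaluator score for this attempt (higher is better)
    cost_usd: float   # what this attempt spent
    accepted: bool    # True if every acceptance criterion was met


def solve(
    run_attempt: Callable[[int], Attempt],
    attempt_cost_ceiling_usd: float,
    max_budget_usd: float = 1.0,
    max_attempts: int = 3,
) -> tuple[Optional[Attempt], float]:
    """Iterate until accepted, out of attempts, or out of budget.

    Returns the best attempt seen so far and the total spend. Assumes no
    single attempt spends more than `attempt_cost_ceiling_usd`.
    """
    best: Optional[Attempt] = None
    spent = 0.0
    for attempt_number in range(1, max_attempts + 1):
        # Assumed enforcement: do not start an attempt the budget cannot cover.
        if spent + attempt_cost_ceiling_usd > max_budget_usd:
            break
        result = run_attempt(attempt_number)
        spent += result.cost_usd
        if best is None or result.score > best.score:
            best = result
        if result.accepted:
            break
    return best, spent


if __name__ == "__main__":
    # Three scripted attempts at $0.40 each; only the third meets all criteria.
    scripted = {
        1: Attempt(0.5, 0.4, False),
        2: Attempt(0.7, 0.4, False),
        3: Attempt(0.9, 0.4, True),
    }
    best, spent = solve(scripted.__getitem__, attempt_cost_ceiling_usd=0.4)
    # The third attempt would push spend past the $1.00 cap, so it never runs.
    print(best.score, round(spent, 2))  # 0.7 0.8
```

In this model, lowering `max_attempts` reduces the worst-case spend even when the budget cap is never reached, which is the sense in which the last bullet above says fewer iterations mean a lower maximum cost.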
Cost estimation: A typical quality run costs $0.20–$1.50 depending on problem complexity. Use nightshift solve problem.yaml --dry-run to see the planned pipeline and estimated cost before committing.
Recommended budget by problem type
| Problem type | Suggested budget | Notes |
|---|---|---|
| Simple bug fix | $0.20 – $0.50 | Few agents, clear acceptance criteria |
| Code feature (< 200 lines) | $0.50 – $1.50 | Research + implementation + testing |
| Research / analysis | $1.00 – $3.00 | Needs multiple sources, sub-evaluators |
| Complex multi-file refactor | $1.50 – $4.00 | Many agents, multiple iterations |
| Full document (report, pitch) | $2.00 – $5.00 | Research + writing + refinement |
Environment variables

    # API key (required)
    ANTHROPIC_API_KEY=sk-ant-...

    # Override config file location
    NIGHTSHIFT_CONFIG=/path/to/config.yaml

    # Override global KB path
    NIGHTSHIFT_KB_PATH=/path/to/global/kb

    # Enable debug logging
    NIGHTSHIFT_DEBUG=1

    # Disable KB writes (read-only mode)
    NIGHTSHIFT_KB_READONLY=1

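The override variables can be read as "use the environment value if set, otherwise fall back to the documented default". A minimal Python sketch of that lookup follows. The helper name `resolve_settings` is hypothetical, treating any non-empty value of `NIGHTSHIFT_DEBUG` or `NIGHTSHIFT_KB_READONLY` as enabled is an assumption, and the fallback paths are the ones shown in the Global Configuration section.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve paths and flags from the environment (hypothetical helper)."""
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")

    # Documented default location for the config file and the global KB.
    home = Path(os.path.expanduser("~/.nightshift"))
    return {
        "config_path": Path(env.get("NIGHTSHIFT_CONFIG") or home / "config.yaml"),
        "kb_path": Path(env.get("NIGHTSHIFT_KB_PATH") or home / "kb"),
        # Assumption: any non-empty value turns the flag on.
        "debug": bool(env.get("NIGHTSHIFT_DEBUG")),
        "kb_readonly": bool(env.get("NIGHTSHIFT_KB_READONLY")),
    }


if __name__ == "__main__":
    settings = resolve_settings(
        {"ANTHROPIC_API_KEY": "sk-ant-...", "NIGHTSHIFT_KB_READONLY": "1"}
    )
    print(settings["config_path"])  # falls back to ~/.nightshift/config.yaml
    print(settings["kb_readonly"])  # True
```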