Configuration

Problem YAML reference, global settings, budget controls, and optimization targets — everything you need to control how NightShift approaches your problem.

Problem YAML

The problem YAML is the primary way to configure a run. Pass the file to the solve command: nightshift solve problem.yaml.

# problem.yaml — full reference

title: SWOT analysis for EV conversion startup   # required

description: |                                         # required
  Research the market for converting combustion cars to electric.
  Produce a comprehensive SWOT analysis with real data and citations.
  Include competitor landscape and regulatory environment.

acceptance:                                          # required — what "done" means
  - Real market data with citations
  - Professional document with executive summary
  - At least 5 identified opportunities and 3 threats
  - Competitor analysis covering top 3 players

optimize_for: quality                             # quality | speed | cost
max_budget_usd: 3.0                                 # hard cap (default: 1.0)
max_attempts: 3                                      # max retry iterations (default: 3)
timeout_minutes: 30                                  # hard time limit (default: 60)

context:                                             # optional — extra context for agents
  - This is for a Series A pitch deck, needs investor-grade quality
  - Focus on EU market, especially Germany and Netherlands

kb_inject:                                           # optional — manually seed KB for this run
  - EV market in EU grew 35% in 2023 (source: ACEA)
  - Key regulatory milestone: ICE ban in EU by 2035

dry_run: false                                       # true = plan and simulate only, no agents executed
background: false                                    # same as --bg flag

Field reference

Field | Type | Description
title | required | Short description of the problem. Used for KB indexing and run identification.
description | required | Full problem statement. Be specific: the Coordinator reads this to plan the team. Vague descriptions produce vague plans.
acceptance | required | List of criteria that define "done." The Evaluator checks each criterion. If any are unmet, can_improve is set and the system may iterate.
optimize_for | optional | quality (default): prioritize output quality. speed: minimize time. cost: minimize token spend.
max_budget_usd | optional | Hard budget cap in USD. Includes all LLM API calls across all agents. Default: 1.0. When reached, the system outputs the best current result.
max_attempts | optional | Maximum improvement iterations. Each attempt is a full pipeline execution. Default: 3.
timeout_minutes | optional | Hard wall-clock time limit. Default: 60 minutes. The system outputs the best result when reached.
context | optional | List of strings added to the Coordinator's context. Use for constraints, audience, and format requirements not captured in description.
kb_inject | optional | Pre-seed the KB with known facts for this run. Useful when you have domain knowledge the system doesn't have yet.
dry_run | optional | If true, the Coordinator plans and simulates the pipeline without executing agents. Use to preview the team before spending budget.
background | optional | Run in background, same as the --bg flag. Status available via nightshift status.
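
As a rough illustration of the table above, the smallest valid problem YAML would contain only the three required fields. The file name, task, and criteria below are hypothetical; the defaults listed in the trailing comments are the ones documented in the field reference.

```yaml
# minimal-problem.yaml (hypothetical file name and task)
title: Release notes summary for the changelog
description: |
  Summarize the entries in CHANGELOG.md into release notes for end users.
  Group changes into features, fixes, and breaking changes.
acceptance:
  - Every changelog entry appears under exactly one group
  - No bullet exceeds 30 words
  - Breaking changes are listed first

# Omitted optional fields fall back to the documented defaults:
#   optimize_for: quality
#   max_budget_usd: 1.0
#   max_attempts: 3
#   timeout_minutes: 60
```

A file like this would be passed to the solve command in the same way as the full reference example.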

optimize_for in depth

This setting shapes the Coordinator's planning strategy and the Investor's default risk posture.

quality (default)

optimize_for: quality

# Coordinator behavior:
# - More agents in pipeline
# - Validation steps included
# - Sub-evaluators spawned
# - More iterations allowed
# Investor posture: willing to spend more for better results

speed

optimize_for: speed

# Coordinator behavior:
# - Minimal pipeline
# - Single-pass preferred
# - Skip validation steps
# - Use cached patterns
# Investor posture: deliver quickly, accept lower quality

cost

optimize_for: cost

# Coordinator behavior:
# - Prefer Haiku over Sonnet
# - Smaller context windows
# - Minimal agents
# - No sub-evaluators
# Investor posture: deliver from what we have, stop early

quality + budget cap

optimize_for: quality
max_budget_usd: 5.0

# Best of both: system pursues quality but won't exceed $5.
# Recommended for most tasks.
# Budget is a safety net, not the primary constraint.

Global Configuration

Global settings live in ~/.nightshift/config.yaml. They apply to all runs unless overridden by the problem YAML.

# ~/.nightshift/config.yaml

# Default model tiers
models:
  coordinator: claude-sonnet-4-5    # planner (needs to be smart)
  worker: claude-sonnet-4-5         # execution agents
  evaluator: claude-sonnet-4-5      # judge
  librarian: claude-haiku-4-5       # KB consolidation (cheap)
  investor: claude-haiku-4-5        # pressure signals (cheap)
  auditor: claude-haiku-4-5         # anomaly detection (cheap)

# Default budget
default_budget_usd: 1.0

# KB settings
kb:
  embed_model: modernbert-base      # or modernbert-large
  max_results: 5                     # KB query result count
  consolidate_after_run: true
  global_kb_path: ~/.nightshift/kb/

# Concurrency
max_parallel_agents: 3              # agents that can run simultaneously

# Monitoring
status_update_interval_sec: 5
events_max_file_size_mb: 10         # rotate events.jsonl after this
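
The sketch below shows how a per-run override might look, written as two YAML documents for the two files. It assumes that max_budget_usd in the problem YAML is the per-run counterpart of default_budget_usd; both share the default of 1.0, but this page does not state which global keys can be overridden. The task and criteria are hypothetical.

```yaml
# ~/.nightshift/config.yaml (global default for every run)
default_budget_usd: 1.0
---
# problem.yaml (this run only)
title: Competitor pricing overview
description: |
  Collect current list prices for the top 3 competitors and summarize them.
acceptance:
  - Prices for exactly 3 competitors, each with a source
max_budget_usd: 3.0   # assumed to take precedence over default_budget_usd
```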

Writing good acceptance criteria

Acceptance criteria are the most impactful part of your problem YAML. The Evaluator checks each criterion, so specific, checkable criteria produce reliable evaluation and vague ones do not.

Avoid vague criteria: "Good quality analysis" is not a criterion. The Evaluator cannot objectively assess it. This leads to inconsistent scores and unreliable learning.

Good vs. bad criteria examples

Bad | Good
Well-written document | Document has executive summary, body sections, and conclusion. No paragraph exceeds 150 words.
Fix the bug | All 47 existing tests pass. The divide-by-zero error is not reproducible with inputs from test_edge_cases.py.
Research the market | At least 5 data points with sources. Data is from 2022 or newer. At least 3 sources are primary (not summaries).
Comprehensive analysis | SWOT matrix covers exactly 4 sections. Each section has at least 3 items. Each item has a one-line justification.
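
To show how the "Fix the bug" row translates into a problem YAML, here is a minimal sketch. The acceptance items come from the table; the title and description are hypothetical filler.

```yaml
title: Fix divide-by-zero error
description: |
  A divide-by-zero error occurs for some edge-case inputs.
  Find the cause and fix it without breaking existing behavior.
acceptance:
  # Vague version to avoid:
  #   - Fix the bug
  # Specific version, one checkable statement per item:
  - All 47 existing tests pass
  - The divide-by-zero error is not reproducible with inputs from test_edge_cases.py
```

Splitting a criterion into one statement per list item fits the per-criterion check described in the field reference, since each item can be judged as met or unmet on its own.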

Budget controls

NightShift has three cost control mechanisms that work together:

  1. max_budget_usd: hard cap per run. When it is reached, the system stops gracefully and outputs the best current result. The cap is never exceeded.
  2. optimize_for: cost: changes the Coordinator's planning posture to minimize tokens at each step.
  3. max_attempts: caps improvement iterations. Each failed attempt costs tokens, so fewer iterations mean a lower maximum cost.

Cost estimation: A typical quality run costs $0.20–$1.50 depending on problem complexity. Use nightshift solve problem.yaml --dry-run to see the planned pipeline and estimated cost before committing.
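
The three mechanisms can be combined in a single problem YAML. The fragment below is a sketch of a cost-sensitive setup; the specific values are illustrative assumptions, not recommendations from this page, and the required fields are omitted for brevity.

```yaml
# Fragment of a problem YAML: title, description, and acceptance omitted.
optimize_for: cost     # Coordinator plans for minimal token spend
max_budget_usd: 0.50   # hard cap; the best current result is output when reached
max_attempts: 2        # at most two full pipeline executions (default: 3)
```

Previewing the run with --dry-run before executing it shows whether the planned pipeline fits under the cap.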

Recommended budget by problem type

Problem type | Suggested budget | Notes
Simple bug fix | $0.20 – $0.50 | Few agents, clear acceptance criteria
Code feature (< 200 lines) | $0.50 – $1.50 | Research + implementation + testing
Research / analysis | $1.00 – $3.00 | Needs multiple sources, sub-evaluators
Complex multi-file refactor | $1.50 – $4.00 | Many agents, multiple iterations
Full document (report, pitch) | $2.00 – $5.00 | Research + writing + refinement

Environment variables

# API key (required)
ANTHROPIC_API_KEY=sk-ant-...

# Override config file location
NIGHTSHIFT_CONFIG=/path/to/config.yaml

# Override global KB path
NIGHTSHIFT_KB_PATH=/path/to/global/kb

# Enable debug logging
NIGHTSHIFT_DEBUG=1

# Disable KB writes (read-only mode)
NIGHTSHIFT_KB_READONLY=1