Checkpoint diff & SAE analysis

Post-training interpretability on real fine-tuned checkpoints. Compare a base model against a checkpoint with aquin diff weight (per-layer ‖ΔW‖, rank/collapse signals, merge verdict, optional behavioral probes), aquin diff sae (sparse feature activation deltas through the public SAE), and aquin diff residue (per-layer activation drift on probes). Bisect multi-checkpoint training runs with aquin check trajectory. For activation capture, temp SAE training, and alignment, see SAE training (/docs/sae-training). Works for LLM and embedding models. Requires a GPU and a loaded model (aquin load model); feature-level diffs also require aquin load sae. Results upload to your CLI inbox when logged in.

Prerequisite
LLM: aquin login · aquin load model llama-3.2-1b · aquin load sae llama-3.2-1b-l8
Embedding: aquin login · aquin load model gte-small · aquin load sae gte-small-l11

5 commands

aquin diff weight

agent tool: run_merge_analysis

Per-layer weight delta between the catalog base model and a fine-tuned checkpoint, plus rank/collapse signals, a merge verdict (pass | warn | fail), and an optional behavioral model-diff on probe generations (LLM only). LLMs report Q/K/V/O/MLP ‖ΔW‖ and stable rank (full weights via TransformerLens, or the effective LoRA update B@A). Embedding models diff HF encoder weights grouped by layer. Exits with code 2 when the verdict is fail. Tracks a mergeAnalysis card and uploads to your CLI inbox when logged in.
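The two weight-space signals named above can be illustrated on a toy matrix. This is a minimal sketch, not aquin's implementation: it assumes ‖ΔW‖ is the Frobenius norm and that stable rank follows the common definition ‖ΔW‖_F² / ‖ΔW‖₂², neither of which this page states. All function names are hypothetical, and the matrices are tiny nested lists rather than real model weights.

```python
import math

def matmul(a, b):
    """Plain-Python product of a (m x k) and b (k x n)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def frobenius_norm(m):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in m for x in row))

def spectral_norm(m, iters=100):
    """Largest singular value via power iteration on M^T M.

    Assumes the uniform start vector is not orthogonal to the top
    singular direction; good enough for a sketch.
    """
    rows, cols = len(m), len(m[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        mv = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        mtmv = [sum(m[i][j] * mv[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in mtmv))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in mtmv]
    mv = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in mv))

def delta_stats(delta):
    """Return the assumed per-layer signals for one weight delta."""
    fro = frobenius_norm(delta)
    spec = spectral_norm(delta)
    stable_rank = 0.0 if spec == 0.0 else (fro / spec) ** 2
    return {"delta_norm": fro, "stable_rank": stable_rank}

# Hypothetical rank-1 LoRA factors: the effective update is B @ A.
lora_b = [[1.0], [2.0], [2.0]]
lora_a = [[3.0, 4.0]]
lora_delta = matmul(lora_b, lora_a)
lora_stats = delta_stats(lora_delta)
```

A rank-1 update like the one above has a stable rank of about 1, which is the kind of low value a rank/collapse signal would flag; an identity-like delta scores close to its full dimension.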

Flag	Description
--checkpoint*	Path to merged .pt checkpoint or HF save_pretrained directory.
--name	Label for checkpoint in output and inbox card (default: filename stem).
--prompts	JSON array or JSONL probes for behavioral diff (LLM).
--no-behavioral	Skip generation-based behavioral scores (faster).
--save	Write schema_version=1 JSON export.

example

Uses the loaded model as base. No --model override.

aquin diff sae

agent tool: run_sae_diff

Load the catalog base model and a fine-tuned checkpoint, run the same prompts through the public SAE, and report per-feature activation deltas (changed count, mean/max |Δ|, top features). LLMs use TransformerLens residuals; embedding models use mean-pooled layer activations. Tracks a saeDiff card and uploads to your CLI inbox when logged in.
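To make the reported quantities concrete, here is a small sketch of how a changed count, mean/max |Δ|, and top features could be derived from two vectors of SAE feature activations. It assumes the activations have already been averaged over the probe set, and the `threshold` used to decide whether a feature "changed" is a hypothetical parameter, not a documented aquin setting.

```python
def summarize_feature_deltas(base_acts, ft_acts, threshold=1e-3, top_k=3):
    """Summarize per-feature activation deltas (fine-tuned minus base).

    base_acts / ft_acts: equal-length lists, one mean activation per
    SAE feature. threshold and top_k are illustrative defaults.
    """
    if len(base_acts) != len(ft_acts):
        raise ValueError("activation vectors must have the same length")
    deltas = [ft - base for base, ft in zip(base_acts, ft_acts)]
    abs_deltas = [abs(d) for d in deltas]
    # Feature indices ordered by largest absolute change first.
    order = sorted(range(len(deltas)), key=lambda i: abs_deltas[i], reverse=True)
    return {
        "n_changed": sum(1 for a in abs_deltas if a > threshold),
        "mean_abs_delta": sum(abs_deltas) / len(abs_deltas),
        "max_abs_delta": max(abs_deltas),
        "top_features": [(i, deltas[i]) for i in order[:top_k]],
    }

# Toy activations for four features.
summary = summarize_feature_deltas(
    base_acts=[0.0, 1.0, 0.5, 0.0],
    ft_acts=[0.0, 0.25, 1.5, 0.0],
    top_k=2,
)
```

Keeping the signed delta in `top_features` shows whether fine-tuning strengthened or suppressed a feature, while the ranking itself uses the absolute value.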

Flag	Description
--checkpoint*	Path to merged .pt checkpoint or HF save_pretrained directory.
--prompts	JSON array or JSONL of probe strings / {instruction, response} rows.
--layer	SAE layer (default: from model config).
--sae	Custom SAE weights path instead of pulled public SAE.
--name	Label for checkpoint in output and inbox card (default: checkpoint filename).
--save	Write full JSON payload to disk.
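The --prompts flag accepts either a JSON array or JSONL, with rows that are plain strings or {instruction, response} objects. The sketch below shows one way such input could be normalized into a flat list of probe strings. The function name is hypothetical, and joining instruction and response with a blank line is an assumption of this example, not documented aquin behavior.

```python
import json

def load_probes(text):
    """Parse probe text given as a JSON array or as JSONL.

    Rows may be strings or {"instruction": ..., "response": ...} objects.
    Object rows are joined with a blank line (an assumption of this sketch).
    """
    stripped = text.strip()
    if stripped.startswith("["):
        rows = json.loads(stripped)           # single JSON array
    else:
        rows = [json.loads(line)              # one JSON value per line
                for line in stripped.splitlines() if line.strip()]
    probes = []
    for row in rows:
        if isinstance(row, str):
            probes.append(row)
        elif isinstance(row, dict) and "instruction" in row:
            response = row.get("response", "")
            joined = row["instruction"] + ("\n\n" + response if response else "")
            probes.append(joined)
        else:
            raise ValueError(f"unsupported probe row: {row!r}")
    return probes

array_probes = load_probes('["What is 2+2?", "Name a prime."]')
jsonl_probes = load_probes(
    '"What is 2+2?"\n{"instruction": "Translate: hola", "response": "hello"}\n'
)
```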

example

Uses the loaded session model. Checkpoint format: { step, state_dict } from run.checkpoint() or fixtures/e2e/scripts/train_lora_e2e.py.

aquin diff residue

agent tool: run_residual_drift

Per-layer activation drift between the catalog base model and a fine-tuned checkpoint on the same probe set. LLMs compare last-token hook_resid_post cosine distance per layer; embedding models compare mean-pooled hidden states per encoder layer. Complements diff weight (weight space) and diff sae (sparse feature space). Tracks a residualDrift card and uploads to your CLI inbox when logged in.
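The drift metric described above reduces to a cosine distance between two activation vectors per layer. The following sketch shows that computation on toy vectors, along with the mean pooling step mentioned for embedding models. Function names are hypothetical, the vectors stand in for last-token residuals or pooled hidden states, and treating a zero vector as maximal distance is a convention of this sketch only.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity. Returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def mean_pool(token_states):
    """Average a list of per-token hidden states into one vector."""
    n = len(token_states)
    return [sum(tok[d] for tok in token_states) / n
            for d in range(len(token_states[0]))]

def per_layer_drift(base_layers, ft_layers):
    """Cosine distance per layer between base and fine-tuned activations."""
    return [cosine_distance(b, f) for b, f in zip(base_layers, ft_layers)]

# Toy example: layer 0 is only rescaled, layer 1 is rotated 90 degrees.
drift = per_layer_drift(
    base_layers=[[1.0, 2.0], [1.0, 0.0]],
    ft_layers=[[2.0, 4.0], [0.0, 1.0]],
)
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```

Because cosine distance ignores magnitude, a layer whose activations are merely rescaled shows near-zero drift, while a change in direction registers strongly.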

Flag	Description
--checkpoint*	Path to merged .pt checkpoint or HF save_pretrained directory.
--prompts	JSON array or JSONL probe strings (default: built-in short prompts).
--name	Label for checkpoint in output and inbox card (default: filename stem).
--save	Write schema_version=1 JSON export.

example

Uses the loaded model id for catalog base weights. No --model override.

aquin check trajectory

agent tool: run_trajectory_analysis

Training trajectory: diff weight summary for each checkpoint vs base, sorted by training step when present in the .pt file. Use --checkpoints <glob> or --dir to scan a checkpoint folder. Tracks a trajectoryAnalysis card and uploads to your CLI inbox when logged in.
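The ordering step, and the bisection it enables, can be sketched as follows. This is illustrative only: how aquin orders checkpoints that carry no step is not stated here, so placing them last in filename order is an assumption, and both function names are hypothetical. The verdict strings reuse the pass | warn | fail values from diff weight.

```python
from pathlib import Path

def order_checkpoints(entries):
    """Sort (path, step) pairs by step; entries with step=None go last.

    Falling back to filename order for missing steps is an assumption
    of this sketch.
    """
    def sort_key(entry):
        path, step = entry
        return (step is None, step if step is not None else 0, Path(path).name)
    return sorted(entries, key=sort_key)

def first_regression(ordered, verdicts):
    """Return the earliest checkpoint whose verdict is not 'pass', or None."""
    for path, step in ordered:
        if verdicts[path] != "pass":
            return path, step, verdicts[path]
    return None

# Hypothetical checkpoints with the step read from each file, if present.
ordered = order_checkpoints([
    ("step_300.pt", 300),
    ("final.pt", None),
    ("step_100.pt", 100),
])
culprit = first_regression(
    ordered,
    {"step_100.pt": "pass", "step_300.pt": "warn", "final.pt": "fail"},
)
```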

Flag	Description
--checkpoints	Glob of .pt checkpoints (e.g. ~/runs/checkpoints/step_*.pt).
--dir	Recursively scan directory for *.pt files.
--name	Optional prefix for step labels.
--save	Write schema_version=1 JSON export.

example

Provide exactly one of --checkpoints or --dir. Uses loaded model as base.

aquin simulate (saeDiff)

agent tool: run_simulation

At the end of aquin simulate, the pipeline runs an SAE diff between the base model and the NTK-linearized synthetic checkpoint. The stream logs a line of the form [simulate] SAE diff: … with nChanged / meanAbsDelta. See Simulation (LLM) for the full set of simulate flags.

example

The checkpoint here is synthetic, so the result is not the same as diff sae on a real LoRA checkpoint. See /docs/simulation/llm.

Typical workflow

After fine-tuning (your trainer + run.checkpoint(), or the E2E fixture train_lora_e2e.py), capture probes → diff → temp train → align. Activation capture and sae train live under SAE training.

post-training SAE pipeline

CLI inbox cards

Each command is tracked locally and auto-uploads to your CLI inbox when logged in. Attach outputs on aquin.app to view cards in the analyst sidebar:

diff sae: changed features, top deltas, base vs FT table
diff weight: per-layer ‖ΔW‖, rank/collapse signals, merge verdict, behavioral scores (LLM)
diff residue: per-layer cosine distance on activations (last-token resid or pooled hidden)
check trajectory: multi-checkpoint ‖ΔW‖ over training steps
sae train: layer, quick/full, output path
sae align: mean cosine, weakest/strongest decoder matches

vs simulate & watch

	aquin diff sae	aquin simulate	aquin watch
Checkpoint	Real merged .pt from training	Synthetic NTK-linearized weights	No weights. Metrics JSONL only
GPU	Required	Required	Not required
Web card	saeDiff	Simulation + saeDiff in stream	training.watch.*