designing intelligence
Full-stack AI observability: trace training data provenance, inspect model weights to find where specific behaviors and knowledge are stored, and edit them directly without fine-tuning or retraining.
attribution
Trace every response token back to the prompt tokens that caused it. See exactly how signal flows through layers to produce each word.
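The layer-patching idea behind this view can be sketched with a toy model: corrupt the input, then restore one layer's clean activation at a time and measure how much of the original output logit comes back. Everything below (the 4-layer linear residual model, the readout vector) is an illustrative stand-in, not Aquin's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-layer residual "model": each layer adds a linear transform's output.
n_layers, d = 4, 8
Ws = [rng.normal(scale=0.3, size=(d, d)) for _ in range(n_layers)]
readout = rng.normal(size=d)  # projects the final state to a target logit

def run(x, patch_layer=None, patch_value=None):
    """Forward pass; optionally overwrite one layer's residual state."""
    h = x.copy()
    states = []
    for i, W in enumerate(Ws):
        h = h + W @ h
        if i == patch_layer:
            h = patch_value.copy()
        states.append(h.copy())
    return readout @ h, states

clean_x = rng.normal(size=d)
corrupt_x = clean_x + rng.normal(scale=1.0, size=d)  # "noised" prompt

clean_logit, clean_states = run(clean_x)
corrupt_logit, _ = run(corrupt_x)

# Causal effect of each layer: patch the clean state into the corrupted
# run and see how much of the clean logit is recovered.
for i in range(n_layers):
    patched_logit, _ = run(corrupt_x, patch_layer=i, patch_value=clean_states[i])
    recovery = (patched_logit - corrupt_logit) / (clean_logit - corrupt_logit)
    print(f"layer {i}: recovery {recovery:+.2f}")
```

Layers with high recovery are the ones that "matter" for the output; patching the final layer recovers the clean logit exactly, by construction.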
Causal graph of how signal flows through layers. Thicker edges carry more weight. See which layers matter for any output.
edge weight = causal signal
logit lens
See what the model thinks at every layer as it builds toward a final answer. Watch a vague token sharpen into a confident prediction.
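The logit lens amounts to decoding every intermediate residual state with the final unembedding matrix. A minimal sketch with a toy vocabulary and random weights (all names and values illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

vocab = ["paris", "london", "rome", "berlin"]
d = 8
W_U = rng.normal(size=(len(vocab), d))                    # unembedding matrix
layers = [rng.normal(scale=0.2, size=(d, d)) for _ in range(4)]

h = rng.normal(size=d)  # residual state after the embedding
for i, W in enumerate(layers):
    h = h + W @ h
    logits = W_U @ h                      # decode the intermediate state
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    top = int(np.argmax(probs))
    print(f"layer {i}: top token '{vocab[top]}' p={probs[top]:.2f}")
```

Reading the per-layer distributions top to bottom shows the prediction sharpening (or flipping) as depth increases.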
diff
Connect any two checkpoints and see exactly what changed: which weights shifted, which dataset caused it, and whether the data behind it is clean.
Every dataset that contributed to the fine-tune, with license, jurisdiction, and status. Flagged sources link to the weights they affected.
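At its core, a checkpoint diff is a per-tensor drift report. A minimal sketch, assuming checkpoints are plain name-to-array dicts (the tensor names and the fine-tune perturbation are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two hypothetical checkpoints as name -> weight-array dicts.
base = {
    "layers.0.mlp.w": rng.normal(size=(16, 16)),
    "layers.1.mlp.w": rng.normal(size=(16, 16)),
    "embed.w": rng.normal(size=(32, 16)),
}
# Pretend the fine-tune nudged only layer 1's MLP.
tuned = {k: v.copy() for k, v in base.items()}
tuned["layers.1.mlp.w"] += rng.normal(scale=0.05, size=(16, 16))

def checkpoint_diff(a, b):
    """Relative L2 drift per tensor, largest first."""
    drift = {
        name: float(np.linalg.norm(b[name] - a[name]) / np.linalg.norm(a[name]))
        for name in a
    }
    return sorted(drift.items(), key=lambda kv: -kv[1])

for name, drift in checkpoint_diff(base, tuned):
    print(f"{name:20s} drift {drift:.4f}")
```

Ranking by relative drift surfaces where the fine-tune actually concentrated its changes, before any dataset attribution is layered on top.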
data provenance
Inspect the full training data record. Every source, under what license, from what jurisdiction, and whether synthetic data or translations are in the chain.
| source | license | jurisdiction | opt-out | synthetic | status |
|---|---|---|---|---|---|
| wikipedia_en_2023.parquet | CC BY-SA 4.0 | Global | no | no | clean |
| reddit_comments_filtered.jsonl | unknown | US | partial | no | review |
| gpt4_synthetic_qa.jsonl | OpenAI ToS | US | n/a | yes | flagged |
| pubmed_abstracts_2022.csv | NLM ToS | US | no | no | clean |
| translated_pile_fr.jsonl | derived | EU | unknown | no | flagged |
One flagged source propagates liability through every derived dataset. Paraphrases, translations, and synthetic augmentations all inherit the risk of their origin.
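That propagation rule can be sketched as reachability over the derivation graph: a dataset is flagged if it, or any ancestor, is flagged. The parent links and the `sft_mix_*` names below are hypothetical; the source names echo the table above.

```python
# Derivation graph: dataset -> list of datasets it was derived from.
parents = {
    "wikipedia_en_2023": [],
    "gpt4_synthetic_qa": [],
    "pile_raw": [],
    "translated_pile_fr": ["pile_raw"],
    "sft_mix_v2": ["wikipedia_en_2023", "gpt4_synthetic_qa"],
    "sft_mix_v2_fr": ["sft_mix_v2", "translated_pile_fr"],
}
flagged_at_source = {"gpt4_synthetic_qa", "translated_pile_fr"}

def is_flagged(name, seen=None):
    """True if the dataset or any ancestor is flagged at source."""
    seen = seen or set()
    if name in flagged_at_source:
        return True
    seen.add(name)
    return any(is_flagged(p, seen) for p in parents[name] if p not in seen)

for name in parents:
    print(f"{name:20s} {'flagged' if is_flagged(name) else 'clean'}")
```

A single flagged leaf taints every mix built on it, which is exactly why the status column has to be computed over the whole chain rather than per file.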
benchmarks
Three suites built into Aquin. Run them on any checkpoint, edit, or quantization pass. Know immediately whether a change made the model better or worse.
EditBench
edit fidelity: Surgical precision. Does the edit change only what you intended?
FineTuneDiff
checkpoint diff: What actually changed between base and fine-tuned at the weight level.
InterpScore
interpretability: How cleanly do features map to human-readable concepts?
| run | EditBench | FineTuneDiff | InterpScore | delta |
|---|---|---|---|---|
| llama-3.2-1b · base | 71 | 64 | 59 | baseline |
| llama-3.2-1b · sft-v1 | 78 | 79 | 63 | +9 avg |
| llama-3.2-1b · sft-v2 | 82 | 83 | 70 | +5 avg |
| llama-3.2-1b · int4-quant | 74 | 71 | 61 | -9 avg |
| llama-3.2-1b · rome-edit-1 | 94 | 88 | 73 | +14 avg |
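One way to act on these scores is a promotion gate on the average delta between runs. A sketch using the sft-v1 and sft-v2 rows from the table above (the gate and its tolerance are illustrative, not an Aquin feature):

```python
# Toy gate: promote a checkpoint only if its average score across the
# three suites does not regress versus the previous run.
def avg(scores):
    return sum(scores.values()) / len(scores)

def gate(prev, cand, tolerance=2.0):
    """Allow promotion if the average drops by at most `tolerance` points."""
    delta = avg(cand) - avg(prev)
    return delta >= -tolerance, round(delta, 1)

prev = {"EditBench": 78, "FineTuneDiff": 79, "InterpScore": 63}  # sft-v1
cand = {"EditBench": 82, "FineTuneDiff": 83, "InterpScore": 70}  # sft-v2
ok, delta = gate(prev, cand)
print(f"promote={ok} avg_delta={delta:+.1f}")  # promote=True avg_delta=+5.0
```

The same gate would catch the int4-quant regression in the table: its average falls well past a 2-point tolerance against sft-v2.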
human readability
Model internals are not inherently unreadable. Aquin translates activations, weights, and layer states into language an engineer can reason about.
| weight | raw | label |
|---|---|---|
| L14 · MLP W_out [2048,11] | 0.847 | capital city associations |
| L8 · attn head 3 · V | -0.312 | geographic suppression |
| L12 · MLP W_in [512,2048] | 0.601 | factual recall trigger |
| L6 · attn head 7 · Q | 0.229 | question parsing |
factual checks
Most models ship as black boxes. You have no way to know what they learned to suppress, amplify, or distort. Aquin surfaces it.
Trace which features consistently skew outputs along political, demographic, or cultural lines. See the weight, not just the symptom.
Find what the model refuses to say and why. Identify suppression circuits. See whether refusals are weight-level decisions or surface-level RLHF patches.
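A common way to locate a suppression circuit's direction is a difference of means between activations on refused and complied prompts. A toy sketch with a planted direction (all data synthetic; real pipelines collect activations from actual prompt pairs):

```python
import numpy as np

rng = np.random.default_rng(3)
d = 16

# Toy activations at one layer: refusals share a common direction
# that compliant responses lack.
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
refuse = rng.normal(size=(50, d)) + 3.0 * true_dir
comply = rng.normal(size=(50, d))

# Difference-of-means estimate of the refusal direction.
est = refuse.mean(axis=0) - comply.mean(axis=0)
est /= np.linalg.norm(est)

print(f"alignment with planted direction: {abs(est @ true_dir):.2f}")
```

If the recovered direction lives in specific weight rows, the refusal is a weight-level decision; if it only appears after RLHF-tuned layers, it is closer to a surface patch.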
aipedia
A living, community-indexed knowledge base of model features. Every behavior, every circuit, every weight pattern. Searchable. Citable. Growing.
| feature | model | layer | circuit | confidence |
|---|---|---|---|---|
| capital city recall | Llama 3.2 1B | L14 | MLP W_out [2048,11] | 94% |
| hedging language | Llama 3.2 1B | L8 | attn head 3 · V | 87% |
| geographic association | Mistral 7B | L11 | MLP W_in [512,2048] | 81% |
| refusal circuit | Gemma 2B | L9 | attn head 7 · Q | 76% |
weight editing
Locate the exact MLP layer encoding a fact. Overwrite it with a rank-one update. Validate with three independent checks. No retraining needed.
L12 carries 90.4% of the causal recovery signal. Red rings = above the 40% threshold.
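The rank-one update itself is a one-liner. A simplified, identity-covariance sketch (ROME-style editors whiten the key with a covariance estimate; the shapes and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
d_in, d_out = 8, 6

W = rng.normal(size=(d_out, d_in))   # an MLP output projection
k = rng.normal(size=d_in)            # key vector that triggers the fact
v_new = rng.normal(size=d_out)       # value the edited fact should produce

# Rank-one update: change W's response to k, touching only the k direction.
W_edited = W + np.outer(v_new - W @ k, k) / (k @ k)

# The edit lands exactly on the new value for the target key...
assert np.allclose(W_edited @ k, v_new)
# ...and leaves directions orthogonal to k untouched.
k_orth = rng.normal(size=d_in)
k_orth -= (k_orth @ k) / (k @ k) * k
assert np.allclose(W_edited @ k_orth, W @ k_orth)
print("edit verified")
```

The two assertions are the cheap version of the validation step: the target fact changes, and orthogonal behavior provably does not.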
Not sure if Aquin is right for you?
