Inspection: Embedding (non-SAE)
Geometry and activation tools for encoder models that work without sparse autoencoder (SAE) decomposition: layer drift, isotropy, out-of-distribution (OOD) separation, attention, and token attribution. These commands require embedding mode, so load an embedding model first.
10 commands
aquin embed-layer-drift
agent tool: run_embed_layer_drift
Traces the L2 norm of the pooled embedding at each encoder layer. The layer with the steepest change in norm marks where the representation shifts most. The optional ref_text overlays a second curve for comparison.
| Flag | Description |
|---|---|
| --text* | Primary input text. |
| --ref_text | Reference text for overlay comparison. |
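The drift curve described above can be sketched in a few lines. This is a minimal illustration, not aquin's implementation: it assumes you already have one pooled embedding per layer as a plain list of floats, and the function names `l2_norm` and `layer_drift` are hypothetical.

```python
import math

def l2_norm(vec):
    """Euclidean length of one pooled embedding."""
    return math.sqrt(sum(x * x for x in vec))

def layer_drift(layer_embeddings):
    """Return per-layer L2 norms and the index of the layer with the largest jump."""
    norms = [l2_norm(v) for v in layer_embeddings]
    # Absolute change in norm between consecutive layers.
    deltas = [abs(b - a) for a, b in zip(norms, norms[1:])]
    steepest = deltas.index(max(deltas)) + 1 if deltas else None
    return norms, steepest

# Hypothetical pooled embeddings for four layers (toy 2-d vectors).
layers = [[3.0, 4.0], [3.0, 4.0], [6.0, 8.0], [6.0, 8.0]]
norms, steepest = layer_drift(layers)
print(norms, steepest)
```

Running the same function on the embeddings of a reference text gives the second curve to overlay.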
aquin embed-layer-analysis
agent tool: run_embed_layer_analysis
Combined pass: layer drift, isotropy, OOD separation, and embedding consistency in one call. Prefer this when you want a full layer health report without running four separate commands.
| Flag | Description |
|---|---|
| --text* | Primary text. |
| --ref_text | Reference for drift overlay. |
| --texts | JSON array of strings for isotropy and space analysis. |
| --in_texts / --ood_texts | JSON arrays of in-distribution and OOD texts. |
| --paraphrases | Paraphrases for consistency scoring. |
aquin embed-isotropy
agent tool: run_embed_isotropy
Computes isotropy (how uniformly embeddings occupy directions in space) and spectral entropy per layer. Low isotropy means the space is anisotropic: a few directions dominate, which hurts retrieval diversity.
| Flag | Description |
|---|---|
| --texts* | JSON array of strings to encode. |
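One common way to define spectral entropy is the normalized Shannon entropy of the covariance eigenvalue spectrum. The sketch below assumes that definition and assumes the eigenvalues have already been computed elsewhere; how aquin computes the metric is not stated here, and `spectral_entropy` is a hypothetical name.

```python
import math

def spectral_entropy(eigenvalues):
    """Normalized Shannon entropy of a variance spectrum, in [0, 1].

    1.0 means variance is spread evenly over all directions (isotropic);
    values near 0 mean a few directions dominate (anisotropic).
    """
    positive = [v for v in eigenvalues if v > 0]
    if len(eigenvalues) < 2 or not positive:
        return 0.0
    total = sum(positive)
    entropy = -sum((v / total) * math.log(v / total) for v in positive)
    # Divide by the maximum possible entropy for this many directions.
    return entropy / math.log(len(eigenvalues))

flat = spectral_entropy([1.0, 1.0, 1.0, 1.0])    # evenly spread spectrum
peaked = spectral_entropy([1.0, 0.0, 0.0, 0.0])  # one dominant direction
```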
aquin embed-ood
agent tool: run_embed_ood
Compares in-distribution and out-of-distribution (OOD) text embeddings at each layer. Reports a separation score: how well the model's geometry distinguishes familiar from unfamiliar input.
| Flag | Description |
|---|---|
| --in_texts* | JSON array of in-domain texts. |
| --ood_texts* | JSON array of OOD texts. |
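The exact separation formula is not given above. As an illustrative stand-in, the sketch below divides the distance between the two set centroids by the average within-set spread, so larger values mean the sets sit further apart relative to their own scatter. All function names are hypothetical, and the inputs are assumed to be embeddings from a single layer.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def mean_spread(vectors, center):
    """Average Euclidean distance from each vector to the given center."""
    return sum(math.dist(v, center) for v in vectors) / len(vectors)

def separation_score(in_vecs, ood_vecs):
    """Centroid gap divided by mean within-set spread (assumed metric)."""
    c_in, c_ood = centroid(in_vecs), centroid(ood_vecs)
    spread = (mean_spread(in_vecs, c_in) + mean_spread(ood_vecs, c_ood)) / 2
    gap = math.dist(c_in, c_ood)
    return gap / spread if spread > 0 else math.inf

# Toy 2-d embeddings: two tight clusters ten units apart.
score = separation_score([[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]])
```

Computing the score once per layer would show where in-domain and OOD inputs start to diverge.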
aquin embed-attention
agent tool: run_embed_attention
Extracts per-head attention matrices across all encoder layers for a single input. Shows which tokens the encoder attends to when building the pooled embedding.
| Flag | Description |
|---|---|
| --text* | Input text. |
aquin embed-matrix
agent tool: run_embed_matrix
Encodes multiple texts and renders their N×N cosine similarity matrix. Useful for sanity-checking whether related sentences cluster and unrelated ones separate.
| Flag | Description |
|---|---|
| --texts* | JSON array of strings. |
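For reference, an N×N cosine similarity matrix can be computed as follows. This is a minimal sketch over vectors that are already encoded; the function names are illustrative and the toy vectors stand in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_matrix(vectors):
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine(a, b) for b in vectors] for a in vectors]

# Toy embeddings: two orthogonal vectors and one halfway between them.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = similarity_matrix(vectors)
```

The matrix is symmetric with ones on the diagonal, so related sentences should show up as bright off-diagonal blocks.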
aquin embed-space
agent tool: run_embed_space_analysis
Measures anisotropy (variance concentration along principal axes) and intrinsic dimensionality of the embedding cloud. High anisotropy indicates collapse onto a few directions.
| Flag | Description |
|---|---|
| --texts* | JSON array of strings. |
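One simple estimate of intrinsic dimensionality is the participation ratio of the covariance matrix, (tr C)² / tr(C²), which can be computed without an eigendecomposition. The sketch below uses that estimator as an assumption; the estimator aquin actually uses is not specified here, and `participation_ratio` is a hypothetical name.

```python
def participation_ratio(vectors):
    """Effective number of directions carrying variance: (tr C)^2 / tr(C^2)."""
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - means[j] for j in range(d)] for v in vectors]
    # Covariance matrix of the centered cloud.
    cov = [[sum(row[i] * row[j] for row in centered) / n for j in range(d)]
           for i in range(d)]
    trace = sum(cov[i][i] for i in range(d))
    # For a symmetric matrix, tr(C^2) equals the sum of squared entries.
    frob_sq = sum(c * c for row in cov for c in row)
    if frob_sq == 0:
        raise ValueError("all vectors are identical; no variance to analyze")
    return trace * trace / frob_sq

# Collapsed cloud: every point lies on one line.
collapsed = participation_ratio([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0],
                                 [3.0, 3.0, 3.0], [-1.0, -1.0, -1.0]])
# Spread cloud: equal variance along each of three axes.
spread = participation_ratio([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0], [0.0, -1.0, 0.0],
                              [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
```

A value near 1 corresponds to the collapse case described above, while a value near the embedding dimension indicates variance spread across all axes.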
aquin embed-attribution
agent tool: run_embed_attribution
Runs integrated-gradients token attribution on the final embedding. Each input token is scored by how much it contributes to the embedding vector; this is the embedding analogue of LLM token attribution.
| Flag | Description |
|---|---|
| --text* | Input text. |
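Integrated gradients averages the gradient of a scalar score along a straight path from a baseline to the input, then scales by the input-minus-baseline difference. The sketch below shows the technique on a toy scalar function with finite-difference gradients. A real encoder would use autograd over token embeddings and sum attributions per token; the `score` function, the zero baseline, and all names are assumptions for illustration.

```python
def numeric_grad(f, x, eps=1e-5):
    """Central-difference gradient of scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def integrated_gradients(f, x, baseline, steps=50):
    """Midpoint-rule approximation of integrated gradients for each feature."""
    totals = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        # Point on the straight path from baseline to input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(numeric_grad(f, point)):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]

def score(x):
    """Toy scalar target standing in for a similarity to the final embedding."""
    return x[0] ** 2 + 3.0 * x[1]

attributions = integrated_gradients(score, [2.0, 1.0], [0.0, 0.0])
```

The attributions should sum to approximately `score(input) - score(baseline)`, which is the completeness property that makes integrated gradients easy to sanity-check.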
aquin embed-perturbation
agent tool: run_embed_perturbation
Perturbs individual tokens (mask, swap, delete) and measures cosine shift in the output embedding. Identifies tokens the representation is most sensitive to.
| Flag | Description |
|---|---|
| --text* | Input text. |
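The deletion variant of this idea can be sketched as follows: remove one token at a time, re-embed, and record one minus the cosine similarity to the original embedding. The toy `embed` function (a mean over a hard-coded vector table) is purely a stand-in for the loaded encoder, and all names are hypothetical.

```python
import math

# Hypothetical per-token vectors standing in for a real encoder.
TOY_VECTORS = {"the": [0.1, 0.1], "cat": [1.0, 0.0], "sat": [0.0, 2.0]}

def embed(tokens):
    """Toy encoder: mean of the per-token vectors."""
    vecs = [TOY_VECTORS[t] for t in tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def deletion_sensitivity(tokens):
    """Rank tokens by the cosine shift caused by deleting each one."""
    base = embed(tokens)
    shifts = []
    for i, tok in enumerate(tokens):
        perturbed = tokens[:i] + tokens[i + 1:]
        shifts.append((tok, 1.0 - cosine(base, embed(perturbed))))
    return sorted(shifts, key=lambda pair: pair[1], reverse=True)

ranked = deletion_sensitivity(["the", "cat", "sat"])
```

Masking or swapping works the same way: only the line that builds `perturbed` changes.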
aquin embed-retrieval
agent tool: run_embed_retrieval
Encodes a query and corpus, ranks documents by cosine similarity, and reports Recall@k metrics. Tests whether the loaded encoder retrieves the right documents for a query set.
| Flag | Description |
|---|---|
| --query | Single query string. |
| --corpus | JSON array of document strings. |
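Ranking by cosine similarity and scoring with Recall@k can be sketched as below. Recall@k needs a set of known-relevant documents for the query; how that set is supplied to the command is not documented here, so the `relevant_ids` parameter and the function names are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_corpus(query_vec, corpus_vecs):
    """Return corpus indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, doc) for doc in corpus_vecs]
    return sorted(range(len(corpus_vecs)), key=lambda i: scores[i], reverse=True)

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Toy embeddings: document 0 matches the query, document 2 is close, document 1 is not.
ranking = rank_corpus([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
recall = recall_at_k(ranking, relevant_ids={0, 1}, k=2)
```

Here document 1 is labeled relevant but ranks last, so only one of the two relevant documents lands in the top 2.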
