Vibe Code LLMs

Extremely easy, fast, and low-cost, with complete privacy and safety. Build your own top-tier models with just a few clicks, a few minutes, and a few prompts.
New Tab
Project Discussion
Research Notes
I need to create a training dataset for a general-purpose AI assistant. Something with diverse knowledge across different topics.
Great choice! For a general-purpose model, you'll want diverse question-answer pairs covering everyday knowledge. Start with 200-500 high-quality examples across categories like science, history, practical how-tos, creative tasks, and general facts. What's your target audience?
General users, nothing too technical. Mix of factual questions, creative prompts, and practical advice. Should I use a specific format?
Keep it simple - alternating question/answer format works best. Mix short factual answers with longer explanations. For training, you can export as TXT, JSON, JSONL, or CSV depending on your framework. The model will learn the natural conversational pattern from your examples.
001 | [INFO] Loading Llama-2-7B base model...
002 | [INFO] Applying LoRA config: rank=8, alpha=16, dropout=0.1
003 | [INFO] Dataset loaded: "general-qa-v1" - 412 examples
004 | [WARN] Found 2 examples with token length > 1024, truncating...
005 | [INFO] Hyperparameters: lr=3e-5, batch_size=8, warmup=30 steps
006 | [INFO] Starting training run...
007 | [EPOCH 1/3] Step 25/52 - Loss: 2.341 | LR: 2.5e-5 | grad_norm: 0.68
008 | [EPOCH 1/3] Step 50/52 - Loss: 1.687 | LR: 3e-5 | grad_norm: 0.92
009 | [EPOCH 1/3] Complete - Avg Loss: 1.823 | Time: 3m 18s
010 | [EVAL] Val Loss: 1.645 | Perplexity: 5.18
011 | [EPOCH 2/3] Step 25/52 - Loss: 1.234 | LR: 3e-5 | grad_norm: 0.71
012 | [EPOCH 2/3] Step 50/52 - Loss: 0.956 | LR: 3e-5 | grad_norm: 0.83
013 | [EPOCH 2/3] Complete - Avg Loss: 1.089 | Time: 3m 15s
014 | [EVAL] Val Loss: 1.012 | Perplexity: 2.75
015 | [SUCCESS] Training converged successfully!
016 | [SUCCESS] Checkpoint saved: models/general-qa-lora-epoch2.safetensors
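
For readers who want to reproduce a run like this outside the app, here is a minimal sketch of an equivalent configuration using the Hugging Face transformers, peft, and datasets libraries. The base-model id and the dataset file name are illustrative assumptions, not the product's internals:

import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # illustrative Llama-2-7B checkpoint id
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# LoRA config from the log: rank=8, alpha=16, dropout=0.1
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM"))

# "general-qa-v1.jsonl" is a hypothetical JSONL export of the dataset
data = load_dataset("json", data_files="general-qa-v1.jsonl")["train"]
data = data.map(
    lambda ex: tokenizer(f"Q: {ex['question']}\nA: {ex['answer']}",
                         truncation=True, max_length=1024),  # matches the WARN line
    remove_columns=data.column_names)

# Hyperparameters from the log: lr=3e-5, batch_size=8, warmup=30 steps
args = TrainingArguments(output_dir="models", learning_rate=3e-5,
                         per_device_train_batch_size=8, warmup_steps=30,
                         num_train_epochs=3, logging_steps=25)
Trainer(model=model, args=args, train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)).train()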

easily customize all specs

Device: RTX 4090
VRAM: 24 GB
RAM: 32 GB
CPU: 16 Cores
microsoft/phi-2

use ai to make ai


editor with rich integrations

training in minutes


no lagging


advanced ml methods
simplified training

fine-tuning

Full fine-tuning updates all model weights: the original weight matrix W is replaced by an updated W'.

LoRA

Low-Rank Adaptation: the base weights W stay frozen while a low-rank update ΔW = B × A is trained, where B is d×r, A is r×k, and ΔW is d×k, with rank r much smaller than d and k.
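
As a quick numerical illustration (a toy sketch with hypothetical shapes, not the app's internals), the parameter savings fall directly out of the factorization:

import numpy as np

d, k, r = 512, 512, 8              # adapter rank r << min(d, k)

W = np.random.randn(d, k) * 0.02   # frozen base weight, never updated
B = np.zeros((d, r))               # trainable, zero-initialized so ΔW starts at 0
A = np.random.randn(r, k) * 0.01   # trainable

delta_W = B @ A                    # low-rank update, shape d x k
W_adapted = W + delta_W            # effective weight used at inference

# Only B and A are trained: d*r + r*k parameters instead of d*k
print(f"full fine-tuning: {d*k:,} params, LoRA: {d*r + r*k:,} params")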

QLoRA

Quantized Low-Rank Adaptation: the base model weights W are quantized to 4-bit and kept frozen; only the small adapter matrices B and A are trained on top.
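
In the Hugging Face ecosystem this combination is typically expressed with bitsandbytes 4-bit quantization plus a peft adapter; a minimal sketch (the model id is an illustrative assumption):

import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the frozen base weights to 4-bit NF4, compute in bfloat16
bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb)

# Only the small LoRA matrices B and A are trained on the 4-bit base
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM"))
model.print_trainable_parameters()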

RAG

Retrieval-Augmented Generation (works with third-party LLMs as well): a query is matched against a knowledge base of documents, the retrieved context is combined with the query, and the LLM generates a response grounded in that context.
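
A toy sketch of that pipeline (embed() is a stand-in for a real sentence-embedding model, which you would need in practice since random vectors retrieve arbitrarily):

import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedding: hash-seeded random unit vector.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

documents = [
    "Paris is the capital of France.",
    "Photosynthesis converts sunlight into glucose.",
    "LoRA trains small low-rank adapter matrices.",
]
doc_vectors = np.stack([embed(d) for d in documents])

query = "What does LoRA train?"
scores = doc_vectors @ embed(query)          # cosine similarity (unit vectors)
context = documents[int(np.argmax(scores))]  # retrieve best-matching document

# The retrieved context is prepended to the query and sent to any LLM
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
print(prompt)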

train in any format
automatic formatting with ai

JSON

Export your training data as structured JSON arrays with full metadata and nested properties for maximum flexibility.

JSONL

Export as JSON Lines format - one complete JSON object per line, ideal for streaming and processing large datasets efficiently.

CSV

Export as comma-separated values with headers - simple, lightweight, and universally compatible with spreadsheets and databases.

TXT

Export as plain text with custom delimiters - perfect for simple question-answer pairs and human-readable training data.

[
  {
    "question": "What is the capital of France?",
    "answer": "The capital of France is Paris, a major European city and a global center for art, fashion, gastronomy and culture.",
    "category": "geography",
    "difficulty": "easy",
    "source": "world_facts_v2"
  },
  {
    "question": "How does photosynthesis work?",
    "answer": "Photosynthesis is the process by which plants use sunlight, water, and carbon dioxide to create glucose and oxygen. This occurs primarily in the chloroplasts of plant cells.",
    "category": "science",
    "difficulty": "medium",
    "source": "biology_basics"
  },
  {
    "question": "Write a haiku about coding",
    "answer": "Code flows like water\nDebugging through the night\nCoffee keeps me alive",
    "category": "creative",
    "difficulty": "medium",
    "source": "creative_writing"
  },
  {
    "question": "What is machine learning?",
    "answer": "Machine learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. It uses algorithms to parse data, learn from it, and make predictions.",
    "category": "technology",
    "difficulty": "medium",
    "source": "ai_fundamentals"
  },
  {
    "question": "Explain the theory of relativity in simple terms",
    "answer": "Einstein's theory of relativity shows that space and time are interwoven and relative to the observer. The faster you move through space, the slower you move through time. Massive objects also bend spacetime, creating what we experience as gravity.",
    "category": "science",
    "difficulty": "hard",
    "source": "physics_explained"
  },
  {
    "question": "What are the main differences between Python and JavaScript?",
    "answer": "Python is primarily used for backend development, data science, and automation, with indentation-based syntax. JavaScript is mainly for web development, running in browsers, with curly-brace syntax. Python is interpreted and strongly typed, while JavaScript is also interpreted but weakly typed.",
    "category": "programming",
    "difficulty": "medium",
    "source": "coding_comparisons"
  },
  {
    "question": "Describe the water cycle",
    "answer": "The water cycle is the continuous movement of water on, above, and below Earth's surface. It includes evaporation, condensation, precipitation, and collection. Water evaporates from surfaces, forms clouds, falls as rain or snow, and collects in bodies of water.",
    "category": "science",
    "difficulty": "easy",
    "source": "earth_science"
  },
  {
    "question": "What is blockchain technology?",
    "answer": "Blockchain is a distributed ledger technology that records transactions across multiple computers in a way that makes them nearly impossible to alter retroactively. Each block contains transaction data, a timestamp, and is cryptographically linked to the previous block.",
    "category": "technology",
    "difficulty": "hard",
    "source": "fintech_guide"
  }
]
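
Converting between the formats above is mechanical; a small sketch using only the Python standard library (the file names are illustrative):

import csv
import json

with open("dataset.json") as f:          # the JSON array shown above
    examples = json.load(f)

with open("dataset.jsonl", "w") as f:    # JSONL: one object per line
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

with open("dataset.csv", "w", newline="") as f:  # CSV with headers
    writer = csv.DictWriter(f, fieldnames=examples[0].keys())
    writer.writeheader()
    writer.writerows(examples)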

Strong memory, context and dataset handling

Hugging Face

Browse, download, use, and train any model from Hugging Face with your datasets using fine-tuning, LoRA, or QLoRA, or use the RAG system.

Hugging Face mockup
Anthropic

Select from the provided Claude models. Use your datasets with a strong RAG system for long-term memory and large context.

Anthropic mockup
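
With the official anthropic Python SDK, feeding retrieved context to a Claude model looks roughly like this (the model id and context string are illustrative):

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
context = "Paris is the capital of France."  # e.g. retrieved by the RAG step

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user",
               "content": f"Context: {context}\n\nQuestion: What is the capital of France?"}],
)
print(message.content[0].text)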
Ollama

Connect to and use any model downloaded through Ollama. Use your datasets with a strong RAG system for long-term memory and large context.

Ollama mockup
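
Ollama exposes a local HTTP API on port 11434 by default; a minimal sketch that assumes the llama2 model has already been pulled:

import json
import urllib.request

payload = {"model": "llama2",
           "prompt": "Why is the sky blue?",
           "stream": False}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])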

Write datasets in an editor with rich integrations: Google services, files, URLs, browser, audio, and screen context.

Upload Any File
Upload sheets, docs, PDFs, images, code, or any other file type, even zips.
pitch.pdf
picnic.png
math.zip
presentation.pptx
budget.xlsx
notes.txt
README.md
app.js
styles.css
main.py
demo.mp4
contract.docx
Browser Tabs & URLs
Attach open browser tabs, URLs, and YouTube links as context.
aquin.app
github.com
claude.ai
youtube.com
Record Audio
Record your mic and your system audio for speech-to-text.

System, Browser & Google Connections

File Manager MCP

Let Aquin fully manage file paths and use them as context for datasets or RAG.

File Manager MCP mockup

Clipboard

Use your clipboard history from your system as context for datasets or RAG.

Clipboard mockup

Browser History

Use the browsing history from your connected browser as context for datasets or RAG.

Browser History mockup
Gmail

Use emails from your connected Gmail account as context for datasets or RAG.

Gmail mockup
Google Calendar

Use your events from your connected calendar as context for datasets or RAG.

Google Calendar mockup
Google Drive

Use imported files from your connected Google Drive as context for datasets or RAG.

Google Drive mockup

Not sure if Aquin is right for you?