Fine-tuning
Fine-tuning adapts a pre-trained LLM to a specific task or domain by continuing training on a smaller, targeted dataset.
The base model already "knows" language from pretraining; fine-tuning continues training on a smaller, focused dataset so the model answers in a specific format, adopts a voice, or handles a narrow workload (like your company's ticketing system) better than generic prompting could.
Why it matters
Before fine-tuning, strong prompt engineering and RAG are cheaper, faster to iterate on, and don't lock you to a frozen model checkpoint. Most teams should exhaust those first. Fine-tuning wins when:
- The task has a consistent format you can't reliably prompt into
- Quality matters more than flexibility
- You need to shave tokens from each call by not re-specifying the task
- You're running a smaller self-hosted model and need it to follow instructions reliably
For agentic coding, most users never fine-tune — they use off-the-shelf Claude, GPT, or Qwen through CLIs like Claude Code, Codex CLI, and Qwen Code. Fine-tuning is usually the domain of platform teams building a specialized AI product.
How it works
Fine-tuning methods vary along two axes: how much of the model you update, and what objective you train on.
What you update, from most to least expensive:
- Full fine-tuning — update all model weights. Needs large GPUs and a lot of data. Rare outside labs.
- LoRA / QLoRA — train small low-rank adapter matrices on top of frozen base weights. Fast, cheap, runs on consumer hardware.
- Prefix / prompt tuning — learn soft prompt vectors rather than touching weights. Even lighter.
What you train on:
- Supervised fine-tuning (SFT) — standard next-token loss on input/output pairs.
- RLHF / DPO / ORPO — preference-based training on pairs of preferred and rejected responses.
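The LoRA idea above can be sketched in a few lines of NumPy: keep the pretrained weight frozen, train only two small low-rank matrices, and merge them back after training. Shapes, rank, and the single-matrix setup are toy illustrations; real adapters attach to attention and MLP projections throughout the transformer.

```python
import numpy as np

d_out, d_in, rank = 1024, 1024, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-init so the update starts at 0

def forward(x):
    # Base path plus low-rank update; only A and B would receive gradients.
    return W @ x + B @ (A @ x)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.4f}")  # ~1.6% of full fine-tuning

# After training, the adapter can be merged for zero-overhead inference:
W_merged = W + B @ A
```

The trainable-parameter fraction is what makes LoRA cheap: at rank 8 here, you update roughly 1.6% of the weights you would touch with full fine-tuning, and the merged matrix behaves identically at inference time.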
After training, the adapter or full checkpoint is deployed. You call it like any other model.
How it's used
Typical fine-tuning projects:
- Domain-specific copilot — medical, legal, scientific text
- Structured output — consistent JSON shape without heavy schema prompts
- Style matching — corporate voice, documentation tone
- Small-model specialization — LoRA a 7B model into something useful for one narrow task
Most coding CLIs don't need fine-tuning — frontier base models already handle code well. If your team runs a private model, fine-tuning on internal code can be worthwhile.
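For a structured-output project like the ticketing example, the work is mostly dataset preparation. A minimal sketch of building SFT training data as JSONL, assuming a common prompt/completion schema (field names vary by training framework) and made-up ticket labels:

```python
import json

# Hypothetical labeled tickets; in a real project these labels are human-written.
tickets = [
    {"subject": "Password reset loop", "body": "User stuck on reset page.", "category": "auth"},
    {"subject": "Charged twice", "body": "Duplicate charge this month.", "category": "billing"},
]

def to_example(t):
    prompt = f"Classify this support ticket.\nSubject: {t['subject']}\nBody: {t['body']}\n"
    # The completion is the exact JSON shape you want the tuned model to emit.
    completion = json.dumps({"category": t["category"]})
    return {"prompt": prompt, "completion": completion}

# One JSON object per line — the JSONL format most training tooling expects.
lines = [json.dumps(to_example(t)) for t in tickets]
print(lines[0])
```

Because the target completions are always well-formed JSON in one fixed shape, the tuned model learns to emit that shape without the long schema description you would otherwise repeat in every prompt.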
Related terms
- LLM — the thing being tuned
- RAG — the cheaper-first alternative
- System prompt — another cheaper-first alternative
- Prompt engineering — always try this first
- Embedding — unrelated training, similar tooling
FAQ
Should I fine-tune for better code completion?
Rarely. Base frontier models already outperform most fine-tuned smaller ones for general coding. Fine-tune when you have a specific, repetitive format the base model won't respect even with strong prompting.
How much data do I need?
For LoRA on a narrow task, a few hundred high-quality examples often suffice. For broader behavior changes, thousands to tens of thousands. Data quality dominates quantity.
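Since quality dominates quantity, a hygiene pass over the training set matters more than collecting more examples. A sketch of a minimal cleanup step — the threshold here is an arbitrary illustration, not a recommendation:

```python
def clean(examples, min_completion_chars=10):
    """Drop exact duplicates and degenerate targets from SFT examples."""
    seen, kept = set(), []
    for ex in examples:
        key = (ex["prompt"], ex["completion"])
        if key in seen:
            continue  # exact duplicate
        if len(ex["completion"]) < min_completion_chars:
            continue  # near-empty target teaches the model nothing useful
        seen.add(key)
        kept.append(ex)
    return kept

raw = [
    {"prompt": "p1", "completion": "a real, useful answer"},
    {"prompt": "p1", "completion": "a real, useful answer"},  # duplicate
    {"prompt": "p2", "completion": "ok"},                     # too short
]
print(len(clean(raw)))  # 1
```

Counting "a few hundred examples" only after a pass like this gives a more honest picture of how much signal the dataset actually contains.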