Prompt engineering · 7 min read

7 prompt patterns for generating code

Spec-first, few-shot, test-first and four more patterns, each with an example code prompt.

The quality of code you get from a model depends less on how good the model is than on how well you frame the problem. The same Claude, GPT or Cursor will spit out mediocre boilerplate or exactly what you need, and the difference lives in the three or four sentences at the start. A prompt is not an incantation; it is a contract specification. Every gap you leave, the model fills in for you, and it tends to fill toward the “most typical” solution from its training, not toward your repository.

Below are seven patterns I use daily. Each has a rationale (why it works) and a short example prompt. They are not mutually exclusive, and the best results come from combining them. They are written from the perspective of a developer who treats the model like a very fast, very literal junior.

Pattern 1: spec-first — the spec before the code

Before you ask for an implementation, ask for a plan. Have the model describe function signatures, edge cases, the input/output contract, and only write code after you sign off. It works because a model that plans separately does not blend design decisions with syntax details — and that seam is exactly where most bugs are born. You also get a checkpoint: a cheap-to-fix plan instead of expensive-to-rewrite code.

Example: Before writing any code, describe in bullet points: the function signature, input and output types, 3 edge cases and the error-handling approach. Then stop and wait for my OK.

In practice the most common miss is skipping the “what we are NOT doing” section. State the negative scope explicitly — without it the model throws in caching, retries and logging you never asked for.
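The spec-first request, negative scope included, can be assembled from a small template. This is a minimal sketch; the `SpecRequest` shape and the `buildSpecPrompt` name are made up for illustration and do not come from any library.

```typescript
// Hypothetical shape of a spec-first request; the field names are illustrative.
interface SpecRequest {
  task: string;
  outOfScope: string[]; // the "what we are NOT doing" list
}

// Builds a prompt that asks for a plan only and states the negative scope.
function buildSpecPrompt(req: SpecRequest): string {
  const lines: string[] = [
    `Task: ${req.task}`,
    "Before writing any code, describe in bullet points:",
    "- the function signature",
    "- input and output types",
    "- 3 edge cases",
    "- the error-handling approach",
  ];
  if (req.outOfScope.length > 0) {
    lines.push("Out of scope (do NOT add):");
    for (const item of req.outOfScope) {
      lines.push(`- ${item}`);
    }
  }
  lines.push("Then stop and wait for my OK.");
  return lines.join("\n");
}

const specPrompt = buildSpecPrompt({
  task: "parse a Polish-format date string",
  outOfScope: ["caching", "retries", "logging"],
});
console.log(specPrompt);
```

Keeping the out-of-scope list as data makes it harder to forget: an empty array is a visible decision rather than an omission.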

Pattern 2: few-shot on your repo conventions

The model does not know your style until you show it. Paste 1–2 existing functions from your code and say: “follow this convention exactly”. It works because the model is excellent at imitating an in-context pattern — far better at “do it like this” than at “follow our styleguide documented somewhere else”. Naming, error returns, import layout, tests — it picks all of that up from the example.

Example: Here is how we write handlers in this project [paste a sample handler]. Write the createInvoice handler in the identical style: same Zod validation pattern, same Result structure, same naming conventions.

Two good examples beat ten sentences of description. If you have an exemplary file in the repo — that is your best prompt.
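For illustration, the pasted sample might look like the sketch below. The `Result` type, the `getInvoice` handler and the in-memory store are hypothetical, and the hand-written check stands in for a Zod schema so that the snippet runs without dependencies.

```typescript
// Hypothetical project convention: handlers never throw, they return a Result.
type Result<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

interface Invoice {
  id: string;
  amountCents: number;
}

// Stand-in data source for the example.
const invoices = new Map<string, Invoice>([
  ["inv_1", { id: "inv_1", amountCents: 12500 }],
]);

// The convention the model should copy: validate input first, return early
// with an error Result, and name handlers verb + noun.
function getInvoice(input: unknown): Result<Invoice> {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "input must be an object" };
  }
  const id = (input as { id?: unknown }).id;
  if (typeof id !== "string" || id.length === 0) {
    return { ok: false, error: "id must be a non-empty string" };
  }
  const invoice = invoices.get(id);
  if (invoice === undefined) {
    return { ok: false, error: `invoice ${id} not found` };
  }
  return { ok: true, value: invoice };
}

console.log(getInvoice({ id: "inv_1" }));
console.log(getInvoice({ id: 42 }));
```

A model shown this handler has the return type, the early-return validation and the naming to copy when asked for `createInvoice`.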

Pattern 3: test-first — failing tests before the code

Reverse the order: ask first for tests describing the desired behaviour, and only then for the implementation that satisfies them. It works because tests are an unambiguous contract: the model cannot “almost” satisfy an assertion. You also get verifiability: after the code is generated, your first check is running the tests, not a line-by-line read.

Example: First write a test suite for parsePolishDate, including malformed formats and leap years. The tests must start red. Show them to me, and after I approve, write the implementation that passes all of them.

The trap: the model likes to “fix” the test rather than the code when the implementation fails. Add: “do not change the tests without my approval”.
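The test-first order can be sketched as a table of cases followed by an implementation. The format (`DD.MM.YYYY`) and the return shape are assumptions made for this example; the prompt above does not define what `parsePolishDate` accepts or returns.

```typescript
// Assumption: a "Polish date" is DD.MM.YYYY and invalid input yields null.
type ParsedDate = { year: number; month: number; day: number } | null;

// Step 1: the contract, written before any implementation exists.
const dateCases: Array<[string, ParsedDate]> = [
  ["15.08.2024", { year: 2024, month: 8, day: 15 }],
  ["29.02.2024", { year: 2024, month: 2, day: 29 }], // leap year
  ["29.02.2023", null], // not a leap year
  ["29.02.1900", null], // century years are leap only if divisible by 400
  ["31.04.2024", null], // April has 30 days
  ["2024-08-15", null], // wrong format
  ["1.8.2024", null], // missing zero padding
  ["", null],
];

// Step 2: the implementation, written only after the cases are approved.
function parsePolishDate(text: string): ParsedDate {
  const match = /^(\d{2})\.(\d{2})\.(\d{4})$/.exec(text);
  if (match === null) return null;
  const day = Number(match[1]);
  const month = Number(match[2]);
  const year = Number(match[3]);
  // Round-trip through Date.UTC: an impossible day or month rolls over into
  // a different date, so the fields no longer match. Years 0000-0099 are
  // rejected too, because Date.UTC maps them to 1900-1999.
  const probe = new Date(Date.UTC(year, month - 1, day));
  const valid =
    probe.getUTCFullYear() === year &&
    probe.getUTCMonth() === month - 1 &&
    probe.getUTCDate() === day;
  return valid ? { year, month, day } : null;
}

// Returns the inputs whose result differs from the expected value.
function failingCases(parse: (text: string) => ParsedDate): string[] {
  const failures: string[] = [];
  for (const [input, expected] of dateCases) {
    if (JSON.stringify(parse(input)) !== JSON.stringify(expected)) {
      failures.push(input);
    }
  }
  return failures;
}

console.log(failingCases(parsePolishDate));
```

Running `failingCases` against an empty stub such as `() => null` reports the two valid dates as failures, which is the red state the prompt asks for.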

Pattern 4: explicit constraints — stack, no new deps, file scope

State plainly what the model may NOT do. Language version, framework, “no new npm packages”, “edit only these two files”. It works because by default the model reaches for the most popular libraries from its training: it will add lodash, axios or moment even though you have native fetch and your own utils. Constraints narrow the solution space to the one that fits your repo.

Example: Constraints: TypeScript strict, React Server Components, no new dependencies (use native fetch), change only the file api/users.ts. Do not touch any config.

File scope is not a detail. Without it the model “fixes things in passing” in neighbouring modules and you get a half-screen diff instead of a five-line one.
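File scope can also be checked mechanically once the model replies with a diff. The sketch below assumes git-style output with `diff --git a/... b/...` header lines and paths without spaces; `touchedFiles` and `outOfScopeFiles` are illustrative names, not part of git or any library.

```typescript
// Collects every path named in the "diff --git a/<old> b/<new>" headers.
// Assumption: git-format diff, paths without spaces.
function touchedFiles(diff: string): string[] {
  const files = new Set<string>();
  for (const line of diff.split(/\r?\n/)) {
    const match = /^diff --git a\/(\S+) b\/(\S+)$/.exec(line);
    if (match === null) continue;
    files.add(match[1]); // old path
    files.add(match[2]); // new path (differs from the old one on renames)
  }
  return Array.from(files);
}

// Returns the files the diff touches that are not on the allow list.
function outOfScopeFiles(diff: string, allowed: string[]): string[] {
  return touchedFiles(diff).filter((file) => !allowed.includes(file));
}

const sampleDiff = [
  "diff --git a/api/users.ts b/api/users.ts",
  "--- a/api/users.ts",
  "+++ b/api/users.ts",
  "@@ -1 +1 @@",
  "-const limit = 10;",
  "+const limit = 20;",
  "diff --git a/tsconfig.json b/tsconfig.json",
  "--- a/tsconfig.json",
  "+++ b/tsconfig.json",
  "@@ -1 +1 @@",
  '-  "strict": false',
  '+  "strict": true',
].join("\n");

console.log(outOfScopeFiles(sampleDiff, ["api/users.ts"]));
```

A non-empty result is a reason to reject the reply and repeat the constraint, before anyone reviews the change itself.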

Pattern 5: role + output format — “return a unified diff only”

Set a role and force an exact response format. “You are a reviewer”, “return a diff only”, “no prose outside the code”. It works two ways: the role sets a perspective (a reviewer hunts for bugs, an architect thinks about boundaries), and a rigid output format eliminates the wall of prose you would otherwise have to dig through. A diff you apply straight via git apply; a full file you would have to diff by hand.

Example: Act as a senior reviewer. Return ONLY a unified diff (git format), no explanations, no fenced block. If no change is needed, return an empty diff.

“Diff only” has one more side effect: the model rewrites the whole file less often, because it has to show the minimal change. You get smaller, more reviewable PRs.
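A rigid format is also easy to verify before you hand the reply to `git apply`. The check below is a rough sketch, not a diff parser: `checkDiffOnly` is a hypothetical helper, and its list of accepted line prefixes covers only the common git headers.

```typescript
// Three backticks, built at runtime so this snippet contains no literal fence.
const FENCE = "`".repeat(3);

// Prefixes a line of a git-format unified diff may start with (not exhaustive).
const DIFF_LINE =
  /^(diff --git |index |@@ |[ +\-\\]|(new|deleted) file mode |(old|new) mode |similarity index |rename (from|to) )/;

// Returns a list of format violations; an empty list means the reply passes.
function checkDiffOnly(reply: string): string[] {
  const problems: string[] = [];
  const lines = reply.split(/\r?\n/).filter((line) => line !== "");
  if (lines.length === 0) return problems; // empty diff: no change needed
  const first = lines[0];
  if (!first.startsWith("diff --git ") && !first.startsWith("--- ")) {
    problems.push("does not start with a diff header");
  }
  for (const line of lines) {
    if (line.startsWith(FENCE)) {
      problems.push("contains a fenced block");
    } else if (!DIFF_LINE.test(line)) {
      problems.push(`prose line: ${line}`);
    }
  }
  return problems;
}

const cleanReply = [
  "diff --git a/api/users.ts b/api/users.ts",
  "--- a/api/users.ts",
  "+++ b/api/users.ts",
  "@@ -1 +1 @@",
  "-const limit = 10;",
  "+const limit = 20;",
].join("\n");

const chattyReply = ["Sure, here is the fix:", FENCE + "diff", cleanReply, FENCE].join("\n");

console.log(checkDiffOnly(cleanReply));
console.log(checkDiffOnly(chattyReply));
```

When the check fails, resending the format instruction is usually cheaper than cleaning the reply by hand.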

Pattern 6: chain-of-thought for tricky logic — “reason first, then implement”

For a non-trivial algorithm (concurrency, time zones, custom parsing) have the model lay out its reasoning step by step before writing code. It works because the model generates token by token: if it first “thinks through” the problem in prose, the subsequent code tokens build on that explicit reasoning rather than on the first association. For simple CRUD it is overkill; for logic loaded with edge cases it makes a real difference.

Example: This is tricky. First walk through step by step how to handle the summer/winter clock change and zones without DST, list the pitfalls, and ONLY THEN write the implementation.

Bonus: when the reasoning is explicit, it is easy to point at the wrong step and fix it, instead of debugging finished, silent code.
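One pitfall such a walkthrough should surface: adding 24 hours is not the same as “same time tomorrow” across a clock change. The sketch below uses the standard `Intl.DateTimeFormat` API and assumes a runtime with full time zone data (current Node.js and browsers); `wallClock` is an illustrative helper name.

```typescript
// Formats an instant as "YYYY-MM-DD HH:mm" wall-clock time in an IANA zone.
function wallClock(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-GB", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")} ${get("hour")}:${get("minute")}`;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// 11:00 UTC on 30 March 2024 is noon in Warsaw (UTC+1), the day before the
// spring clock change on 31 March 2024.
const saturdayNoon = new Date(Date.UTC(2024, 2, 30, 11, 0));
const plus24h = new Date(saturdayNoon.getTime() + DAY_MS);

// Warsaw switches to UTC+2 overnight, so the wall clock moves by 25 hours.
console.log(wallClock(saturdayNoon, "Europe/Warsaw"));
console.log(wallClock(plus24h, "Europe/Warsaw"));

// Tokyo has no DST, so the same 24 hours leave the wall-clock time unchanged.
console.log(wallClock(saturdayNoon, "Asia/Tokyo"));
console.log(wallClock(plus24h, "Asia/Tokyo"));
```

If the model's reasoning never mentions this difference, the implementation that follows is unlikely to handle it.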

Pattern 7: iterative refinement — “critique your own code”

After the first version do not write “fix it”. Write: “review your own code like a harsh reviewer, list the 3 weakest spots, then fix them”. It works because the model in critic mode evaluates differently than in generation mode — it sees missing error handling, a resource leak, a race condition it missed while it was “just writing”. It is a cheap second pass over the same context.

Example: Put on the hat of a demanding reviewer and find problems in your own code: security, edge cases, performance. List them, then return the corrected version.

Cap the iterations at two or three. After the third round the model spins in place more often than it actually improves — at that point it is better to go back to Pattern 1 and rewrite the spec.
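The cap is easy to enforce if the critique loop lives in code. In the sketch below `Ask` is a hypothetical, synchronous stand-in for a model call (a real client would be asynchronous), and the `NO ISSUES` sentinel is an assumption of this example.

```typescript
// Hypothetical model call: takes a prompt, returns the reply text.
type Ask = (prompt: string) => string;

interface Refined {
  code: string;
  rounds: number; // number of critique-and-fix rounds actually performed
}

// Generates code, then runs at most maxRounds critique-and-fix passes.
function refine(ask: Ask, task: string, maxRounds = 2): Refined {
  let code = ask(task);
  let rounds = 0;
  while (rounds < maxRounds) {
    const critique = ask(
      "Review this code like a harsh reviewer. List the 3 weakest spots, " +
        "or reply exactly NO ISSUES.\n\n" + code
    );
    if (critique.trim() === "NO ISSUES") break;
    code = ask(
      "Fix these problems and return the corrected code only.\n\n" +
        `Problems:\n${critique}\n\nCode:\n${code}`
    );
    rounds += 1;
  }
  return { code, rounds };
}

// Stub model that always finds something to criticise, to show the cap.
let stubCalls = 0;
const pickyStub: Ask = (prompt) => {
  stubCalls += 1;
  return prompt.startsWith("Review") ? "1. no error handling" : `code v${stubCalls}`;
};

console.log(refine(pickyStub, "write a slugify function", 2));
```

With a critic that never runs out of complaints, the loop still stops after two rounds. That is the point at which to revisit the spec.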

Anti-patterns to avoid

The three most common sins. First: “build me an app” with no context, which gets you the most generic skeleton in existence. Second: dumping the whole repo into the context hoping “the model will figure it out”; the relevant signal drowns in noise and quality drops. Third: accepting code you do not understand. If you cannot review the output, no prompt will save you. A prompt speeds up writing; it does not exempt you from thinking.

TL;DR

Seven patterns: spec-first (plan before code), few-shot on your conventions, test-first (red tests, then code), explicit constraints (stack, no new deps, file scope), role + output format (diff only), chain-of-thought for tricky logic, iterative self-critique. Combine them. Prompt quality dominates model quality — the same Claude or GPT will hand you boilerplate or exactly your code, depending on how you frame the contract.

7 prompt patterns for generating code | vibecoding.pl