Prompting LLMs for Code: Proven Patterns for Unit Tests and Refactors

Stop guessing what your Large Language Model needs to write clean code. You know the feeling: you ask an AI to "write a function that sorts this list," and it gives you something that looks right but crashes in production or fails your specific edge cases. The problem isn't usually the model's intelligence; it's your prompt's lack of precision.

In 2026, we have moved past the era of casual chat-based coding. The industry standard is now structured, high-fidelity prompting that treats the AI not as a creative partner, but as a precise compiler. Research from recent years has identified specific prompt patterns that reduce the number of iterations required to generate production-ready code by up to 70%. Two shifts matter most: moving from vague requests to structured inputs such as the Context and Instruction pattern, and using Recipe-style prompts that define pre-conditions, post-conditions, and concrete examples. Together they can substantially cut debugging time and hallucination errors.

The Shift from Chat to Specification

Early adopters of AI coding assistants relied heavily on conversational back-and-forth. You’d ask a question, get a partial answer, correct it, ask again, and repeat. This iterative approach is slow, expensive in terms of token usage, and prone to context drift. Modern effective prompting borrows from Test-Driven Development (TDD) principles, where the expected outcome is defined before the implementation begins.

Instead of saying, "Refactor this code to be faster," you provide a specification. This means defining the input types, the expected output format, and the constraints. For example, if you are generating a Python function, your prompt should explicitly state:

  • Input: A list of integers, potentially containing duplicates.
  • Output: A sorted list with unique values only.
  • Constraint: Must run in O(n log n) time.
  • Edge Case: Handle empty lists gracefully without throwing exceptions.

This specificity forces the model to align its generation process with your exact requirements, rather than relying on its statistical best guess. Studies using benchmarks like BigCodeBench show that prompts including these explicit specifications result in significantly higher pass rates for automated unit tests compared to natural language descriptions alone.
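As a rough illustration, the four-line specification above could plausibly lead to something like the following Python sketch. The function name `sorted_unique` is an assumption made for this example; the specification itself does not name the function.

```python
def sorted_unique(values: list[int]) -> list[int]:
    """Return the distinct integers from `values` in ascending order.

    Mirrors the specification in the prompt:
      - Input: a list of integers, possibly containing duplicates.
      - Output: a sorted list with unique values only.
      - Constraint: O(n log n) time.
      - Edge case: an empty list yields an empty list, with no exception.
    """
    # set() drops duplicates in O(n) on average; sorted() is O(n log n).
    return sorted(set(values))


if __name__ == "__main__":
    print(sorted_unique([3, 1, 3, 2, 1]))  # [1, 2, 3]
    print(sorted_unique([]))               # []
```

Each bullet of the specification maps to something you can check: the duplicates, the ordering, and the empty-list case all become one-line assertions.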

Pattern 1: Context and Instruction for Refactoring

When refactoring existing code, clarity is king. The most effective pattern here is the "Context and Instruction" framework. This involves providing the AI with three distinct blocks of information:

  1. The Context: What does this code do? Where does it live in the application? Who calls it?
  2. The Current Code: The actual snippet you want to change.
  3. The Instruction: Specific, actionable directives for the refactor.

Consider a scenario where you have a messy JavaScript utility function. A weak prompt would be: "Clean this up." A strong Context and Instruction prompt looks like this:

// CONTEXT: This function runs in a browser environment and is called every time a user scrolls.
// It checks if an element is visible. Performance is critical here.

// CURRENT CODE:
function checkVisibility(element) {
  const rect = element.getBoundingClientRect();
  return rect.top >= 0 && rect.left >= 0 && rect.bottom <= (window.innerHeight || document.documentElement.clientHeight) && rect.right <= (window.innerWidth || document.documentElement.clientWidth);
}

// INSTRUCTION: 
// 1. Rename the function to 'isElementInViewport'.
// 2. Use modern ES6+ syntax.
// 3. Add JSDoc comments explaining the parameters and return value.
// 4. Ensure no global variables are accessed directly; assume 'window' is passed as an optional second argument.

By separating context from instruction, you prevent the model from making assumptions about your architecture. You also guide it toward specific stylistic choices (ES6+, JSDoc) that match your team’s standards. This pattern reduces the need for follow-up corrections because the AI understands not just *what* to change, but *why* and *how* it fits into the larger system.
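If your team writes many prompts of this shape, the three blocks can be assembled by a small helper so that no block is forgotten. The sketch below is one possible way to do it; `build_refactor_prompt` and its parameter names are hypothetical and not part of any library.

```python
def build_refactor_prompt(context: str, current_code: str, instructions: list[str]) -> str:
    """Assemble a Context / Current Code / Instruction prompt as plain text."""
    if not instructions:
        # A refactor prompt without directives falls back to "clean this up".
        raise ValueError("at least one instruction is required")
    # Number the directives so follow-up feedback can refer to them by index.
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(instructions, start=1))
    return (
        f"CONTEXT:\n{context.strip()}\n\n"
        f"CURRENT CODE:\n{current_code.strip()}\n\n"
        f"INSTRUCTION:\n{numbered}\n"
    )


if __name__ == "__main__":
    print(build_refactor_prompt(
        context="Runs in the browser on every scroll event; performance is critical.",
        current_code="function checkVisibility(element) { /* ... */ }",
        instructions=[
            "Rename the function to 'isElementInViewport'.",
            "Add JSDoc comments explaining the parameters and return value.",
        ],
    ))
```

Keeping the blocks in a fixed order also makes prompts easier to compare when a refactor goes wrong.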

Pattern 2: Recipe Prompts for Unit Test Generation

Generating unit tests is one of the highest-value uses of LLMs, but also one where models frequently fail if not guided correctly. The "Recipe" pattern works best here. Think of it as giving the AI a cooking recipe: ingredients (inputs), steps (logic), and the final dish (expected output).

A Recipe prompt for unit tests includes:

  • Pre-conditions: What state must the system be in before the test runs?
  • Action: Which function is being tested, and with what arguments?
  • Post-conditions: What assertions must pass? What side effects are expected?
  • Examples: Provide one working example of a test case in your preferred framework (e.g., Jest, Pytest, JUnit).

For instance, if you are testing a payment processing module in Java, your prompt might look like this:

// TASK: Generate unit tests for the 'processPayment' method.

// PRE-CONDITIONS:
// - The 'PaymentGateway' mock must be initialized.
// - The 'UserAccount' object must have sufficient balance.

// ACTION:
// Call 'paymentService.processPayment(user, amount, currency)'.

// POST-CONDITIONS:
// - Return type should be 'TransactionResult'.
// - If successful, 'TransactionResult.status' must be 'COMPLETED'.
// - If insufficient funds, throw 'InsufficientFundsException'.

// EXAMPLE TEST CASE:
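@Test
void testSuccessfulPayment() {
    // Arrange: an account with sufficient balance
    // (the PaymentGateway mock is assumed to be initialized in test setup)
    UserAccount user = new UserAccount("123", 100.0);
    double amount = 50.0;
    // Act
    TransactionResult result = paymentService.processPayment(user, amount, "USD");
    // Assert (COMPLETED is assumed to be a statically imported status constant)
    assertEquals(COMPLETED, result.getStatus());
}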

This structure eliminates ambiguity. The AI doesn’t have to guess which assertion library you prefer or how you handle exceptions. It simply follows the recipe. Research indicates that prompts following this Recipe structure produce test suites that require fewer manual adjustments than those generated from open-ended requests.
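The same recipe can be kept as structured data and rendered into prompt text on demand. The following Python sketch shows one way this might look; the `RecipePrompt` class and its field names are illustrative assumptions, chosen to mirror the four parts listed above.

```python
from dataclasses import dataclass


@dataclass
class RecipePrompt:
    """Hypothetical container for the parts of a Recipe-style test prompt."""
    task: str
    pre_conditions: list[str]
    action: str
    post_conditions: list[str]
    example: str

    def render(self) -> str:
        """Render the recipe as plain text, one labeled section per part."""
        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        return "\n\n".join([
            f"TASK: {self.task}",
            "PRE-CONDITIONS:\n" + bullets(self.pre_conditions),
            f"ACTION:\n{self.action}",
            "POST-CONDITIONS:\n" + bullets(self.post_conditions),
            "EXAMPLE TEST CASE:\n" + self.example.strip(),
        ])


if __name__ == "__main__":
    recipe = RecipePrompt(
        task="Generate unit tests for the 'processPayment' method.",
        pre_conditions=["The 'PaymentGateway' mock must be initialized."],
        action="Call 'paymentService.processPayment(user, amount, currency)'.",
        post_conditions=["If successful, 'TransactionResult.status' must be 'COMPLETED'."],
        example="@Test void testSuccessfulPayment() { /* ... */ }",
    )
    print(recipe.render())
```

Because every field is required, a recipe with a missing section fails at construction time instead of producing a vague prompt.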


Defining Pre-Conditions and Post-Conditions Explicitly

One of the biggest gaps in typical AI prompts is the omission of boundary conditions. Developers often assume the AI knows what "normal" behavior looks like, but LLMs operate on probabilities, not logic. To bridge this gap, you must explicitly define pre-conditions and post-conditions in your prompts.

Pre-conditions describe the valid state of the system before the code executes. This includes data types, nullability, and range limits. For example, instead of saying "handle user input," specify "input is a non-empty string with maximum length 255 characters."

Post-conditions describe the guaranteed state after execution. This includes return values, database changes, API calls made, and error states. For example, "the function must return a boolean true if the record was saved, false otherwise, and never throw an exception for validation errors."

Including these details in your prompt acts as a contract. The model does not formally verify its output against that contract, but stating it puts the constraints directly in the context the model generates from. As a result, the generated code is more likely to respect them, for example by handling a null input explicitly rather than dereferencing it.
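To make the contract idea concrete, here is a minimal Python sketch of code that honors the two example clauses quoted above. The function name `save_record` and the in-memory `store` list (standing in for a real database) are assumptions made purely for illustration.

```python
MAX_NAME_LENGTH = 255  # range limit taken from the example pre-condition


def save_record(name: object, store: list[str]) -> bool:
    """Append `name` to `store` and report whether it was saved.

    Pre-condition: `name` is a non-empty string of at most 255 characters.
    Post-condition: returns True if the record was saved, False otherwise,
    and never raises for validation errors.
    """
    # Violations of the pre-condition are reported via the return value,
    # not via an exception, as the post-condition requires.
    if not isinstance(name, str):
        return False
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    store.append(name)
    return True
```

Each clause of the contract turns into a test: one for the happy path, and one for each way the pre-condition can be violated.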

Reducing Hallucinations with Concrete Examples

Hallucinations in code generation often occur when the model tries to fill in missing information. Providing concrete examples within your prompt, often referred to as few-shot prompting, anchors the model’s output to reality.

If you are asking for a SQL query, don’t just describe the tables. Include a sample row of data and the expected result. If you are asking for a regex pattern, provide three strings that should match and three that should not. This technique leverages the model’s ability to recognize patterns rather than inventing them.

For example, when asking for a Python script to parse logs, include:

  • Sample Input: "[2026-05-28 10:00:00] ERROR: Connection timeout"
  • Expected Output: {"timestamp": "2026-05-28 10:00:00", "level": "ERROR", "message": "Connection timeout"}

This single example teaches the model the exact JSON structure, key names, and parsing logic you expect. It transforms an abstract request into a concrete mapping task, significantly reducing the likelihood of incorrect field names or malformed outputs.
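For reference, a parser consistent with that sample pair might look like the sketch below. The regular expression and the function name `parse_log_line` are assumptions; the single sample only pins down the bracketed timestamp, the uppercase level, and the message after the colon.

```python
import re
from typing import Optional

# Assumed line shape, inferred from the one sample: "[timestamp] LEVEL: message"
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]+): "
    r"(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_log_line("[2026-05-28 10:00:00] ERROR: Connection timeout"))
    # {'timestamp': '2026-05-28 10:00:00', 'level': 'ERROR', 'message': 'Connection timeout'}
```

The sample input and expected output from the prompt double as the first unit test for whatever the model returns.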


Security Considerations in Prompt Design

As AI-assisted development becomes standard, security cannot be an afterthought. Your prompts should explicitly instruct the model to adhere to secure coding practices. This means adding a "Security Constraints" section to your prompt templates.

Common instructions include:

  • "Never hardcode secrets or API keys."
  • "Use parameterized queries to prevent SQL injection."
  • "Validate all user inputs before processing."
  • "Avoid using deprecated cryptographic libraries."

Models are trained on a mix of secure and insecure code, so they can still generate vulnerable snippets when prompted casually. By embedding security rules directly into the prompt, you create a first line of defense. Additionally, always review AI-generated code for logical flaws, as models may miss subtle business logic vulnerabilities that aren’t covered in their training data.
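As an example of what the "parameterized queries" instruction should produce, here is a short Python sketch using the standard-library `sqlite3` module. The `users` table, its columns, and the function name `find_user_by_email` are hypothetical.

```python
import sqlite3


def find_user_by_email(conn: sqlite3.Connection, email: str):
    """Look up one user row by email using a parameterized query."""
    # The "?" placeholder makes the driver bind `email` as a value, so the
    # input is never spliced into the SQL text itself.
    cursor = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cursor.fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users (email) VALUES (?)", ("a@example.com",))
    print(find_user_by_email(conn, "a@example.com"))  # (1, 'a@example.com')
    print(find_user_by_email(conn, "' OR '1'='1"))    # None
```

When reviewing generated code, a query built with string concatenation or an f-string is a quick signal that the security constraint was ignored.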

Implementing a Prompt Improvement Workflow

Effective prompting is not a one-time skill; it’s an iterative practice. Adopt a workflow similar to code review for your prompts:

  1. Draft: Write your initial prompt using Context and Instruction or Recipe patterns.
  2. Test: Run the AI-generated code against your unit tests. Does it pass?
  3. Analyze Failures: If it fails, identify why. Was the input ambiguous? Were edge cases missing? Was the security constraint ignored?
  4. Refine: Update the prompt to address the failure. Add more specific pre-conditions, clarify examples, or tighten constraints.
  5. Standardize: Save successful prompt templates in a shared repository for your team.

This cycle ensures that your prompts evolve alongside your codebase. Over time, you’ll build a library of highly optimized prompts for common tasks like generating CRUD operations, writing integration tests, or refactoring legacy modules. This collective knowledge base accelerates development and maintains consistency across projects.
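The Test and Analyze Failures steps can be partly automated. The sketch below is one possible harness: it loads generated source code and reports which cases fail. The name `run_checks` and the shape of `cases` are assumptions. Note that `exec` runs arbitrary code, so a harness like this is assumed to run only inside a disposable sandbox.

```python
def run_checks(generated_source: str, function_name: str, cases) -> list[str]:
    """Load generated code and return a list of failure descriptions.

    `cases` is an iterable of (args_tuple, expected_value) pairs.
    An empty result means every case passed.
    """
    namespace: dict = {}
    try:
        # WARNING: exec runs untrusted code; use a sandbox in practice.
        exec(generated_source, namespace)
    except Exception as exc:
        return [f"code did not load: {exc!r}"]

    func = namespace.get(function_name)
    if not callable(func):
        return [f"function {function_name!r} not found"]

    failures = []
    for args, expected in cases:
        try:
            actual = func(*args)
        except Exception as exc:
            failures.append(f"{function_name}{args!r} raised {exc!r}")
            continue
        if actual != expected:
            failures.append(
                f"{function_name}{args!r} returned {actual!r}, expected {expected!r}"
            )
    return failures
```

The failure messages feed directly into the Refine step: a case that raised points to a missing pre-condition, while a wrong return value points to an ambiguous post-condition or example.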

What is the difference between Chain-of-Thought and structured prompting for code?

Chain-of-Thought (CoT) asks the model to explain its reasoning step-by-step, which can improve accuracy but increases token cost and latency. Structured prompting (like Context and Instruction) provides clear constraints and examples upfront, which tends to produce direct, accurate code with fewer output tokens and less room for hallucination. For production code, structured prompts are generally more efficient and predictable.

How do I ensure the AI writes code compatible with my specific framework version?

Explicitly state the framework and version in your prompt's context section. For example, "Using React 18 with functional components and hooks." Also, provide a small code snippet from your existing project as an example. This anchors the model to your specific syntax and conventions, preventing it from using outdated patterns like class components if you’re on a newer version.

Can I use these patterns for debugging existing code?

Yes. Use the Context and Instruction pattern. Provide the buggy code, the error message you’re seeing, and the expected behavior. Ask the AI to identify the root cause and suggest a fix. Including the stack trace and relevant surrounding code helps the model pinpoint the issue more accurately than just describing the bug in words.

Why are pre-conditions and post-conditions important in prompts?

They act as a contract for the AI. Pre-conditions define what the code can assume is true before it runs (e.g., inputs are not null). Post-conditions define what must be true after it runs (e.g., returns a specific type). Without these, the AI might generate code that assumes different defaults, leading to runtime errors or unexpected behavior in edge cases.

How do I handle large codebases when prompting for refactors?

Don’t paste the entire codebase. Focus on the specific module or function you want to refactor. Provide enough context about dependencies (e.g., "This function calls 'AuthService.getUser()' which returns a Promise"). If the refactor affects multiple files, break it down into smaller, targeted prompts for each file, ensuring consistency by referencing the same interface definitions in each prompt.
