AI · Prompt Engineering

Prompt Engineering is a Real Skill - Here's What Actually Makes a Good Prompt

A practical, no-hype breakdown of what separates a useful prompt from a useless one, with real examples from production systems

7 min read ~1,400 words By Anishek Kamal
The four layers of a production prompt: persona, rules, output format, few-shot examples.

Every time I hear someone say “prompt engineering isn’t a real skill,” I know they haven’t tried to build a production LLM system. Writing prompts that work reliably across thousands of inputs, in a system where failures are expensive, is genuinely hard.

I’ve written prompts for customer-facing chatbots at Fortune 500 companies, for internal research agents at Microsoft, and for my own products. The difference between a prompt written by someone who understands the craft and one written by someone who doesn’t is often the difference between a feature that ships and one that gets cut.

What a bad prompt actually looks like

Here’s a prompt I see constantly:

You are a helpful assistant. Answer questions about our product.

That’s not a prompt - that’s an abdication of responsibility. It leaves almost every important decision to the model, and the model will make inconsistent decisions.

When this kind of prompt fails, it fails in every direction. Too verbose sometimes, too terse other times. Answers questions it shouldn’t. Refuses questions it should answer. Invents information it doesn’t have. The model isn’t broken - the instruction is.

A prompt is not a suggestion. It’s a specification. The more precisely you define the behavior you want, the more consistently you get it.

The principles that actually matter

1. Define the persona with specificity

“Helpful assistant” is not a persona. This is:

You are a senior customer success manager at a B2B SaaS company.
You have deep product knowledge but you never invent features that
don't exist. When you don't know something, you say so and offer to
connect the user with a specialist.

That’s a persona. It tells the model who it is, what it knows, and critically - how to behave at the edges of its knowledge.

2. Explicit output format beats implicit hope

If you need the model to return structured data, specify the structure. If you need a response under 100 words, say that explicitly. If you need numbered steps, show an example. Models are excellent at pattern-matching format - but only if you give them a pattern to match.

Give it a contract, not a wish:

{
  "answer": "string (<= 80 words, plain prose, no markdown)",
  "sources": ["string (doc_id)", "..."],
  "confidence": "low | medium | high",
  "followup_needed": "boolean"
}

A prompt that asks for this shape and shows one example of it will return it 99% of the time. A prompt that vaguely requests “a structured answer” will not.
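The payoff of a contract is that you can enforce it downstream. Here's a minimal validation sketch against the contract above; the `validate_response` helper is illustrative, not from any particular codebase:

```python
import json

# Mirrors the contract shown in the prompt above.
REQUIRED_KEYS = {"answer", "sources", "confidence", "followup_needed"}
CONFIDENCE_LEVELS = {"low", "medium", "high"}

def validate_response(raw: str) -> dict:
    """Parse the model's reply and enforce the contract from the prompt."""
    data = json.loads(raw)  # raises ValueError if the model broke JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["confidence"] not in CONFIDENCE_LEVELS:
        raise ValueError(f"bad confidence value: {data['confidence']}")
    if len(data["answer"].split()) > 80:
        raise ValueError("answer exceeds 80 words")
    return data

reply = ('{"answer": "Yes, exports support CSV.", "sources": ["doc_42"], '
         '"confidence": "high", "followup_needed": false}')
result = validate_response(reply)
```

When validation fails, you log the raw output and feed it back into prompt iteration - the failure is a prompt bug, not a parsing bug.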

3. Edge case handling belongs in the prompt, not in your code

I’ve seen engineers write elaborate post-processing code to handle cases where the LLM returns something unexpected. That post-processing is usually a sign that the prompt doesn’t handle the edge case. Define what the model should do when the user asks something off-topic, when the context is ambiguous, when the answer is “I don’t know.” Put it in the prompt. This becomes doubly important once you wrap the prompt inside an agent loop, where every bad output is an input to the next step.
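As a sketch of what explicit edge-case rules look like when they live in the prompt, here's a fragment appended to a system prompt - the wording is invented for illustration, not taken from a real production system:

```python
# Illustrative edge-case rules baked into the system prompt rather than
# handled by post-processing code. Each rule names a situation and the
# exact behavior the model should produce in it.
EDGE_CASE_RULES = """\
If the question is off-topic, say it is outside what you can help with \
and suggest the user contact support.
If the retrieved context is ambiguous or contradictory, say which parts \
conflict and ask exactly one clarifying question.
If you do not know the answer, say "I don't know" - never guess or \
invent product details.
"""

SYSTEM_PROMPT = (
    "You are a senior customer success manager at a B2B SaaS company.\n"
    + EDGE_CASE_RULES
)
```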

4. Few-shot examples are worth a hundred instructions

If I want a model to respond in a particular tone and format, the fastest path is showing it two or three examples of that tone and format. “Good response: [example]. Bad response: [example].” Models generalize from examples better than they comply with abstract rules.
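In a chat-style API, few-shot examples are just fabricated conversation turns placed before the real question. A sketch in the common role/content message format (the questions and answers here are invented for illustration):

```python
# Few-shot pairs demonstrate tone, length, and edge-of-knowledge behavior
# before the model ever sees the real question.
messages = [
    {"role": "system",
     "content": "You answer billing questions in at most two sentences, plain prose."},
    # Pair 1: demonstrates the tone and length we want.
    {"role": "user", "content": "Can I get a refund after 30 days?"},
    {"role": "assistant",
     "content": "Refunds are available within 30 days of purchase. "
                "After that, I can connect you with billing to discuss options."},
    # Pair 2: demonstrates how to behave when the answer is unknown.
    {"role": "user", "content": "Do you support wire transfers in Brazil?"},
    {"role": "assistant",
     "content": "I don't know the answer to that. "
                "Let me connect you with a payments specialist."},
    # The real question goes last.
    {"role": "user", "content": "What cards do you accept?"},
]
```

Two pairs are usually enough; the point is pattern, not coverage.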

Production-specific considerations

  • Prompt versioning matters - treat prompts like code and track changes with the same rigor
  • Test your prompts against adversarial inputs, not just happy paths - users will try to jailbreak, confuse, or edge-case your system
  • Temperature is a prompt parameter too - most production prompts should run at low temperature for consistency
  • Context window management is a craft - what you include and exclude from the context shapes the output as much as the instruction itself
  • Evaluate outputs at scale - a prompt that passes 10 manual test cases can still fail badly on input #500; you need automated evaluation
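The last point - automated evaluation - can start very small. A minimal harness that measures what fraction of test cases satisfy the output contract; `call_model` is a hypothetical stand-in for whatever LLM client you use:

```python
import json

def call_model(prompt: str, case: str) -> str:
    """Hypothetical stand-in for your LLM call - replace with your client."""
    return '{"answer": "stub", "confidence": "medium"}'

def passes_contract(raw: str) -> bool:
    """True if the raw output parses as JSON and has the required keys."""
    try:
        data = json.loads(raw)
    except ValueError:
        return False
    return {"answer", "confidence"} <= data.keys()

def eval_prompt(prompt: str, cases: list[str]) -> float:
    """Return the fraction of cases whose output satisfies the contract."""
    passed = sum(passes_contract(call_model(prompt, c)) for c in cases)
    return passed / len(cases)
```

Run it on every prompt revision; a pass rate that drops between versions is a regression, same as a failing unit test.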

The meta-skill

The best prompt engineers I know share one trait: they’re extremely good at specifying what they want. Not just for AI - in communication generally. A clear prompt is a clear thought made visible. The engineering part is iteration - writing, testing, observing failure modes, revising.

If you’re building AI systems and want to pressure-test your prompts or talk through prompt engineering strategy for your specific use case, bring it to an architecture session. Real problems, real examples.


Want to talk through this?

Book a session and let's get into your specific situation. No slides, no fluff.

Book a Session