December 2025 Update: Covers chain-of-thought, few-shot learning, system prompts, and the latest prompting techniques from OpenAI and Anthropic research.

Why Prompts Matter

If the model is the engine, the prompt is the steering wheel. The same GPT-4o model that gives a confused, rambling answer to a vague question will give a precise, well-structured answer to a well-crafted prompt. This is not a minor difference — it is the difference between a product that works and one that doesn’t. The difference between a junior and senior AI engineer often comes down to prompt engineering. A well-crafted prompt can:
  • Turn a $0.10 GPT-4o call into a $0.001 GPT-4o-mini call (a simpler model with a great prompt often beats a powerful model with a bad one)
  • Reduce hallucinations substantially (by constraining the model’s output space)
  • Get structured, predictable outputs every time (making your parser happy)
The 80/20 Rule: 80% of prompt quality comes from clear instructions and examples. The remaining 20% is advanced techniques.

The Anatomy of a Great Prompt

┌─────────────────────────────────────────────────────────────┐
│                      SYSTEM PROMPT                          │
│  • Role/Persona definition                                  │
│  • Capabilities and constraints                             │
│  • Output format requirements                               │
│  • Rules and guidelines                                     │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                      FEW-SHOT EXAMPLES                      │
│  • 2-5 input/output pairs                                   │
│  • Cover edge cases                                         │
│  • Show exact format expected                               │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                      USER INPUT                             │
│  • Clear, specific request                                  │
│  • Relevant context included                                │
│  • Output format reminder (optional)                        │
└─────────────────────────────────────────────────────────────┘
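
One way to map the three layers above onto a chat-style API is a list of messages: a system message, few-shot pairs as alternating user/assistant turns, then the real input. This is a minimal sketch; the `build_messages` name and the example strings are illustrative assumptions, not part of any library.

```python
def build_messages(system_prompt: str,
                   examples: list[tuple[str, str]],
                   user_input: str) -> list[dict]:
    """Assemble system prompt, few-shot pairs, and user input in order."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in examples:
        # Each few-shot pair becomes a user turn followed by the ideal reply
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_messages(
    system_prompt="You classify review sentiment. Reply with one word.",
    examples=[("This product is amazing!", "positive"),
              ("Terrible experience, want refund", "negative")],
    user_input="It's okay, nothing special",
)
```

The resulting list has the shape expected by the `messages` argument of a chat completion call, so the prompt structure stays visible in code instead of being buried in one long string.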

System Prompts: Your AI’s DNA

The system prompt is the most important piece of text in your entire application. It runs on every single request, shapes every response, and defines the personality, capabilities, and constraints of your AI. Think of it as the job description you give to a new employee — the more specific and clear it is, the better the work you get back. A vague system prompt like “You are a helpful assistant” is like hiring someone and saying “just do good work.”

Basic Structure

SYSTEM_PROMPT = """You are an expert {role} with deep knowledge of {domain}.

## Your Capabilities
- {capability_1}
- {capability_2}
- {capability_3}

## Rules
1. Always {rule_1}
2. Never {rule_2}
3. When uncertain, {uncertainty_behavior}

## Output Format
{format_specification}
"""

Production System Prompt

CODE_REVIEW_PROMPT = """You are a senior software engineer performing code review.

## Your Expertise
- Python, JavaScript, TypeScript, Go
- Clean code principles and SOLID
- Security best practices
- Performance optimization

## Review Process
1. First, understand the code's purpose
2. Check for bugs and logic errors
3. Evaluate code quality and readability
4. Identify security vulnerabilities
5. Suggest performance improvements

## Rules
- Be constructive, not critical
- Explain WHY something is an issue
- Provide specific, actionable fixes
- Praise good patterns when you see them
- If code is good, say so briefly

## Output Format
Return a JSON object:
{
  "summary": "One-line summary of the code quality",
  "issues": [
    {
      "severity": "critical|major|minor|suggestion",
      "line": <line_number or null>,
      "issue": "Description of the problem",
      "fix": "Suggested solution with code"
    }
  ],
  "positive": ["List of things done well"],
  "score": <1-10>
}
"""

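A prompt that requests JSON is only half of the contract; the other half is checking what comes back. The sketch below validates a reply against the format requested by the code review prompt. The function name `validate_review` and the specific checks are assumptions chosen for illustration, and a schema library would be a reasonable alternative.

```python
import json

ALLOWED_SEVERITIES = {"critical", "major", "minor", "suggestion"}


def validate_review(raw: str) -> list[str]:
    """Return a list of problems found in the model's JSON reply (empty if valid)."""
    try:
        review = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"Not valid JSON: {exc.msg}"]
    if not isinstance(review, dict):
        return ["Top-level value must be an object"]

    problems = []
    for key in ("summary", "issues", "positive", "score"):
        if key not in review:
            problems.append(f"Missing key: {key}")

    if "score" in review:
        score = review["score"]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 10:
            problems.append("score must be an integer from 1 to 10")

    issues = review.get("issues", [])
    if not isinstance(issues, list):
        problems.append("issues must be a list")
    else:
        for i, issue in enumerate(issues):
            severity = issue.get("severity") if isinstance(issue, dict) else None
            if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
                problems.append(f"issues[{i}] has an invalid severity")
    return problems
```

If the list is non-empty, one option is to send the problems back to the model and ask for a corrected reply.
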
Few-Shot Learning

Why Few-Shot Works

Few-shot learning exploits the fact that LLMs are strong pattern-matchers. When you show the model 2-5 input/output pairs, it infers the underlying pattern and applies it to new inputs. This is often more effective than paragraphs of written instructions because it is unambiguous: the model can see exactly what you expect rather than interpreting your natural-language description.

The analogy: imagine explaining to someone how to tie a specific knot using only words versus showing them three examples. The examples win every time.

2-5 examples can:
  • Define exact output format (the model mimics the structure it sees)
  • Show edge case handling (include one tricky example to prevent common failures)
  • Reduce ambiguity dramatically (examples are specifications, not descriptions)

Few-Shot Template

def create_few_shot_prompt(task: str, examples: list[dict], query: str) -> str:
    prompt = f"Task: {task}\n\n"
    prompt += "Examples:\n"
    
    for i, ex in enumerate(examples, 1):
        prompt += f"\nExample {i}:\n"
        prompt += f"Input: {ex['input']}\n"
        prompt += f"Output: {ex['output']}\n"
    
    prompt += f"\nNow complete this:\nInput: {query}\nOutput:"
    return prompt

# Example: Sentiment Analysis
examples = [
    {"input": "This product is amazing!", "output": "positive"},
    {"input": "Terrible experience, want refund", "output": "negative"},
    {"input": "It's okay, nothing special", "output": "neutral"},
    {"input": "Love the design but shipping was slow", "output": "mixed"},
]

prompt = create_few_shot_prompt(
    task="Classify the sentiment of the review",
    examples=examples,
    query="Best purchase I've made this year, highly recommend!"
)

Few-Shot Edge Cases

Edge case: example ordering matters. Models exhibit recency bias, so the last example in your few-shot set tends to have the most influence. If your examples include one edge case and two normal cases, put the edge case last. Conversely, if your examples are imbalanced (4 positive, 1 negative), the model will be biased toward positive classification.

Edge case: examples that are too similar. If all your few-shot examples are short sentences about weather, the model may infer that it should only produce short weather-related outputs. Include diverse examples that vary in length, topic, and complexity to teach the model the general pattern rather than a narrow one.

Edge case: when few-shot hurts. For very simple tasks or when you need maximum output diversity, few-shot examples can be counterproductive because the model over-indexes on the specific patterns in your examples. If you notice the model parroting your examples too closely, reduce to 1-2 examples or switch to zero-shot with explicit instructions.
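
The ordering and balance issues above can be checked mechanically before a prompt ships. This is a rough sketch that assumes examples are dicts with `input` and `output` keys, as in the template earlier; the helper names and the `max_ratio` threshold of 2.0 are arbitrary assumptions to tune for your task.

```python
from collections import Counter


def label_counts(examples: list[dict]) -> Counter:
    """Count how often each output label appears in the few-shot set."""
    return Counter(ex["output"] for ex in examples)


def is_imbalanced(examples: list[dict], max_ratio: float = 2.0) -> bool:
    """Flag sets where the most common label outnumbers the rarest by more than max_ratio."""
    counts = label_counts(examples)  # assumes a non-empty example list
    return max(counts.values()) > max_ratio * min(counts.values())


def edge_cases_last(examples: list[dict], edge_labels: set[str]) -> list[dict]:
    """Move examples with an edge-case label to the end, keeping the rest in order."""
    # sorted() is stable and False sorts before True, so normal cases keep their order
    return sorted(examples, key=lambda ex: ex["output"] in edge_labels)


examples = [
    {"input": "Love the design but shipping was slow", "output": "mixed"},
    {"input": "This product is amazing!", "output": "positive"},
    {"input": "Terrible experience, want refund", "output": "negative"},
]
ordered = edge_cases_last(examples, edge_labels={"mixed"})
```

Here `ordered` ends with the "mixed" example, which places the tricky case closest to the query.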

Chain-of-Thought (CoT)

The Problem

LLMs often fail at multi-step reasoning when asked to jump straight to the answer. The model generates one token at a time, and each token gets only a fixed amount of computation. When you ask for just the final answer, the model has to do all of its reasoning within the few tokens of that answer, like asking someone to solve a complex equation in their head without writing anything down.

The Solution

Force the model to “show its work” before answering. Each intermediate token becomes a reasoning step, and the model can attend to its own previous reasoning. This is not a hack — it genuinely improves accuracy because the model gets more computation to work with.
# ❌ Bad: Direct answer
prompt = "What is 23 * 47 + 156 / 4?"

# ✅ Good: Chain of thought
prompt = """What is 23 * 47 + 156 / 4?

Let's solve this step by step:
1. First, calculate 23 * 47
2. Then, calculate 156 / 4
3. Finally, add the results

Show your work:"""
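
For reference, the three steps in the prompt above work out as follows, which gives you a ground-truth value to compare the model's reply against:

```python
# Operator precedence: multiplication and division happen before addition
step_1 = 23 * 47           # 1081
step_2 = 156 / 4           # 39.0
result = step_1 + step_2
print(result)              # 1120.0
```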

Zero-Shot CoT

Append a reasoning cue such as “Let’s think step by step” to any prompt. The helper below uses a close variant of that phrase:
REASONING_SUFFIX = "\n\nLet's approach this step by step:"

def add_cot(prompt: str) -> str:
    return prompt + REASONING_SUFFIX

Structured CoT

COT_PROMPT = """
{question}

## Analysis Framework
1. **Understand**: What is being asked?
2. **Identify**: What information do we have?
3. **Plan**: What steps are needed?
4. **Execute**: Work through each step
5. **Verify**: Does the answer make sense?

## Solution
"""

Advanced Techniques

Self-Consistency

Self-consistency is like asking five experts the same question and going with the majority answer. It exploits the fact that LLMs are non-deterministic at temperature > 0: different “reasoning paths” may lead to different answers, but the correct answer tends to appear more frequently. This technique is particularly powerful for math, logic, and classification tasks where there is a single correct answer. Run the same prompt multiple times and take the majority answer:
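import re
from collections import Counter
from openai import OpenAI

client = OpenAI()

def self_consistent_answer(prompt: str, n: int = 5) -> str:
    """Sample n answers and return the most common final answer"""
    # Ask for a fixed final line: full reasoning text almost never matches
    # verbatim across samples, so voting on raw replies would be meaningless.
    full_prompt = prompt + "\n\nEnd with a final line of the form 'Answer: <answer>'."
    answers = []
    
    for _ in range(n):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": full_prompt}],
            temperature=0.7  # Some randomness needed
        )
        text = response.choices[0].message.content.strip()
        # Vote on the last "Answer:" line, falling back to the whole reply
        matches = re.findall(r"Answer:\s*(.+)", text)
        answers.append(matches[-1].strip() if matches else text)
    
    # Return most common answer
    counter = Counter(answers)
    return counter.most_common(1)[0][0]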
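
The voting step can be separated from the API call and tested offline. The sketch below normalizes answers before counting and also reports how many samples agreed, which can serve as a rough confidence signal. The `majority_vote` name and the normalization rules are assumptions; adapt them to your answer format.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> tuple[str, float]:
    """Return the most common normalized answer and the share of samples that agree."""
    if not answers:
        raise ValueError("answers must not be empty")
    # Normalize so "1120", " 1120 " and "1120." count as the same vote
    normalized = [a.strip().rstrip(".").lower() for a in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)


answer, agreement = majority_vote(["1120", "1120.", " 1120 ", "1081", "1120"])
print(answer, agreement)  # 1120 0.8
```

A low agreement value suggests the prompt is ambiguous or the task is too hard for the model, and may be a cue to escalate rather than trust the vote.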

Prompt Chaining

Prompt chaining is the “divide and conquer” of prompt engineering. Instead of asking one prompt to do everything (research, outline, write, edit), you break the task into steps where each prompt does one thing well. Each step produces output that becomes input for the next step. This works because LLMs excel at focused, well-defined tasks and struggle with vague, multi-step instructions. Break complex tasks into sequential prompts:
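from openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def llm_call(prompt: str) -> str:
    """Assumed helper: send one user message and return the reply text"""
    response = await async_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

async def research_and_write(topic: str) -> str: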
    """Chain: Research → Outline → Write → Edit"""
    
    # Step 1: Research
    research = await llm_call(f"""
    Research the topic: {topic}
    List 5-7 key points with sources.
    """)
    
    # Step 2: Outline
    outline = await llm_call(f"""
    Based on this research:
    {research}
    
    Create a detailed article outline with sections and subsections.
    """)
    
    # Step 3: Write
    draft = await llm_call(f"""
    Write a comprehensive article following this outline:
    {outline}
    
    Use the research for accuracy. Target: 1500 words.
    """)
    
    # Step 4: Edit
    final = await llm_call(f"""
    Edit this article for clarity, flow, and engagement:
    {draft}
    
    Fix any errors. Improve transitions. Make it compelling.
    """)
    
    return final
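
The same pattern generalizes to any number of steps. This is a small synchronous sketch, assuming each step is a template with an `{input}` placeholder and that the model call is passed in as a plain function; `run_chain` and `fake_llm` are hypothetical names, and `fake_llm` only stands in for a real model so the flow can be run offline.

```python
from typing import Callable


def run_chain(steps: list[str], llm: Callable[[str], str], initial_input: str) -> str:
    """Run prompt templates in order, feeding each output into the next step."""
    current = initial_input
    for template in steps:
        # Each template must contain an {input} placeholder
        current = llm(template.format(input=current))
    return current


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: wraps the prompt's first line in brackets."""
    return f"[{prompt.splitlines()[0]}]"


steps = [
    "Research the topic: {input}",
    "Outline an article from this research: {input}",
    "Write a draft following this outline: {input}",
]
result = run_chain(steps, fake_llm, "prompt caching")
```

Passing the model call in as a parameter makes each step easy to test in isolation and lets you log or validate intermediate outputs between steps.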

Role Prompting

Role prompting is one of the simplest techniques that delivers outsized results. By assigning the model a specific expert persona with detailed credentials, you activate the model’s knowledge in that domain and bias its outputs toward that perspective. A generic “review this code” prompt gets generic advice. A “you are a security expert at Google who has found thousands of vulnerabilities” prompt gets security-focused, specific feedback. Assign specific expertise for better outputs:
EXPERT_ROLES = {
    "security": "You are a cybersecurity expert with 15 years of experience at Google. You've reviewed thousands of codebases for vulnerabilities.",
    
    "performance": "You are a performance engineer who optimized systems handling 1M+ requests/second at Netflix. You think in terms of latency percentiles and resource efficiency.",
    
    "architecture": "You are a principal architect who designed microservices at scale for Amazon. You balance pragmatism with technical excellence.",
    
    "ml": "You are a machine learning researcher from DeepMind. You understand both theoretical foundations and practical implementation details."
}

def expert_review(code: str, expertise: str) -> str:
    role = EXPERT_ROLES.get(expertise, "You are a senior software engineer.")
    return f"{role}\n\nReview this code:\n```\n{code}\n```"

Constitutional AI (Self-Critique)

This technique is inspired by Constitutional AI, a training method from Anthropic, and applies the same critique-and-revise idea at inference time, using the model as its own editor. The process is: generate a response, ask the model to critique it against a set of principles, then ask the model to revise based on the critique. It can be surprisingly effective because models are often better at spotting problems in existing text than at avoiding them during initial generation, much as a writer finds editing easier than first-drafting. Have the model critique and improve its own output:
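from openai import OpenAI

client = OpenAI()

def llm_call(prompt: str) -> str:
    """Assumed helper: send one user message and return the reply text"""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def constitutional_response(query: str, principles: list[str]) -> str: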
    # Initial response
    response = llm_call(query)
    
    # Critique against principles
    critique_prompt = f"""
    Original query: {query}
    Response: {response}
    
    Evaluate this response against these principles:
    {chr(10).join(f'- {p}' for p in principles)}
    
    What could be improved?
    """
    critique = llm_call(critique_prompt)
    
    # Revise based on critique
    revision_prompt = f"""
    Original query: {query}
    Original response: {response}
    Critique: {critique}
    
    Provide an improved response addressing the critique.
    """
    
    return llm_call(revision_prompt)

# Usage
principles = [
    "Be helpful and accurate",
    "Avoid harmful content", 
    "Acknowledge uncertainty",
    "Cite sources when possible"
]
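
A variant of the loop above can stop early when the critic has nothing to add, which saves the revision call on responses that are already fine. This sketch assumes the model call is injected as a function and that the critic is told to reply with the sentinel `NO ISSUES`; both the sentinel and the `critique_and_revise` name are illustrative assumptions.

```python
from typing import Callable


def critique_and_revise(query: str, principles: list[str],
                        llm: Callable[[str], str], max_rounds: int = 2) -> str:
    """Generate, then critique and revise until the critic approves or rounds run out."""
    rules = "\n".join(f"- {p}" for p in principles)
    response = llm(query)
    for _ in range(max_rounds):
        critique = llm(
            f"Query: {query}\nResponse: {response}\n\n"
            f"Evaluate the response against these principles:\n{rules}\n\n"
            "If nothing needs to change, reply with exactly: NO ISSUES"
        )
        if critique.strip().upper() == "NO ISSUES":
            break  # Skip the revision call when the critic is satisfied
        response = llm(
            f"Query: {query}\nResponse: {response}\nCritique: {critique}\n\n"
            "Provide an improved response addressing the critique."
        )
    return response
```

With `max_rounds=2` the worst case is five model calls (one draft plus two critique/revise pairs), so keep the cap small for latency-sensitive paths.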

Prompt Templates Library

Summarization

SUMMARIZE_PROMPT = """Summarize the following text in {length} sentences.

Focus on:
- Main arguments/findings
- Key data points
- Actionable conclusions

Text:
{text}

Summary:"""

Data Extraction

EXTRACT_PROMPT = """Extract structured data from this text.

Text: {text}

Extract the following fields (use null if not found):
{fields}

Return as JSON:"""

Classification

CLASSIFY_PROMPT = """Classify the following into one of these categories: {categories}

Guidelines:
{guidelines}

Text: {text}

Category:"""

Translation with Context

TRANSLATE_PROMPT = """Translate the following from {source_lang} to {target_lang}.

Context: {context}
Tone: {tone}
Domain: {domain}

Original: {text}

Translation:"""
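
Templates like these fail with a bare `KeyError` when a placeholder is left unfilled. The sketch below uses the standard library's `string.Formatter` to list a template's fields and report all missing ones at once; the `template_fields` and `render` helpers and the shortened `TEMPLATE` are hypothetical names introduced for illustration.

```python
from string import Formatter


def template_fields(template: str) -> set[str]:
    """Return the placeholder names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render(template: str, **values: str) -> str:
    """Fill a template, raising a clear error if any placeholder is missing."""
    missing = template_fields(template) - set(values)
    if missing:
        raise ValueError(f"Missing template fields: {sorted(missing)}")
    return template.format(**values)


TEMPLATE = "Classify the following into one of these categories: {categories}\n\nText: {text}\n\nCategory:"

prompt = render(TEMPLATE, categories="bug, feature, question", text="App crashes on login")
```

Note that literal braces inside a template, such as a JSON example, must be doubled (`{{` and `}}`) for `str.format` to leave them alone.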

Debugging Prompts

Prompt debugging is an art that improves with practice. The most common mistake is changing too many things at once — you tweak the system prompt, add examples, and change the temperature all in one go, and now you have no idea which change helped (or hurt). The scientific approach: change one variable at a time, test against a consistent set of inputs, and keep a log of what you tried and what happened.
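
The one-variable-at-a-time approach can be turned into a small harness: run each prompt variant against the same inputs and compare pass rates. This is a sketch under stated assumptions: `compare_variants` is a hypothetical helper, templates use an `{input}` placeholder, and the model call and pass criterion are supplied by you.

```python
from typing import Callable


def compare_variants(variants: dict[str, str], inputs: list[str],
                     llm: Callable[[str], str],
                     passes: Callable[[str], bool]) -> dict[str, float]:
    """Run every prompt variant on the same inputs and return each pass rate."""
    results = {}
    for name, template in variants.items():
        passed = sum(passes(llm(template.format(input=text))) for text in inputs)
        results[name] = passed / len(inputs)
    return results


# Two variants that differ in exactly one instruction
variants = {
    "baseline": "Classify the sentiment: {input}",
    "with_length_limit": "Classify the sentiment in one word: {input}",
}
```

Saving the returned dictionary alongside the variant text after each run gives you the log of what you tried and what happened.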

Common Issues and Fixes

| Problem | Cause | Solution | Example Fix |
|---|---|---|---|
| Too verbose | No length constraint | Add “in X sentences” or “max Y words” | “Respond in exactly 3 bullet points, max 20 words each.” |
| Wrong format | Ambiguous instructions | Add few-shot examples showing exact format | Include 2-3 input/output pairs |
| Hallucinations | Asking for unknown facts | Add “If unsure, say ‘I don’t know’” | “Only use information from the provided context. If the answer is not in the context, say ‘Not found in provided documents.’” |
| Inconsistent outputs | High temperature | Set temperature=0 for more deterministic output | Also pass a fixed seed (e.g. seed=42); reproducibility is best-effort, not guaranteed |
| Off-topic responses | Weak system prompt | Add explicit constraints and boundaries | “You ONLY answer questions about Python. For other topics, respond: ‘I can only help with Python questions.’” |
| Ignores instructions | Instructions buried in long context | Move critical instructions to the end (recency bias) | Place format requirements and constraints after the context, not before |
| Refusals on safe content | Overly cautious model | Reframe the request to clarify legitimate intent | “As a cybersecurity educator, explain how SQL injection works so developers can defend against it.” |
| Mixes languages | No language constraint | Explicitly specify output language | “Respond in English regardless of the input language.” |

Technique Selection Guide

Choosing the right prompting technique is as important as writing a good prompt. Use this decision framework:
| Technique | Best For | Cost Impact | When to Avoid |
|---|---|---|---|
| Zero-shot | Simple, well-defined tasks | Lowest (no examples) | When output format is ambiguous |
| Few-shot (2-5 examples) | Format-sensitive tasks, classification | Moderate (examples consume tokens) | When examples do not fit in context window |
| Chain-of-thought | Math, logic, multi-step reasoning | Higher (model produces more tokens) | Simple factual lookups (adds unnecessary latency) |
| Self-consistency | Tasks with a single correct answer | Nx cost (runs N times; 5x at N=5) | Creative tasks with no “correct” answer |
| Prompt chaining | Complex multi-stage workflows | Highest (multiple API calls) | Simple tasks that fit in one prompt |
| Role prompting | Domain-specific expertise | Negligible (a few extra prompt tokens) | When you need the model to be neutral/unbiased |
| Constitutional AI | Safety-critical outputs | About 3x cost (generate + critique + revise) | High-volume, low-risk tasks |
Rule of thumb: Start with zero-shot. If quality is insufficient, add few-shot examples. If reasoning is wrong, add chain-of-thought. Only reach for self-consistency or chaining when simpler techniques fail on your specific task.
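
To make the cost column concrete, the sketch below counts model calls per user request for each technique. It is a rough illustration only: the `calls_per_request` helper and its default values are assumptions, and real cost also depends on tokens per call, which this ignores.

```python
def calls_per_request(technique: str, n_samples: int = 5, chain_steps: int = 4) -> int:
    """Rough number of model calls one user request triggers under each technique."""
    calls = {
        "zero-shot": 1,
        "few-shot": 1,           # one call, but a longer prompt
        "chain-of-thought": 1,   # one call, but a longer completion
        "role": 1,
        "self-consistency": n_samples,
        "prompt-chaining": chain_steps,
        "constitutional": 3,     # generate + critique + revise
    }
    return calls[technique]


print(calls_per_request("self-consistency"))  # 5
```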

Prompt Testing Framework

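from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: llm_call(prompt) -> str is your own helper that sends the prompt
# to a model and returns the reply text.

@dataclass
class PromptTest:
    name: str
    input: str
    expected_contains: Optional[list[str]] = None
    expected_not_contains: Optional[list[str]] = None
    validator: Optional[Callable[[str], bool]] = None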

def test_prompt(prompt_template: str, tests: list[PromptTest]) -> dict:
    results = {"passed": 0, "failed": 0, "details": []}
    
    for test in tests:
        prompt = prompt_template.format(input=test.input)
        response = llm_call(prompt)
        
        passed = True
        errors = []
        
        if test.expected_contains:
            for phrase in test.expected_contains:
                if phrase.lower() not in response.lower():
                    passed = False
                    errors.append(f"Missing: {phrase}")
        
        if test.expected_not_contains:
            for phrase in test.expected_not_contains:
                if phrase.lower() in response.lower():
                    passed = False
                    errors.append(f"Should not contain: {phrase}")
        
        if test.validator and not test.validator(response):
            passed = False
            errors.append("Custom validation failed")
        
        results["passed" if passed else "failed"] += 1
        results["details"].append({
            "test": test.name,
            "passed": passed,
            "errors": errors
        })
    
    return results
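
The `validator` field accepts any function from response text to `bool`. Two validators that tend to be useful are sketched below: a word limit and a JSON check. The names `max_words` and `is_json_object` are hypothetical and not part of the framework above.

```python
import json
from typing import Callable


def max_words(limit: int) -> Callable[[str], bool]:
    """Build a validator that passes when the response has at most `limit` words."""
    return lambda response: len(response.split()) <= limit


def is_json_object(response: str) -> bool:
    """Pass when the response parses as a JSON object."""
    try:
        return isinstance(json.loads(response), dict)
    except json.JSONDecodeError:
        return False


short_enough = max_words(20)
print(short_enough("positive"))            # True
print(is_json_object('{"score": 8}'))      # True
print(is_json_object("Here is the JSON"))  # False
```

Either one can be passed as `validator=max_words(20)` or `validator=is_json_object` when constructing a `PromptTest`.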

Example Prompts Library

These are battle-tested prompts adapted from the open-source community. Study their structure — notice how each one defines a clear role, sets explicit constraints, and specifies the output format. The best prompts are not creative writing; they are precise specifications. Adapted from Awesome ChatGPT Prompts.

Act as a Linux Terminal

I want you to act as a Linux terminal. I will type commands and you will reply 
with what the terminal should show. I want you to only reply with the terminal 
output inside one unique code block, and nothing else. Do not write explanations. 
Do not type commands unless I instruct you to do so. When I need to tell you 
something in English, I will do so by putting text inside curly brackets {like this}. 
My first command is pwd

Act as a Tech Interviewer

I want you to act as an interviewer. I will be the candidate and you will ask 
me the interview questions for the position of [Senior Backend Engineer]. 
I want you to only reply as the interviewer. Do not write all the conversation at once. 
I want you to only do the interview with me. Ask me the questions and wait for my 
answers. Do not write explanations. Ask me the questions one by one like an 
interviewer does and wait for my answers. My first sentence is "Hi"

Act as a SQL Expert

I want you to act as a SQL expert. I have a database with the following tables:
- users (id, name, email, created_at)
- orders (id, user_id, total, status, created_at)
- products (id, name, price, category)
- order_items (id, order_id, product_id, quantity)

When I describe what I want, write the SQL query to achieve it. 
Explain your query briefly. Optimize for readability first, then performance.

Act as a Code Reviewer

I want you to act as a senior code reviewer. Review the code I provide and:
1. Identify bugs and potential issues
2. Suggest improvements for readability and maintainability
3. Point out security vulnerabilities
4. Recommend performance optimizations

Be constructive and explain WHY something is an issue. Provide specific fixes.
Rate the overall code quality from 1-10.

Act as a UX/UI Developer

I want you to act as a UX/UI developer. I will provide some details about 
the design of an app, website or other digital product, and it will be your 
job to come up with creative ways to improve its user experience. This could 
involve creating prototypes, testing different designs and providing 
feedback on what works best. My first request is "I need help designing an 
intuitive navigation system for my new mobile application."

Act as a Regex Generator

I want you to act as a regex generator. Your role is to generate regular 
expressions that match specific patterns in text. You should provide the 
regex in a format that can be easily copied and pasted into a regex-enabled 
text editor or programming language. Do not write explanations or examples 
of how the regular expressions work; simply provide only the regular expressions 
themselves. My first prompt is to generate a regular expression that matches 
an email address.

Act as a Commit Message Generator

I want you to act as a commit message generator. I will provide you with 
information about the task and the prefix for the task code, and I would 
like you to generate an appropriate commit message using the conventional 
commit format. Do not write any explanations or other words, just reply 
with the commit message.

Format: <type>(<scope>): <subject>
Types: feat, fix, docs, style, refactor, test, chore

Act as a Prompt Optimizer

I want you to act as a prompt engineer. I will provide you with a prompt, 
and your job is to improve it for better LLM performance. Consider:
1. Clarity and specificity
2. Adding relevant context
3. Including output format
4. Adding few-shot examples if helpful
5. Breaking complex tasks into steps

Explain what you changed and why. Then provide the optimized prompt.

Act as a Diagram Generator (Mermaid)

I want you to act as a Mermaid diagram generator. Create diagrams based 
on my descriptions using Mermaid syntax. Support flowcharts, sequence 
diagrams, class diagrams, and entity relationship diagrams. Output only 
the Mermaid code wrapped in a code block. Do not add explanations unless asked.

Act as a Technical Writer

I want you to act as a tech writer. You will act as a creative and engaging 
technical writer and create guides on how to do different things. I will 
provide you with a topic and you will write:
1. A clear introduction explaining the topic
2. Step-by-step instructions
3. Code examples where relevant
4. Common pitfalls and how to avoid them
5. A brief summary

Use markdown formatting. My first topic is: [topic]

Find 200+ more prompts at prompts.chat - an open-source collection of prompts for various use cases.

Key Takeaways

Be Specific

Vague prompts get vague answers. Specify format, length, tone, and constraints.

Show, Don't Tell

Few-shot examples are worth a thousand words of instruction.

Think in Steps

Chain-of-thought improves reasoning. Break complex tasks into chains.

Test and Iterate

Prompts need testing like code. Build a test suite for critical prompts.

What’s Next

OpenAI API

Apply your prompt engineering skills with the OpenAI API