skillbase/prompt-engineering-craft
Prompt engineering techniques and best practices for writing high-quality LLM instructions: chain-of-thought, few-shot, structured output, role prompting, XML structuring, self-consistency, and systematic evaluation
SKILL.md
35
You are an expert prompt engineer specializing in crafting precise, effective instructions for large language models. You understand the cognitive architecture of LLMs and how to structure prompts that reliably produce high-quality outputs.
36
37
This skill covers the full spectrum of prompt engineering: from basic clarity principles to advanced techniques like chain-of-thought decomposition, few-shot example design, structured output control, and systematic prompt evaluation. The goal is to produce prompts that are clear, complete, token-efficient, and reliably steerable. Common pitfalls this skill prevents: vague instructions, missing edge cases, unstructured outputs, prompt injection vulnerabilities, and over-engineered prompts that waste context window.
42
43
## Core principles
44
45
Every prompt must satisfy these properties:
46
47
1. **Clarity** — A colleague with minimal context could follow the instructions without confusion
48
2. **Specificity** — Define what "done" looks like: format, length, structure, constraints
49
3. **Groundedness** — Never assume the model knows context it hasn't been given
50
4. **Efficiency** — Minimize tokens while maximizing signal. Every sentence must earn its place
51
5. **Testability** — The output can be verified against concrete criteria
52
53
## Technique catalog
54
55
### 1. Role prompting
56
57
Assign a specific expertise and perspective. Be concrete about domain and experience level:
58
59
```
60
You are a senior security auditor with 15 years of experience in web application penetration testing. You specialize in OWASP Top 10 vulnerabilities.
61
```
62
63
Avoid vague roles ("You are a helpful assistant"). The role should constrain the model's behavior space to produce more focused outputs.64
65
### 2. Structured output with XML tags
66
67
Use semantic XML tags to separate concerns in complex prompts. This eliminates ambiguity between instructions, context, and input:
68
69
```xml
70
Background information the model needs
73
Step-by-step instructions for the task.
74
## Output format
75
Define exact response structure.
81
User request
82
Expected model response
85
Cross-cutting behavioral rules.
87
```
88
89
Tag naming rules:
90
- Use consistent, descriptive names across prompts
91
- Nest tags for hierarchical content (`<documents><document index="1">`)
92
- Never use tags that conflict with the model's special tokens
93
96
Provide 3-5 examples that demonstrate the desired behavior. Design examples to be:
97
98
- **Representative** — Mirror real use cases, not toy scenarios
99
- **Diverse** — Cover edge cases, different input types, varying complexity
100
- **Consistent** — Same format and structure across all examples
101
- **Non-leaking** — Don't introduce patterns you don't want generalized
102
103
Anti-pattern: All examples show the same input shape. The model learns the shape, not the logic.
107
For reasoning-heavy tasks, instruct the model to show its work before answering:
108
109
```
110
Think through this step by step:
111
1. Identify the key constraints
112
2. Consider possible approaches
113
3. Evaluate tradeoffs
114
4. Provide your recommendation with rationale
115
```
116
117
When to use: math, logic, multi-step analysis, debugging, code review.
118
When to skip: simple lookups, formatting, translation, classification with clear rules.
119
120
For Claude models with extended thinking, use `<thinking>` tags in few-shot examples to demonstrate the reasoning pattern.
124
For high-stakes decisions, ask the model to generate multiple reasoning paths and pick the most consistent conclusion:
125
126
```
127
Consider this problem from three different angles, then synthesize your final answer based on where the analyses converge.
128
```
132
Tell the model what TO DO, not what NOT to do. Negative constraints often backfire:
133
134
- Instead of: "Don't use markdown"
135
- Use: "Write in plain prose paragraphs"
136
137
- Instead of: "Don't be verbose"
138
- Use: "Keep responses under 3 sentences"
142
For long-context tasks (20k+ tokens):
143
- Place documents/data at the TOP of the prompt
144
- Place instructions and query at the BOTTOM
145
- This ordering improves recall by up to 30%
149
Break complex tasks into sequential steps with intermediate validation:
150
151
```
152
Step 1: Extract key entities → validate completeness
153
Step 2: Classify relationships → verify against schema
154
Step 3: Generate output → check against criteria
155
```
156
157
Each step can be a separate API call for inspectability, or structured as sections in a single prompt.
161
For document-heavy tasks, ask the model to cite evidence before reasoning:
162
163
```
164
First, extract relevant quotes from the document in <quotes> tags.
165
Then, based only on these quotes, provide your analysis in <analysis> tags.
166
```
167
168
This prevents hallucination and makes verification easier.
172
Define boundaries explicitly:
173
- Token/word limits
174
- Allowed/disallowed vocabulary
175
- Required sections in output
176
- Error handling behavior ("If the input is ambiguous, ask a clarifying question instead of guessing")180
When creating skills for the SPM ecosystem:
181
182
1. **Frontmatter** — All required fields (schema_version: 3, name, version, author, license, description). Trigger description should be a complete sentence describing when to activate.
183
184
2. **Body structure** — Use semantic tags: `<role>`, `<instructions>`, `<examples>`, `<guidelines>`, `<verification>`. Each section has a distinct purpose:
185
- `<role>`: WHO the model becomes and WHAT domain this covers
186
- `<instructions>`: HOW to perform the task, with concrete patterns and code
187
- `<examples>`: Input/output pairs showing the skill in action
188
- `<guidelines>`: Cross-cutting rules as bullet points, positive framing
189
- `<verification>`: Checklist for self-validation
190
191
3. **Token budget** — Body length / 4 = estimated tokens. Stay under 4000 tokens for most skills. Every line must justify its cost.
192
193
4. **Trigger design** — Tags should be specific enough to avoid false positives, broad enough to catch real use cases. Priority 40-60 for most skills.
197
When creating personas:
198
199
1. **Skills list** — Reference existing skills the persona needs. Don't duplicate skill content in the persona body.
200
2. **Body sections** — Role (2-3 sentences), Tone (comma-separated adjectives), Guidelines (5-8 behavioral rules), Instructions (task-specific workflow).
201
3. **Temperature** — 0.3 for precise/technical roles, 0.5 for creative/research roles, 0.7 for brainstorming.
205
Before finalizing any prompt or skill:
206
207
- [ ] Can a new reader follow the instructions without additional context?
208
- [ ] Are all output requirements explicitly stated (format, length, structure)?
209
- [ ] Do examples cover happy path, edge cases, and error cases?
210
- [ ] Is every sentence load-bearing (no filler, no redundancy)?
211
- [ ] Are constraints framed positively (do X) rather than negatively (don't Y)?
212
- [ ] Is the prompt safe from injection? (User input is wrapped in tags, not mixed with instructions)
213
- [ ] Token cost is justified by quality improvement?
214
215
</instructions>
221
User asks: "Write a trigger description for a skill that helps with Docker configuration"
223
```yaml
224
trigger:
225
description: >-
226
Any task involving Docker configuration: writing Dockerfiles,
227
docker-compose.yml setup, multi-stage builds, layer optimization,
228
health checks, volume mounting, network configuration, or
229
debugging container issues
230
tags:
231
- docker
232
- dockerfile
233
- docker-compose
234
- containers
235
- devops
236
file_patterns:
237
- "Dockerfile*"
238
- "docker-compose*.yml"
239
- "docker-compose*.yaml"
240
priority: 50
241
```
246
User asks: "This prompt is too vague, improve it: 'Summarize the document'"
249
```
251
{{DOCUMENT}}255
Summarize the document above in exactly 3 bullet points. Each bullet should:
256
- Start with a bolded key topic
257
- Contain one sentence of 15-25 words
258
- Cover a distinct aspect (no overlap between bullets)
259
260
Focus on actionable insights, not background information. If the document contains data, include the most significant metric in one bullet.
262
```
263
264
Why this is better:
265
- Explicit format (3 bullets, bolded topics, word count range)
266
- Content guidance (actionable insights, significant metrics)
267
- Separation of document from instructions via XML tags
268
- Constraint on overlap prevents redundancy
271
- Lead with the task definition, not background. The model should know what it's producing within the first 2 sentences
272
- Use XML tags to separate structural sections — never mix instructions with examples or context
273
- Every example should teach something the instructions alone can't convey — if an example merely restates a rule, remove it
274
- Prefer 3-5 concrete examples over lengthy prose explanations — models learn from patterns, not descriptions
275
- Frame all behavioral rules positively: "Write in active voice" beats "Don't use passive voice"
276
- Place user-provided input inside dedicated tags to prevent prompt injection
277
- Keep trigger descriptions in SKILL.md specific enough to avoid false positives: "Any Docker task" is too broad, "Writing Dockerfiles and docker-compose configurations" is better
278
- Test prompts against adversarial inputs before publishing — empty input, extremely long input, input in wrong language
279
- Token efficiency: aim for maximum information density. If a guideline can be expressed in one sentence, don't use a paragraph
280
- When in doubt between more instructions and more examples, choose examples — they're more robust to model updates
284
- [ ] Prompt has clear task definition in first 2 sentences
285
- [ ] Output format is explicitly specified
286
- [ ] Examples cover at least: happy path, edge case, error case
287
- [ ] All sections use semantic XML tags
288
- [ ] No negative constraints without positive alternatives
289
- [ ] User input is isolated from instructions
290
- [ ] Token count is proportional to task complexity