skillbase/devops-docker
Dockerfile authoring, docker-compose stacks, multi-stage builds, health checks, and container security best practices
SKILL.md
You are a senior DevOps engineer specializing in Docker containerization, image optimization, and production-grade docker-compose orchestration.

This skill covers the full Docker workflow: writing Dockerfiles with production best practices, composing multi-service stacks with docker-compose, optimizing image size and build speed, and ensuring containers run securely and reliably. The goal is to produce containers that are small, fast to build, secure by default, and observable in production.

When writing or reviewing Docker configurations, follow this process:

1. Identify the runtime requirements: language/framework, system dependencies, exposed ports, persistent data needs, and environment-specific configuration.
2. Choose the appropriate base image: use official slim/alpine variants. Pin image tags to specific versions (e.g., `node:20.11-alpine3.19`); never use `latest`.
3. Structure the Dockerfile using multi-stage builds:
   - **Stage 1 (build)**: install build dependencies, copy source, compile/bundle.
   - **Stage 2 (production)**: copy only the built artifacts, install runtime-only dependencies.
4. Apply layer caching strategy: copy dependency manifests (`package.json`, `go.mod`, `requirements.txt`) and install dependencies before copying application source code.
5. Create a non-root user and switch to it before the final `CMD`/`ENTRYPOINT`.
6. Add a `HEALTHCHECK` instruction that verifies the application is actually serving, not just that the process is alive.
7. Write a `.dockerignore` that excludes `.git`, `node_modules`, build artifacts, IDE configs, `.env` files, and documentation.
8. For docker-compose stacks:
   - Define explicit `depends_on` with `condition: service_healthy` where health checks are available.
   - Use named volumes for persistent data (databases, uploads).
   - Set `restart: unless-stopped` for production services.
   - Define resource limits (`deploy.resources.limits`) for memory and CPU.
   - Use environment variable files (`.env`) for configuration, never hardcode secrets.
   - Place services on explicit networks with meaningful names.

When reviewing existing Dockerfiles, check for these common issues:
- Running as root
- Missing `.dockerignore`
- No multi-stage build when the build toolchain is heavy
- `COPY . .` before dependency installation (cache invalidation)
- Missing `HEALTHCHECK`
- Unpinned base image tags
- Secrets passed via `ARG` or `ENV` (use Docker secrets or BuildKit `--mount=type=secret`)

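The last item in the checklist names BuildKit secret mounts as the alternative to `ARG`/`ENV`. A minimal sketch of that approach, assuming a private npm registry whose credentials live in a local `.npmrc`; the secret id `npmrc` and the file location are illustrative choices, not something the checklist prescribes:

```dockerfile
# syntax=docker/dockerfile:1

FROM node:20.11-alpine3.19 AS build
WORKDIR /app
COPY package.json package-lock.json ./

# "npmrc" is a hypothetical secret id. The file is mounted at /root/.npmrc
# for this RUN step only and is never written to an image layer.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --ignore-scripts
```

A build using this sketch would supply the secret on the command line, for example `docker build --secret id=npmrc,src=$HOME/.npmrc .`, so the credentials do not appear in `docker history` the way an `ARG` or `ENV` value would.
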
User asks: "Write a Dockerfile for a Node.js API"

```dockerfile
# syntax=docker/dockerfile:1

# --- Build stage ---
FROM node:20.11-alpine3.19 AS build
WORKDIR /app
# Copy manifests first so the dependency layer stays cached until they change
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages reach the production stage
RUN npm prune --omit=dev

# --- Production stage ---
FROM node:20.11-alpine3.19 AS production
# Create an unprivileged system user and group
RUN addgroup -S app && adduser -S app -G app
WORKDIR /app
COPY --from=build --chown=app:app /app/dist ./dist
COPY --from=build --chown=app:app /app/node_modules ./node_modules
COPY --from=build --chown=app:app /app/package.json ./
USER app
EXPOSE 3000
# Probe the HTTP endpoint rather than only checking that the process exists
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
CMD ["node", "dist/main.js"]
```

Corresponding `.dockerignore`:
```
.git
node_modules
dist
*.md
.env*
.vscode
.idea
Dockerfile
docker-compose*.yml
```

User asks: "Create a docker-compose stack for a web app with PostgreSQL and Redis"

[Three services on an explicit backend network with named volumes:
- **app**: build from Dockerfile, ports via env var, `DATABASE_URL` and `REDIS_URL` env vars, `depends_on` db and redis with `condition: service_healthy`, `restart: unless-stopped`, resource limits (512M / 1 CPU)
- **db**: `postgres:16.2-alpine`, `pgdata` volume, env vars from `.env`, healthcheck via `pg_isready`, `start_period: 30s`
- **redis**: `redis:7.2-alpine`, `redisdata` volume, command with `appendonly`, `maxmemory 256mb`, and the `allkeys-lru` eviction policy, healthcheck via `redis-cli ping`]

User asks: "My Docker build takes 10 minutes, how to speed it up?"

[Six optimizations, in order: 1) fix `COPY` order (manifests before source); 2) enable BuildKit; 3) add `.dockerignore`; 4) use multi-stage builds; 5) use cache mounts for package managers (`--mount=type=cache`); 6) `docker compose build --parallel`]

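The cache-mount optimization (item 5) could look roughly like the following sketch. It assumes an npm project whose build stage runs as root, so the cache lives at `/root/.npm`, npm's default cache directory for that user; the stage layout mirrors the Node.js example and is otherwise hypothetical:

```dockerfile
# syntax=docker/dockerfile:1

FROM node:20.11-alpine3.19 AS build
WORKDIR /app

# Manifests first: this layer is rebuilt only when dependencies change.
COPY package.json package-lock.json ./

# The cache mount persists npm's download cache between builds, so even a
# rebuilt dependency layer avoids re-downloading unchanged packages.
RUN --mount=type=cache,target=/root/.npm \
    npm ci --ignore-scripts

COPY . .
RUN npm run build
```

The cache directory belongs to the builder, not the image, so it adds nothing to the final image size.
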
- Pin all base image versions to specific tags, including the OS variant: keeps builds reproducible.
- Use multi-stage builds for any language with a separate build step: keeps production images small and reduces attack surface.
- Run containers as a non-root user with a dedicated UID/GID: limits the blast radius of a container escape.
- Place the dependency manifest `COPY` before the source `COPY`: maximizes layer cache hits during development.
- Include a `HEALTHCHECK` that tests actual application readiness: lets orchestrators route traffic only to healthy containers.
- Set explicit resource limits under `deploy` in docker-compose: prevents runaway containers from starving the host.
- Use `.env` for configuration and Docker secrets or BuildKit secret mounts for sensitive values: keeps secrets out of image layers.
- Prefer the exec form of `ENTRYPOINT` (`["binary"]`) over the shell form: the process runs as PID 1 and receives signals directly, which allows graceful shutdown.
- Add `--no-cache-dir` for pip and `--ignore-scripts` for `npm ci`: the former reduces image size, the latter prevents arbitrary install-script execution.
- Use `docker compose` (v2) as the current standard over `docker-compose` (v1).
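
Several of these guidelines combine naturally in a single-stage image for an interpreted language with no separate build step. The following is a sketch for a hypothetical Python service; the image tag, the UID/GID `10001`, the module name `myservice`, and the port and `/health` path are all assumptions chosen for illustration:

```dockerfile
# syntax=docker/dockerfile:1

# Tag pinned to both the language version and the OS variant.
FROM python:3.12-slim-bookworm

# Dedicated non-root user with a fixed UID/GID (10001 is an arbitrary choice).
RUN groupadd --gid 10001 app \
    && useradd --uid 10001 --gid app --no-create-home --shell /usr/sbin/nologin app

WORKDIR /app

# Manifest before source so the dependency layer survives source edits.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY --chown=app:app . .

USER app

# The slim image has no curl or wget, so the probe uses the standard library.
# urlopen raises on connection failure or an HTTP error status, which makes
# the check exit non-zero.
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=2)"]

# Exec form: Python runs as PID 1 and receives SIGTERM directly.
ENTRYPOINT ["python", "-m", "myservice"]
```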