Chapter 21: Shipping and Deployment
A great product that cannot ship reliably is just a great idea trapped on a developer's laptop. Your deployment pipeline is as important as your code.
Why This Matters
- Owner: Deployment speed determines how quickly features reach paying customers. A team that deploys daily releases roughly 20 to 30 times as often as a team that deploys monthly. Every hour of downtime costs real money.
- Dev: A good CI/CD pipeline means you push code and go home confident. A bad one means late-night hotfix anxiety and weekend war rooms.
- PM: Understanding deployment mechanics helps you plan releases, communicate timelines, and manage stakeholder expectations about when features go live.
- Designer: Feature flags let you test designs with real users incrementally. Understanding rollout strategies enables better experiment design.
The Concept (Simple)
Think of shipping software like a restaurant kitchen:
- CI/CD Pipeline = The assembly line from ingredient prep to plate delivery. Automated, consistent, fast.
- Feature Flags = Serving a new dish to only one table first to see if they like it before adding it to the full menu.
- Blue-Green Deployment = Having two identical kitchens; you cook in one while customers eat from the other, then swap.
- Canary Deployment = Giving the new dish to 5% of diners. If no one gets sick, roll it out to everyone.
The goal: ship fast, ship safely, and roll back instantly when something breaks.
How It Works (Detailed)
CI/CD Pipeline Design
┌─────────────────────────────────────────────────────────────────────┐
│ CI/CD PIPELINE │
│ │
│ Developer pushes code │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ CONTINUOUS INTEGRATION │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐ │ │
│ │ │ Lint & │──▶│ Unit │──▶│ Integr. │──▶│ Build │ │ │
│ │ │ Format │ │ Tests │ │ Tests │ │ Artifact│ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └─────────┘ │ │
│ │ │ │ │ │ │ │
│ │ ▼ ▼ ▼ ▼ │ │
│ │ [FAIL: Block] [FAIL: Block] [FAIL: Block] [FAIL: Block]│ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ ALL PASS │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ CONTINUOUS DELIVERY │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐ │ │
│ │ │ Deploy │──▶│ Smoke │──▶│ Deploy │──▶│ Health │ │ │
│ │ │ Staging │ │ Tests │ │ Prod │ │ Check │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └─────────┘ │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ [Auto-rollback on failure] │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Monitor & │ │
│ │ Alert │ │
│ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
CI pipeline stages in detail:
| Stage | Purpose | Tools | Duration Target |
|---|---|---|---|
| Lint & Format | Code style consistency | ESLint, Prettier, Ruff, Black | < 1 min |
| Unit Tests | Individual function correctness | Jest, pytest, Go test | < 3 min |
| Integration | Component interaction correctness | Supertest, Testcontainers | < 5 min |
| Build | Compile, bundle, create artifact | Docker, Webpack, Turbopack | < 3 min |
| Security Scan | Dependency vulnerabilities | Snyk, Trivy, npm audit | < 2 min |
| Total CI | Merge confidence | | < 15 min |
The 15-minute rule: If your CI pipeline takes longer than 15 minutes, developers will context-switch and lose flow. Optimize ruthlessly.
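The fail-fast behavior and the time budget can be sketched in a few lines. This is a minimal illustration, not a real CI configuration: the `Stage` type, `run_pipeline`, and the stand-in `lambda` checks are hypothetical names, and the budgets are simply the duration targets from the table above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    budget_min: int              # duration target taken from the table above
    run: Callable[[], bool]      # hypothetical check; returns True on success


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure ("FAIL: Block")."""
    log: List[str] = []
    for stage in stages:
        ok = stage.run()
        log.append(f"{stage.name}: {'pass' if ok else 'FAIL'}")
        if not ok:
            return False, log    # later stages never run
    return True, log


def total_budget_min(stages: List[Stage]) -> int:
    """Sum the per-stage targets to compare against the 15-minute rule."""
    return sum(stage.budget_min for stage in stages)


# Stand-in checks; a real pipeline would invoke the linter, test runner, etc.
STAGES = [
    Stage("lint", 1, lambda: True),
    Stage("unit tests", 3, lambda: True),
    Stage("integration tests", 5, lambda: True),
    Stage("build", 3, lambda: True),
    Stage("security scan", 2, lambda: True),
]
```

With the targets from the table, `total_budget_min(STAGES)` comes to 14 minutes, which leaves one minute of slack under the 15-minute rule.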
Deployment Strategies
STRATEGY 1: Rolling Deployment
┌──────────────────────────────────────────────┐
│ Time 0: [v1] [v1] [v1] [v1] │
│ Time 1: [v2] [v1] [v1] [v1] ◄ replace │
│ Time 2: [v2] [v2] [v1] [v1] one by │
│ Time 3: [v2] [v2] [v2] [v1] one │
│ Time 4: [v2] [v2] [v2] [v2] ◄ done │
└──────────────────────────────────────────────┘
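The rolling replacement in Strategy 1 can be simulated as a loop over the fleet. This is a sketch under assumptions: `rolling_deploy` and the `is_healthy` probe are hypothetical, and restoring the original fleet on a failed probe is one possible policy that the diagram itself does not specify.

```python
from typing import Callable, List


def rolling_deploy(
    fleet: List[str],
    new_version: str,
    is_healthy: Callable[[int], bool],
) -> List[List[str]]:
    """Return the fleet state after each step, starting with the initial state."""
    current = list(fleet)
    timeline = [list(current)]
    for index in range(len(current)):
        current[index] = new_version        # replace one instance at a time
        timeline.append(list(current))
        if not is_healthy(index):           # hypothetical health probe
            timeline.append(list(fleet))    # assumption: restore the original fleet
            break
    return timeline


# Four instances moving from v1 to v2, as in the diagram (Time 0 to Time 4).
timeline = rolling_deploy(["v1"] * 4, "v2", lambda index: True)
```

Because old and new versions serve traffic side by side during the rollout, this strategy suits stateless services whose versions can coexist.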
STRATEGY 2: Blue-Green Deployment
┌──────────────────────────────────────────────┐
│ Load Balancer │
│ │ │
│ ┌─────────┴─────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ BLUE (v1) │ │ GREEN (v2) │ │
│ │ [Live traffic]│ │ [Idle/Testing]│ │
│ └─────────────────┘ └─────────────────┘ │
│ │
│ After validation, swap: │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ BLUE (v1) │ │ GREEN (v2) │ │
│ │ [Idle] │ │ [Live traffic]│ │
│ └─────────────────┘ └─────────────────┘ │
│ Rollback: just swap back. Instant. │
└──────────────────────────────────────────────┘
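The blue-green swap amounts to flipping a single pointer, which is why rollback is instant. The sketch below models only that pointer; `BlueGreenRouter` and its methods are hypothetical names, and a real setup would change a load balancer target or DNS record instead of a Python attribute.

```python
from dataclasses import dataclass


@dataclass
class BlueGreenRouter:
    blue: str            # version deployed to the blue environment
    green: str           # version deployed to the green environment
    live: str = "blue"   # which environment currently receives traffic

    def idle(self) -> str:
        """Name of the environment that is not serving traffic."""
        return "green" if self.live == "blue" else "blue"

    def live_version(self) -> str:
        return getattr(self, self.live)

    def deploy_to_idle(self, version: str) -> None:
        """Install a new version without touching live traffic."""
        setattr(self, self.idle(), version)

    def swap(self) -> None:
        """Send traffic to the idle environment; calling it again rolls back."""
        self.live = self.idle()


router = BlueGreenRouter(blue="v1", green="v1")
router.deploy_to_idle("v2")   # green now holds v2; users still see v1
router.swap()                 # users now see v2
```

The old version stays deployed in the idle environment after the swap, which is where the 2x infrastructure cost comes from.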
STRATEGY 3: Canary Deployment
┌──────────────────────────────────────────────┐
│ Load Balancer │
│ ┌───────┴────────┐ │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌────────────┐ │
│ │ v1 (95%) │ │ v2 (5%) │ │
│ │ Main fleet │ │ Canary │ │
│ └──────────────────┘ └────────────┘ │
│ │
│ Monitor canary metrics: │
│ - Error rate ✓ Same or lower │
│ - Latency ✓ Same or lower │
│ - Business KPIs ✓ No regression │
│ │
│ If healthy: 5% → 25% → 50% → 100% │
│ If unhealthy: auto-rollback to 0% │
└──────────────────────────────────────────────┘
Deployment Strategy Comparison
| Strategy | Zero Downtime | Rollback Speed | Cost | Complexity | Best For |
|---|---|---|---|---|---|
| Recreate | No | Slow (redeploy) | Lowest | Lowest | Dev/staging environments |
| Rolling | Yes | Medium | Same | Low | Stateless services |
| Blue-Green | Yes | Instant | 2x infra | Medium | Critical production apps |
| Canary | Yes | Fast | +5-10% | High | High-traffic, risk-averse |
| Feature Flag | Yes | Instant (toggle) | Same | Medium | Gradual feature rollouts |
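The canary promotion rule from Strategy 3 (5% → 25% → 50% → 100%, or back to 0%) can be written as a small decision function. This is a sketch under assumptions: the `Metrics` fields are hypothetical stand-ins for whatever your monitoring exposes, with conversion rate standing in for "Business KPIs", and the comparison is a strict "same or better" with no allowance for noise.

```python
from dataclasses import dataclass

STEPS = [5, 25, 50, 100]   # traffic percentages from the canary diagram


@dataclass
class Metrics:
    error_rate: float        # fraction of failed requests
    p95_latency_ms: float
    conversion_rate: float   # stand-in for "Business KPIs"


def canary_is_healthy(baseline: Metrics, canary: Metrics) -> bool:
    """Error rate and latency same or lower, business KPI not regressed."""
    return (
        canary.error_rate <= baseline.error_rate
        and canary.p95_latency_ms <= baseline.p95_latency_ms
        and canary.conversion_rate >= baseline.conversion_rate
    )


def next_traffic_percent(current: int, baseline: Metrics, canary: Metrics) -> int:
    """Return the next canary share, or 0 to trigger an automatic rollback."""
    if not canary_is_healthy(baseline, canary):
        return 0
    for step in STEPS:
        if step > current:
            return step
    return 100   # already fully rolled out
```

In practice the comparison usually needs a tolerance band and a minimum sample size, since a 5% canary on low traffic produces noisy metrics.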
Feature Flags Strategy
Feature flags decouple deployment from release. You can deploy code to production without users seeing it.
FEATURE FLAG LIFECYCLE:
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ CREATE │──▶│ TEST │──▶│ ROLLOUT │──▶│ REMOVE │
│ │ │ │ │ │ │ │
│ Flag: OFF │ │ Flag: ON │ │ Flag: ON │ │ Flag code │
│ for all │ │ internal │ │ 5%→25%→ │ │ removed │
│ │ │ only │ │ 50%→100% │ │ │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
▲
IMPORTANT!
Remove flags after
full rollout. Flag
debt is real debt.
Feature Flag Use Cases
| Use Case | Flag Type | Example |
|---|---|---|
| Gradual rollout | Percentage | Show new dashboard to 10% of users |
| Beta program | User list | Enable for users who opted into beta |
| Plan gating | Attribute | Enable feature only for Enterprise plan |
| Kill switch | Boolean | Disable payment processing if provider is down |
| A/B testing | Experiment | Test two checkout flows, measure conversion |
| Operational toggle | Boolean | Disable expensive background job during peak load |
| Trunk-based dev | Boolean | Merge incomplete feature to main behind flag |
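Several of the flag types in the table can share one evaluation function. The following is a minimal sketch, not the API of any particular flag service: `Flag`, `bucket`, and `is_enabled` are hypothetical names. It hashes the flag name and user ID so that each user lands in a stable bucket, which keeps a user's experience consistent as the percentage grows.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional, Set


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


@dataclass
class Flag:
    name: str
    enabled: bool = False                                   # boolean / kill switch
    percentage: int = 0                                     # gradual rollout
    allowed_users: Set[str] = field(default_factory=set)    # beta program
    required_plan: Optional[str] = None                     # plan gating


def is_enabled(flag: Flag, user_id: str, plan: str) -> bool:
    if not flag.enabled:
        return False                       # kill switch wins over everything
    if flag.required_plan is not None and plan != flag.required_plan:
        return False
    if user_id in flag.allowed_users:
        return True
    return bucket(flag.name, user_id) < flag.percentage


dashboard = Flag("dashboard.chart-redesign.percentage", enabled=True, percentage=10)
```

Because the bucket depends only on the flag name and user ID, raising `percentage` from 10 to 50 keeps every user who already had the feature and only adds new ones.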
Flag naming convention:
Format: <team>.<feature>.<variant>
Examples:
billing.new-checkout.enabled
dashboard.chart-redesign.percentage
api.rate-limit-v2.enabled
ops.heavy-report.kill-switch
Warning: Feature flags are technical debt the moment they are fully rolled out. Track all active flags. Set expiration dates. Schedule cleanup sprints quarterly.
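Both the naming convention and the expiration dates can be checked automatically, for example in CI. The sketch below is illustrative only: the lowercase-letters, digits, and hyphens pattern is an assumption inferred from the examples above, and `expired_flags` assumes you keep a hypothetical registry mapping each flag name to its expiration date.

```python
import re
from datetime import date
from typing import Dict, List

# Assumed pattern for <team>.<feature>.<variant>, inferred from the examples.
FLAG_NAME = re.compile(r"[a-z0-9-]+\.[a-z0-9-]+\.[a-z0-9-]+")


def is_valid_flag_name(name: str) -> bool:
    """True when the name has exactly three dot-separated lowercase segments."""
    return FLAG_NAME.fullmatch(name) is not None


def expired_flags(expirations: Dict[str, date], today: date) -> List[str]:
    """Return flags whose expiration date has passed, sorted by name."""
    return sorted(name for name, expires in expirations.items() if expires <= today)


# Hypothetical registry; the dates are made up for illustration.
registry = {
    "billing.new-checkout.enabled": date(2024, 3, 1),
    "api.rate-limit-v2.enabled": date(2024, 9, 1),
}
overdue = expired_flags(registry, today=date(2024, 6, 1))
```

Feeding the `overdue` list into the quarterly cleanup sprint gives the team a concrete backlog instead of a vague reminder.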
Release Management Process
┌─────────────────────────────────────────────────────────────┐
│ RELEASE LIFECYCLE │
│ │
│ Development ──▶ Code Review ──▶ CI Pass ──▶ Staging │
│ │ │ │ │ │
│ │ ┌────┘ │ ┌────┘ │
│ ▼ ▼ ▼ ▼ │
│ Feature PR approved All green QA + Smoke │
│ branch by 1+ reviewer automated tests pass │
│ created checks │
│ │
│ Staging ──▶ Release Decision ──▶ Production ──▶ Monitor │
│ │ │ │ │ │
│ ┌────┘ ┌────┘ ┌────┘ ┌────┘ │
│ ▼ ▼ ▼ ▼ │
│ Product Go/No-Go Deploy with Watch error │
│ sign-off meeting canary or rates, latency, │
│ (if needed) (major releases) blue-green business KPIs │
│ for 30-60 min │
└─────────────────────────────────────────────────────────────┘
Release Checklist
Use this before every production deployment:
Pre-Deploy:
- [ ] All CI checks pass on the release branch/commit
- [ ] Database migrations tested on staging with production-like data
- [ ] Feature flags configured for any new features
- [ ] Rollback plan documented (what to do if things break)
- [ ] On-call engineer identified and available
- [ ] No other deployments in progress
- [ ] Changelog updated (for customer-facing changes)
During Deploy:
- [ ] Deploy to canary/blue-green target
- [ ] Run automated smoke tests against production
- [ ] Verify health check endpoints return healthy
- [ ] Check error tracking dashboard (Sentry, Bugsnag) for new errors
- [ ] Confirm key user flows work (login, core action, payment)
Post-Deploy (next 30-60 minutes):
- [ ] Monitor error rates -- should be equal to or lower than pre-deploy
- [ ] Monitor latency p50/p95/p99 -- should be stable
- [ ] Monitor key business metrics (signups, conversions, API calls)
- [ ] Confirm background jobs are processing normally
- [ ] If anomalies detected: rollback immediately, investigate later
Post-Deploy (next 24 hours):
- [ ] Review any new error reports from users
- [ ] Check billing and subscription flows if payment code changed
- [ ] Clean up any temporary feature flags if fully rolled out
- [ ] Update status page if there was any user impact
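The post-deploy monitoring items (error rate and latency p50/p95/p99) can be turned into an automated check. This is a sketch under assumptions: `should_roll_back` is a hypothetical helper, the percentiles use the nearest-rank method, and the 10% latency tolerance is an arbitrary illustration of "stable" that you would tune for your own service.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], p: int) -> float:
    """Nearest-rank percentile of a non-empty list of latency samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples: List[float]) -> Dict[str, float]:
    return {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}


def should_roll_back(
    latency_before: List[float],
    latency_after: List[float],
    error_rate_before: float,
    error_rate_after: float,
    latency_tolerance: float = 1.10,   # assumption: allow 10% drift as "stable"
) -> bool:
    """Roll back if errors rose or any tracked percentile regressed."""
    if error_rate_after > error_rate_before:
        return True
    before = latency_summary(latency_before)
    after = latency_summary(latency_after)
    return any(after[key] > before[key] * latency_tolerance for key in before)
```

Wiring a check like this to the deploy tool is one way to follow the checklist rule of rolling back immediately and investigating later.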
Environment Strategy
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Local │────▶│ Dev │────▶│ Staging │────▶│ Prod │
│ │ │ │ │ │ │ │
│ Developer │ │ Shared │ │ Prod │ │ Real │
│ laptop │ │ testing │ │ mirror │ │ users │
│ │ │ │ │ │ │ │
│ Seed data │ │ Test │ │ Sanitized│ │ Real │
│ │ │ data │ │ prod data│ │ data │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
Key rules:
- Local should spin up with ONE command (docker compose up)
- Staging should mirror prod config exactly
- Never use real customer data in non-prod environments
- Each environment has its own secrets and API keys
In Practice
Real-World Example: Shipping a Pricing Page Redesign
Week 1: Dev builds new pricing page behind feature flag marketing.pricing-redesign.enabled. Code is merged to main and deployed to production. Flag is OFF for all users.
Week 2: QA enables flag on staging. Tests all plan combinations, mobile responsive, dark mode. Finds a bug with annual pricing toggle. Dev fixes and deploys.
Week 3: Flag enabled for internal team (dogfooding). PM and designer review in production context. Minor copy tweaks deployed same day.
Week 4: Canary rollout -- 10% of visitors see new page. Analytics compared: conversion rate, bounce rate, average revenue per user. New page shows +8% conversion. Rolled to 50%.
Week 5: Full rollout to 100%. Feature flag code removed in a cleanup PR. Old pricing page code deleted.
Total production incidents: 0. Total rollbacks: 0. Confidence: high.
Common Mistakes
- No staging environment -- deploying untested code directly to production is gambling with customer trust
- Manual deployments -- "SSH into the server and run the script" does not scale and introduces human error
- Feature flag debt -- forgetting to remove flags leads to thousands of stale conditionals in your codebase
- Deploying on Fridays -- unless your monitoring and on-call are excellent, avoid end-of-week deploys
- Skipping database migration testing -- a bad migration can take down your entire application and corrupt data
- No rollback plan -- every deploy should have a documented "if this breaks, do X" plan
Key Takeaways
- CI/CD is not optional; automate everything from lint to deployment by the end of month one
- Keep your CI pipeline under 15 minutes or developers will route around it
- Use feature flags to decouple deployment from release; ship code dark, then light it up gradually
- Blue-green and canary deployments avoid downtime and make rollback fast (instant for blue-green)
- Every deployment needs a checklist, a rollback plan, and 30-60 minutes of post-deploy monitoring
- Remove feature flags after full rollout; flag debt accumulates silently
Action Items
Owner
- [ ] Invest in CI/CD tooling early -- it pays for itself within the first month
- [ ] Establish a "no manual deploys" policy from day one
- [ ] Ensure there is always an on-call engineer for production deployments
- [ ] Review deployment frequency monthly -- it is a proxy for team health
Dev
- [ ] Set up a CI/CD pipeline with lint, test, build, and deploy stages in the first week of the project
- [ ] Implement feature flags for any user-facing change (see Chapter 22: Security and Compliance for secure flag handling)
- [ ] Create a one-command local development setup (docker compose up or equivalent)
- [ ] Write smoke tests that run against production after every deploy
- [ ] Set up automated rollback triggers based on error rate thresholds
PM
- [ ] Include feature flag rollout plans in every feature spec
- [ ] Schedule quarterly flag cleanup sprints
- [ ] Use feature flags for A/B testing before committing to a design direction
- [ ] Track deployment frequency and lead time as team health metrics
Designer
- [ ] Design features with gradual rollout in mind -- how does partial availability look?
- [ ] Use feature flags to test design variations with real users
- [ ] Plan for the "flag off" experience -- what do users see before the feature is enabled?
- [ ] Coordinate with PM on experiment design for A/B tests (see Chapter 19: SaaS Architecture 101 for how architecture supports experimentation)