DevOps & CI/CD
What You’ll Learn
By the end of this chapter, you’ll understand:
- What DevOps actually means - Why companies deploy software 100x faster than they did 10 years ago
- CI/CD pipelines - How code automatically goes from your laptop to production in minutes (not weeks!)
- GitHub Actions - How to write your first automated deployment pipeline
- Infrastructure as Code - How to create entire cloud environments with just a text file
- Deployment strategies - Blue-Green, Canary, Rolling deploys (and when to use each)
- Real-world costs - Why bad deployments cost $1 million per hour
Introduction: What is DevOps? (Start Here if You’re New)
The Problem Before DevOps
Traditional software deployment (The slow, painful way):
- Knight Capital Group (2012)
- Manual deployment error
- Lost $440 million in 45 minutes
- Company needed an emergency rescue and was acquired the following year
The Solution: DevOps + Automation
Modern deployment with DevOps:
| Metric | Before DevOps | After DevOps | Improvement |
|---|---|---|---|
| Deployment Frequency | Once/month | 100x/day | 3,000x faster |
| Deployment Time | 6 weeks | 12 minutes | ~5,000x faster |
| Deployment Failures | 30% | 0.1% | 300x more reliable |
| Mean Time to Recovery | 4 hours | 5 minutes | 48x faster |
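The Improvement column is simply the ratio of the two columns before it. Here is a quick sketch that recomputes each ratio from the table's own numbers. It assumes a 30-day month for the frequency row (the table does not say which month length it uses), and the function and variable names are illustrative only.

```python
# Recompute the improvement ratios from the before/after columns of the table.
MINUTES_PER_DAY = 24 * 60


def ratio(before: float, after: float) -> float:
    """Return how many times larger `before` is than `after`."""
    return before / after


# Deployment frequency: 100 per day over an assumed 30-day month vs 1 per month.
frequency_gain = ratio(100 * 30, 1)

# Deployment time: 6 weeks vs 12 minutes, both expressed in minutes.
time_gain = ratio(6 * 7 * MINUTES_PER_DAY, 12)

# Failure rate: 30% vs 0.1%, expressed as failures per 1,000 deployments.
failure_gain = ratio(300, 1)

# Mean time to recovery: 4 hours vs 5 minutes, both expressed in minutes.
mttr_gain = ratio(4 * 60, 5)

for label, value in [
    ("Deployment frequency", frequency_gain),
    ("Deployment time", time_gain),
    ("Deployment failures", failure_gain),
    ("Mean time to recovery", mttr_gain),
]:
    print(f"{label}: {value:,.0f}x")
```

The deployment-time ratio works out to 5,040, so treat the improvement figures as rounded, order-of-magnitude numbers rather than exact measurements.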
What is DevOps? (Simple Explanation)
DevOps = Development + Operations working together using automation.
Think of it like a factory assembly line. Compared with the manual process before DevOps:
- Automation replaces manual steps
- Faster, more reliable, cheaper
- Developers and Operations use the same tools/processes
What is CI/CD? (Breaking It Down)
CI/CD = Continuous Integration + Continuous Deployment
1. Continuous Integration (CI) = “Merge code often, test automatically”
The Old Way (Integration Hell):
The Cost of NOT Using DevOps
Real-World Disasters:
Example 1: Healthcare.gov Launch (2013)
- Problem: Manual deployment, no automation, no testing
- Result: Website crashed on day 1, couldn’t handle 250 users
- Cost: $1.7 billion to fix
- Prevention cost: ~$10 million (proper DevOps/CI/CD)
- ROI of DevOps: 170x
Example 2:
- Problem: Manual data center migration, poor testing
- Result: 75,000 passengers stranded, 726 flights canceled
- Cost: $170 million
- Prevention cost: ~$1 million (automated testing + deployment)
- ROI of DevOps: 170x
Example 3:
- Problem: Deployment bug in new feature
- With DevOps: Detected in 2 minutes, rolled back automatically
- Cost: $4 million in lost sales
- Without DevOps (if it took 1 hour): $120 million in lost sales
Cost of downtime per hour:
- Small business: $8,500/hour
- Medium business: $74,000/hour
- Enterprise: $1,000,000+/hour
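The figures in the deployment-bug example follow from one multiplication: loss per minute times minutes of outage. A minimal sketch of that arithmetic, using only the numbers quoted above (the function name `outage_cost` is made up for illustration):

```python
def outage_cost(loss_per_minute: float, minutes: float) -> float:
    """Lost sales for an outage of the given length."""
    return loss_per_minute * minutes


# The deployment-bug example: $4 million lost during a 2-minute outage.
loss_per_minute = 4_000_000 / 2

fast_rollback = outage_cost(loss_per_minute, 2)    # automated detection and rollback
slow_rollback = outage_cost(loss_per_minute, 60)   # a one-hour manual recovery

# The Healthcare.gov example: cost to fix divided by estimated prevention cost.
roi = 1_700_000_000 / 10_000_000

print(f"2-minute outage: ${fast_rollback:,.0f}")
print(f"1-hour outage:   ${slow_rollback:,.0f}")
print(f"ROI:             {roi:.0f}x")
```

The point of the calculation is that recovery time, not the bug itself, drives most of the cost.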
How CI/CD Actually Works (Behind the Scenes)
Real-World Example: Deploying a Node.js web app
Step-by-Step Pipeline:
What is DevOps?
DevOps is a combination of “Development” and “Operations.” It’s a culture and set of practices that brings together software development (Dev) and IT operations (Ops) to shorten the development lifecycle and deliver high-quality software faster.
What is CI/CD?
- CI (Continuous Integration): Automatically build and test code every time someone commits changes. Catches bugs early.
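- CD (Continuous Delivery/Deployment): Automatically prepares every build that passes its tests for release. Continuous Delivery keeps a manual approval before production; Continuous Deployment releases to production automatically. Both reduce manual errors.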
- Before DevOps: Developers write code → Hand it to operations → Operations manually deploy → Takes days/weeks → Errors happen
- With DevOps: Developers commit code → Pipeline automatically builds, tests, deploys → Takes minutes → Fewer errors
- Old Way: Like building a house. You build it, then call inspectors, then fix issues, then move in (weeks/months).
- DevOps Way: Like a factory assembly line. Each step is automated, tested, and quality-checked automatically (hours/minutes).
1. Understanding CI/CD Pipelines
What is a Pipeline?
A pipeline is a series of automated steps that take your code from source control to production. Think of it as a recipe: each step (ingredient) must complete successfully before moving to the next.
Pipeline Stages:
- Speed: Deploy in minutes instead of hours/days
- Consistency: Same process every time (no human error)
- Quality: Tests run automatically (catch bugs before production)
- Traceability: See exactly what was deployed and when
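The "each step must complete successfully before moving to the next" rule is easy to see in code. This is a toy sketch, not how any particular CI system is implemented: the stage names and the `run_pipeline` helper are hypothetical, and each stage is just a function that returns True or False.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns a log with one entry per stage that actually ran.
    """
    log = []
    for name, step in stages:
        succeeded = step()
        log.append(f"{name}: {'passed' if succeeded else 'failed'}")
        if not succeeded:
            break  # later stages (such as deploy) never run
    return log


# Hypothetical run in which the tests fail.
stages = [
    ("build", lambda: True),
    ("test", lambda: False),
    ("deploy", lambda: True),
]
result = run_pipeline(stages)
print(result)
```

Because the loop stops at the failed test stage, the deploy stage is never reached, which is the consistency and quality benefit listed above.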
Azure DevOps vs GitHub Actions
Which Should You Choose?
| Feature | Azure DevOps | GitHub Actions |
|---|---|---|
| Source Control | Azure Repos | GitHub |
| CI/CD | Azure Pipelines | GitHub Actions |
| Free Tier | 1,800 min/month | 2,000 min/month |
| Self-Hosted | Yes | Yes |
| Marketplace | Extensions | Actions Marketplace |
| Best For | Enterprise, Microsoft stack | Open source, GitHub-first |
[!WARNING] Gotcha: Secrets in Logs Never print environment variables or secrets to the console for debugging. Once a secret is in the build logs, it is compromised forever. Use Azure Key Vault to inject secrets at runtime without exposing them.
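CI systems such as GitHub Actions redact registered secrets from log output as a safety net. The sketch below shows the general idea in plain Python; it is an illustration under simple assumptions (exact string matching, a hypothetical `mask_secrets` helper and a made-up password), not the masking logic of any real platform. Masking does not make printing secrets safe, since encoded or split values slip through.

```python
from typing import Iterable

MASK = "***"


def mask_secrets(line: str, secrets: Iterable[str]) -> str:
    """Replace every registered secret value in a log line with a fixed mask."""
    # Longest first, so a secret that contains another one is masked whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings: replacing "" would corrupt the line
            line = line.replace(secret, MASK)
    return line


# Made-up secret value for the example.
safe = mask_secrets("Connecting with password=hunter2-prod", ["hunter2-prod"])
print(safe)
```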
[!TIP] Jargon Alert: Idempotency A fancy word for “safe to run twice.” Good Infrastructure as Code (IaC) is idempotent: if you deploy the same Bicep file 100 times, nothing changes after the first time. Bad scripts create 100 duplicate resources.
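To make "safe to run twice" concrete, here is a small Python comparison of a script-style "create" call with a declarative "ensure" call. It only models the idea with in-memory data; the resource name, the `sku` field, and both helper functions are hypothetical and have nothing to do with how Bicep is implemented.

```python
from typing import Dict, List


def create_resource(resources: List[str], name: str) -> None:
    """Non-idempotent: every call adds another copy."""
    resources.append(name)


def ensure_resource(resources: Dict[str, dict], name: str, spec: dict) -> bool:
    """Idempotent: converge on the desired state.

    Returns True only when something actually changed.
    """
    if resources.get(name) == spec:
        return False
    resources[name] = dict(spec)
    return True


scripted: List[str] = []
declared: Dict[str, dict] = {}
changes = 0

# "Deploy" the same definition 100 times with each approach.
for _ in range(100):
    create_resource(scripted, "storage-account")
    changes += ensure_resource(declared, "storage-account", {"sku": "Standard_LRS"})

print(len(scripted), len(declared), changes)
```

The script-style version ends up with 100 entries, while the declarative version holds one resource and reports a single change.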
2. GitHub Actions: Complete Guide
What is GitHub Actions?
GitHub Actions is a CI/CD platform built into GitHub. You define workflows (pipelines) using YAML files in your repository. When you push code, GitHub automatically runs your workflows.
Why GitHub Actions?
- ✅ Free for public repositories
- ✅ 2,000 free minutes/month for private repos
- ✅ Integrated with GitHub (no separate tool)
- ✅ Huge marketplace of pre-built actions
- ✅ Easy to get started
[!WARNING] Gotcha: GitHub Actions Minutes Can Run Out Free tier gives 2,000 minutes/month. If your workflows run frequently or take long, you’ll hit the limit. Self-hosted runners are free (unlimited), but you manage the infrastructure. Monitor your usage in Settings → Billing.
[!TIP]
Jargon Alert: Workflow vs Action
Workflow: The entire pipeline (the YAML file). Defines when to run and what jobs to execute.
Action: A reusable step (like actions/checkout@v3). Think of it as a pre-built function you can call. Actions are published to the GitHub Marketplace.
[!INFO] Aside: GitHub Actions Pricing
- Public repos: Unlimited free minutes
- Private repos: 2,000 free minutes/month, then $0.008/minute
- Self-hosted runners: Free (unlimited), but you pay for the VM/infrastructure
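A rough way to estimate a private-repo bill from the numbers above: subtract the free minutes, then multiply what is left by the per-minute rate. This sketch uses only the figures quoted in this section and ignores anything else that may affect real billing (for example, different runner types); the workload in the example is invented.

```python
FREE_MINUTES = 2000        # free minutes per month quoted above
PRICE_PER_MINUTE = 0.008   # dollars per minute after the free tier, quoted above


def monthly_cost(minutes_used: int) -> float:
    """Estimated monthly charge in dollars for a private repository."""
    billable = max(0, minutes_used - FREE_MINUTES)
    return round(billable * PRICE_PER_MINUTE, 2)


# Invented workload: a 12-minute workflow that runs 20 times a day for 30 days.
minutes = 12 * 20 * 30
print(minutes, monthly_cost(minutes))
```

Under these assumptions the workload uses 7,200 minutes, of which 5,200 are billable.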
Step-by-Step: Creating Your First GitHub Actions Workflow
Let’s create a complete CI/CD pipeline from scratch:
Step 1: Create Workflow File
Where? Create a .github/workflows/ directory in your repository root.
Step 2: Basic Workflow Structure
- name
- on
- jobs
- steps
What: Name of your workflow (shows in GitHub Actions tab)
Example:
name: CI/CD Pipeline
Why: Helps identify workflows when you have multiple
Step 3: Complete CI/CD Workflow Example
Here’s a complete workflow that builds, tests, and deploys a Node.js app to Azure App Service:
- Triggers: Runs on push to main or manual trigger
- Build Job:
  - Checks out code
  - Installs dependencies (with caching)
  - Runs linter
  - Runs tests
  - Builds application
  - Uploads build artifacts
  - Scans for security vulnerabilities
- Deploy Job (only if build succeeds):
  - Downloads build artifacts
  - Logs into Azure
  - Deploys to App Service
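The "only if build succeeds" rule is a dependency between jobs. The Python sketch below models that rule with the standard library's `graphlib`; it is a conceptual illustration, not GitHub's scheduler. The job names mirror the list above, and the `run_jobs` helper and its status strings are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set


def run_jobs(jobs: Dict[str, Callable[[], bool]],
             needs: Dict[str, Set[str]]) -> Dict[str, str]:
    """Run jobs in dependency order; skip any job whose prerequisite did not succeed."""
    status: Dict[str, str] = {}
    # static_order() yields each job only after all of its prerequisites.
    for name in TopologicalSorter(needs).static_order():
        if any(status[dep] != "success" for dep in needs.get(name, ())):
            status[name] = "skipped"
        else:
            status[name] = "success" if jobs[name]() else "failure"
    return status


# Deploy depends on build. Here the build fails, so deploy is skipped.
needs = {"build": set(), "deploy": {"build"}}
status = run_jobs({"build": lambda: False, "deploy": lambda: True}, needs)
print(status)
```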
Step 4: Configure Azure Credentials
What are Secrets?
Sensitive data (passwords, API keys) stored securely in GitHub. Never commit secrets to code!
How to Set Up Azure Credentials:
1. Create Service Principal in Azure:
2. Add Secret to GitHub:
  - Go to your GitHub repository
  - Settings → Secrets and variables → Actions
  - Click “New repository secret”
  - Name: AZURE_CREDENTIALS
  - Value: Paste the JSON from step 1
  - Click “Add secret”
Step 5: Understanding Workflow Features
- Matrix Strategy (Multiple Versions)
- Conditional Steps
- Environment Variables
- Secrets
What: Run the same job with different configurations (e.g., test on Node 16, 18, 20)
Example:
Result: Runs 6 jobs (3 Node versions × 2 OSes) in parallel
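The "3 × 2 = 6 jobs" result is a Cartesian product of the matrix values. A short Python sketch of that expansion follows. The Node versions come from the text; the two operating system labels are an assumption, because the text only says "2 OSes". The `expand_matrix` helper is hypothetical.

```python
from itertools import product
from typing import Dict, List


def expand_matrix(matrix: Dict[str, list]) -> List[dict]:
    """Return one job configuration per combination of matrix values."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


matrix = {
    "node": [16, 18, 20],
    "os": ["ubuntu-latest", "windows-latest"],  # assumed OS choices
}
jobs = expand_matrix(matrix)

for job in jobs:
    print(job)
print(len(jobs), "jobs")
```

Adding a third dimension multiplies the job count again, which is worth remembering when minutes are billed.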
Step 6: Advanced: Multi-Environment Deployment
Deploy to Dev → Staging → Production:
3. Azure Pipelines (Alternative to GitHub Actions)
When to Use Azure Pipelines:
- Your organization already uses Azure DevOps
- Need advanced enterprise features (test plans, work items)
- Want integrated project management
YAML Pipeline Example
3. Infrastructure as Code
Bicep Example
Deploy with Azure CLI
4. GitOps
Use Git as the single source of truth for infrastructure and applications.
5. Deployment Strategies
Choosing the right deployment strategy can be the difference between a seamless release and a production outage.
Strategy Comparison
| Strategy | Risk | Rollback Speed | Cost | Complexity | Best For |
|---|---|---|---|---|---|
| Blue-Green | Low | Instant | High (2x infra) | Low | Critical systems, databases |
| Canary | Very Low | Fast (minutes) | Medium | High | User-facing apps, A/B testing |
| Rolling | Medium | Slow (gradual) | Low | Medium | Stateless apps, microservices |
| Feature Flags | Minimal | Instant | Low | Medium | SaaS, gradual rollouts |
Blue-Green Deployment
Deploy to an identical “green” environment while “blue” runs production. Swap traffic instantly.
Azure Implementation
Using Azure App Service Deployment Slots:
- Auto-Swap: Automatically swap after successful deployment.
- Swap with Preview: Test in production config before committing.
- Instant Rollback: If issues detected, swap back in <5 seconds.
[!WARNING] Gotcha: Database Migrations Blue-green works great for stateless apps, but database schema changes are tricky. Both blue and green must support the current schema. Use backward-compatible migrations (add columns, don’t drop).
Real-World Example: A bank deploys during business hours using blue-green. New code goes to green, runs smoke tests, then swaps. If the fraud detection service fails, they swap back in 3 seconds, with no customer impact.
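The deploy, smoke-test, swap, and swap-back flow can be modeled in a few lines. This is a conceptual Python sketch, not Azure's slot-swap implementation: the `BlueGreenRouter` class, the `release` helper, and the version strings are all hypothetical.

```python
from typing import Callable, Dict, Optional


class BlueGreenRouter:
    """Two slots; exactly one of them receives live traffic."""

    def __init__(self, live_version: str) -> None:
        self.slots: Dict[str, Optional[str]] = {"blue": live_version, "green": None}
        self.live = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version: str) -> None:
        self.slots[self.idle] = version

    def swap(self) -> None:
        """Point live traffic at the other slot. Calling it again rolls back."""
        self.live = self.idle

    def serving(self) -> Optional[str]:
        return self.slots[self.live]


def release(router: BlueGreenRouter, version: str,
            smoke_test: Callable[[str], bool]) -> bool:
    """Deploy to the idle slot and swap only if the smoke test passes."""
    router.deploy_to_idle(version)
    if not smoke_test(version):
        return False  # live traffic was never touched
    router.swap()
    return True


router = BlueGreenRouter("v1")
released = release(router, "v2", smoke_test=lambda version: True)
after_release = router.serving()

router.swap()  # rollback: the old version is still sitting in the other slot
after_rollback = router.serving()
print(released, after_release, after_rollback)
```

Rollback is fast because nothing is redeployed; the previous version stays warm in the other slot.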
Canary Deployment
Route a small percentage of traffic (5%) to the new version. If metrics look good, gradually increase to 100%.
AKS with Flagger
- Deploy new version (myapp-v2).
- Flagger routes 10% of traffic to v2.
- Wait 1 minute, check metrics (success rate >99%, latency <500ms).
- If healthy, increase to 20%, then 30%, etc.
- If metrics fail, automatic rollback to v1.
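The steps above amount to a loop: shift some traffic, check metrics, then either continue or roll back. Here is a plain-Python sketch of that loop. It does not call Flagger or Kubernetes; the `canary_rollout` function and its `check` callback are hypothetical, the thresholds come from the steps above, and the one-minute wait between steps is left out.

```python
from typing import Callable, List, Tuple

# check(weight) returns (success_rate_percent, latency_ms) observed at that weight.
MetricCheck = Callable[[int], Tuple[float, float]]


def canary_rollout(check: MetricCheck, step: int = 10, max_weight: int = 100,
                   min_success: float = 99.0,
                   max_latency_ms: float = 500.0) -> Tuple[int, List[Tuple[int, bool]]]:
    """Raise the canary's traffic share step by step.

    Returns the final share for the new version and a history of (weight, healthy).
    """
    history: List[Tuple[int, bool]] = []
    weight = step
    while weight <= max_weight:
        success_rate, latency_ms = check(weight)
        healthy = success_rate > min_success and latency_ms < max_latency_ms
        history.append((weight, healthy))
        if not healthy:
            return 0, history  # roll back: all traffic returns to v1
        weight += step
    return max_weight, history


# Hypothetical metrics: latency degrades once v2 takes 30% of the traffic.
def degrading(weight: int) -> Tuple[float, float]:
    return (99.9, 120.0) if weight < 30 else (99.9, 900.0)


final_weight, history = canary_rollout(degrading)
print(final_weight, history)
```

In this run the rollout stops at 30% and returns all traffic to v1, so at most 30% of requests ever reached the slow version.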
[!TIP] Best Practice: Canary Metrics Don’t just monitor HTTP 500s. Track business metrics like “checkout completion rate” or “login success rate.” A technically healthy service might still break user workflows.
Rolling Deployment
Update instances one at a time. If any instance fails, stop the rollout.
AKS Rolling Update
- Kubernetes creates 2 new pods (v2) while 10 old pods (v1) still run.
- Waits for new pods to pass readiness probe.
- Terminates 1 old pod, creates 1 new pod.
- Repeats until all 10 pods are running v2.
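The core rule of a rolling deployment (replace one instance at a time and stop on the first failure) can be sketched as below. This is a simplified model, not Kubernetes' actual controller: it leaves out the extra surge pods described in the steps above, and the `rolling_update` helper and readiness callback are hypothetical.

```python
from typing import Callable, List, Tuple


def rolling_update(pods: List[str], new_version: str,
                   is_ready: Callable[[int], bool]) -> Tuple[List[str], bool]:
    """Replace pods one at a time; halt as soon as a replacement is not ready.

    Returns the resulting pod versions and whether the rollout completed.
    """
    updated = list(pods)  # work on a copy; the input list is not modified
    for index in range(len(updated)):
        if not is_ready(index):
            return updated, False  # remaining pods stay on the old version
        updated[index] = new_version
    return updated, True


pods = ["v1"] * 10

# Hypothetical readiness probe that fails for the fourth replacement pod.
result, completed = rolling_update(pods, "v2", is_ready=lambda index: index != 3)
print(result, completed)
```

A halted rollout leaves a mix of versions running, which is why the comparison table rates rolling rollback as slow.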
[!WARNING] Gotcha: PodDisruptionBudget Without a PDB, Kubernetes might terminate too many pods during a node upgrade, causing an outage. Always set a PodDisruptionBudget for production workloads.
Feature Flags (Feature Toggles)
Deploy code with new features disabled. Enable features gradually via configuration.
Azure App Configuration + Feature Flags
- Enable for internal employees (100%).
- Enable for beta users (100%).
- Enable for 10% of general users.
- Monitor metrics for 24 hours.
- Increase to 50%, then 100%.
- Remove feature flag from code after 2 weeks.
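A percentage rollout needs each user to get a stable answer, and users who were enabled at 10% should stay enabled at 50%. One common way to get both properties is to hash the user ID into a bucket from 0 to 99. The sketch below shows that idea in plain Python. It is not the Azure App Configuration SDK, and the flag name, group names, and helper functions are hypothetical.

```python
import hashlib
from typing import FrozenSet


def bucket(user_id: str, flag: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for this flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, flag: str, percentage: int,
               user_groups: FrozenSet[str] = frozenset(),
               always_on_groups: FrozenSet[str] = frozenset()) -> bool:
    """Enable for targeted groups first, then for a percentage of everyone else."""
    if user_groups & always_on_groups:
        return True
    return bucket(user_id, flag) < percentage


users = [f"user-{n}" for n in range(1000)]
at_10 = {u for u in users if is_enabled(u, "new-checkout", 10)}
at_50 = {u for u in users if is_enabled(u, "new-checkout", 50)}

# An internal employee sees the feature even while the general rollout is at 0%.
employee_sees_it = is_enabled(
    "alice", "new-checkout", 0,
    user_groups=frozenset({"employees"}),
    always_on_groups=frozenset({"employees", "beta"}),
)
print(len(at_10), len(at_50), employee_sees_it)
```

Including the flag name in the hash means different flags select different slices of users, so the same people are not always the first to receive every new feature.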
[!TIP]
Best Practice: Feature Flag Lifecycle
Feature flags are temporary. Never let them live forever. Set a TTL (time-to-live) and remove the flag once the feature is stable. Otherwise, your code becomes a graveyard of if (featureFlag) checks.
Comparison: Real-World Scenario
Scenario: Deploying a payment processing service update.
| Concern | Blue-Green | Canary | Rolling | Feature Flags |
|---|---|---|---|---|
| Zero Downtime | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Instant Rollback | ✅ <5 sec | ⚠️ 5-10 min | ❌ 10-20 min | ✅ <1 sec |
| Cost | ❌ 2x infra | ⚠️ 1.2x infra | ✅ 1x infra | ✅ 1x infra |
| Database Migrations | ⚠️ Need backward compat | ✅ Can test schema | ❌ Risky | ✅ Isolate changes |
| A/B Testing | ❌ All or nothing | ✅ Gradual rollout | ❌ All or nothing | ✅ Per-user targeting |
| Recommendation | ❌ Too expensive | ✅ Best choice | ⚠️ Use with caution | ✅ Combine with Canary |
Decision Flowchart
6. Interview Questions
Beginner Level
Q1: What is CI/CD?
Answer:
- CI (Continuous Integration): Automating the build and testing of code every time a team member commits changes to version control.
- CD (Continuous Deployment/Delivery): Automating the release of validated code to a repository or production environment.
Q2: What is the difference between Azure DevOps and GitHub Actions?
Answer:
- Azure DevOps: Complete suite (Boards, Repos, Pipelines, Test Plans, Artifacts). Great for enterprise management and tracking.
- GitHub Actions: Workflow automation engine built into GitHub. Closer to the code, massive open-source community, simpler for CI.
Intermediate Level
Q3: Explain the concept of Infrastructure as Code (IaC)
Answer:
Managing and provisioning infrastructure through code (Bicep/Terraform) rather than manual processes.
Benefits:
- Consistency: Same environment every time.
- Version Control: Track history of changes.
- Speed: Deploy entire environments in minutes.
- Disaster Recovery: Re-create environments from scratch easily.
Q4: What is a Self-Hosted Agent?
Answer:
A machine that you set up and manage to run pipeline jobs.
Use Cases:
- Build needs access to private resources (VNet).
- Specialized software/hardware requirements.
- Caching large dependencies (faster builds).
Advanced Level
Q5: How do you implement a secure supply chain?
Answer:
- Dependency Scanning: Check NuGet/NPM packages for vulnerabilities (GitHub Dependabot).
- Secret Scanning: Detect committed credentials.
- Container Scanning: Scan Docker images for CVEs (Trivy/Defender).
- Code Signing: Sign build artifacts to ensure integrity.
- Least Privilege: Pipeline service connections should have minimal permissions.
7. Key Takeaways
Automate Everything
If you do it twice, automate it. Manual deployments are forbidden in production.
Infrastructure as Code
Treat infrastructure like software. Use Bicep or Terraform for reproducible environments.
Shift Left
Test security and quality early in the pipeline, not after deployment.
GitOps
Git is the single source of truth. Deployment reflects the state of the main branch.
Ephemerality
Build agents and environments should be disposable. Don’t rely on snowflake servers.
Next Steps
Continue to Chapter 12
Master Azure cost optimization and FinOps strategies