Identity & Access Management
Identity is the new security perimeter. In the cloud, your identity system is more important than your firewall. Master Azure Active Directory (Azure AD) and you master Azure security.
What You’ll Learn
By the end of this chapter, you’ll understand:
- What identity and access management actually means (from absolute basics)
- Why identity is the “new perimeter” in cloud security
- How authentication and authorization work (with real examples)
- Azure Active Directory fundamentals and architecture
- Role-Based Access Control (RBAC) from scratch
- Multi-Factor Authentication (MFA) and conditional access
- When and how to use different authentication methods
Introduction: What is Identity & Access Management? (Start Here if You’re New)
The Simple Explanation
Identity & Access Management (IAM) = Controlling WHO can do WHAT
Think of it like your house:
- Identity = Keys (prove who you are)
- Access Management = Locks on different rooms (control what you can access)
Real-World Analogy: Office Building Security
Before entering an office building:
- Your username/password = Employee badge (proves identity)
- Azure AD = Security system (manages identities)
- RBAC = Floor/room access rules (manages permissions)
Why Does Identity Matter in the Cloud?
Old Security Model (2005): Castle and Moat
Authentication vs Authorization (The Most Confused Concepts)
People mix these up constantly. Here’s the clear difference:
Authentication = Proving WHO you are
The Problem IAM Solves
Scenario: Company with 100 employees using Azure
Without Proper IAM (Chaos):
Cost of Poor IAM: Real Breach Example
Capital One Data Breach (2019):
The Four Key Questions of IAM
Every IAM system answers these questions:
1. Who are you? (Authentication)
Identity in Azure: The Big Picture
Common Mistakes Beginners Make
❌ Mistake 1: Sharing accounts
Decision Tree: Choosing Authentication Methods
1. Azure Active Directory Deep Dive
Azure Active Directory (Azure AD) is Microsoft’s cloud-based identity and access management service. It’s NOT the same as on-premises Active Directory.
Azure AD vs Active Directory (AD DS)
| Feature | Azure AD | Active Directory (AD DS) |
|---|---|---|
| Protocol | SAML, OAuth 2.0, OpenID Connect | Kerberos, LDAP |
| Structure | Flat (no OUs or GPOs) | Hierarchical (OUs, GPOs) |
| Authentication | Cloud-native, MFA-enabled | On-premises, password-based |
| Management | REST API, PowerShell, Portal | ADUC, Group Policy |
| Use Case | Cloud applications, SaaS | On-premises domain services |
| Federation | Built-in SSO | Requires AD FS |
[!TIP] Jargon Alert: Service Principal Think of a Service Principal as a “user account for an application.” Just like you have a username/password, an app needs an identity to log in. A Managed Identity is just a Service Principal that Azure creates and rotates the password for automatically. (Always use Managed Identities when possible!)
Azure AD Architecture
Azure AD Editions
- Free
- User and group management
- Basic security reports
- Single Sign-On (SSO) to Azure, Office 365
- Self-service password change
- Device registration
2. Authentication Protocols
Understanding how Azure AD authenticates users is crucial for troubleshooting and designing secure applications.
OAuth 2.0 & OpenID Connect: The Pro’s Perspective
While basic diagrams show the flow, a cloud engineer needs to understand what is inside the envelope.
The Token Trinity
Azure AD issues three distinct types of tokens. Using the wrong one is a common security failure.
| Token | Purpose | Handled By | Typical Lifetime |
|---|---|---|---|
| ID Token | Proves identity (OIDC) | Your Application | 1 hour |
| Access Token | Authorizes access | The Resource (API) | 1 hour |
| Refresh Token | Gets new tokens | Your Application | 90 days (rolling) |
[!WARNING] Security Gotcha: ID Token vs. Access Token Never use an ID Token to authorize API calls. ID tokens are meant for your app to know “who logged in.” Access tokens are meant for the API to know “what this user can do.” If you send an ID token to a backend, you’re bypassing authorization checks.
Anatomy of a JWT Token
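Both ID and Access tokens are JWTs (JSON Web Tokens). They are Base64URL-encoded and signed, not encrypted. Anyone who has the token can read its claims (for example, by pasting it into jwt.ms).
Key Claims inside an Azure AD Token:
- aud (Audience): The intended recipient of the token. If this doesn’t match your app/API ID, reject it.
- scp (Scope): The delegated permissions granted (e.g., User.Read).
- roles: The app roles assigned to the user or application. These are defined on the app registration and are not Azure RBAC role assignments.
- exp (Expiry): When the token expires.
- nonce: Ties an ID token to the original sign-in request, which prevents replay attacks (essential for security).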
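To make the "encoded, not encrypted" point concrete, here is a minimal sketch that reads the claims out of a JWT with nothing but the standard library and then applies the aud and exp checks described above. The sample token, the audience value api://example-api, and the helper names are hypothetical, and the sketch deliberately skips signature verification, which a real API must perform with a maintained library before trusting any claim.

```python
import base64
import json
import time


def b64url_decode(segment: str) -> bytes:
    """Decode a Base64URL segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def read_claims(token: str) -> dict:
    """Return the payload claims WITHOUT verifying the signature (inspection only)."""
    _header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(payload_b64))


def basic_claim_checks(claims: dict, expected_audience: str, now: float) -> list[str]:
    """Collect reasons to reject a token based on aud and exp alone."""
    problems = []
    if claims.get("aud") != expected_audience:
        problems.append("audience mismatch")
    if claims.get("exp", 0) <= now:
        problems.append("token expired")
    return problems


def _encode(obj: dict) -> str:
    """Base64URL-encode a JSON object the way a JWT segment is encoded."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical, unsigned sample token built locally for the demonstration.
sample_claims = {"aud": "api://example-api", "scp": "User.Read", "exp": 2_000_000_000}
sample_token = ".".join(
    [_encode({"alg": "none", "typ": "JWT"}), _encode(sample_claims), "sig"]
)

claims = read_claims(sample_token)
print(claims["scp"])
print(basic_claim_checks(claims, "api://example-api", now=time.time()))
```

Because no key is needed to read the payload, nothing sensitive should ever be placed in a token claim.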
OpenID Connect (OIDC) flow in Entra ID
[!TIP] Pro Tip: Managed Identities If your application runs inside Azure (e.g., on a VM or App Service), use Managed Identities. You don’t have to manage client_id or client_secret variables. Azure handles the token exchange securely in the background, eliminating the risk of leaking secrets in your code.
SAML 2.0
SAML is the older protocol, still widely used for enterprise SSO.
When to use SAML instead of OIDC/OAuth:
- Enterprise SSO to legacy apps
- Applications that don’t support OAuth/OIDC
- Compliance requirements (some regulations mandate SAML)
3. Multi-Factor Authentication (MFA)
MFA is non-negotiable. Passwords alone are compromised in 80% of breaches.
MFA Methods in Azure AD
Microsoft Authenticator
- Passwordless authentication supported
- Number matching (prevents MFA fatigue)
- Risk-based authentication
FIDO2 Security Keys
- Phishing-resistant
- Passwordless
- Hardware-based cryptography
SMS / Voice Call
- Vulnerable to SIM swapping
- SMS interception
- Use only as backup
OATH Hardware Tokens
- YubiKey, RSA tokens
- Time-based codes
- No network required
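The "time-based codes" that OATH tokens and authenticator apps display follow the TOTP algorithm from RFC 6238, which is why no network is required: the device and the server derive the same code from a shared secret and the clock. The sketch below is a compact standard-library version for illustration; the function names are my own, and the secret is the public RFC test value, not something to reuse.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low nibble picks the offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    return hotp(secret, unix_time // step, digits)


# Shared secret from the RFC 6238 test vectors; never hard-code real secrets.
rfc_secret = b"12345678901234567890"
print(totp(rfc_secret, unix_time=59, digits=8))  # RFC 6238 lists 94287082 for this input
```

A verifier typically accepts the codes for the adjacent time steps as well, to tolerate small clock drift.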
Conditional Access + MFA
Don’t require MFA for everyone, everywhere. Use Conditional Access for smart MFA.
4. Role-Based Access Control (RBAC)
RBAC is how you control who can do what in Azure.
The RBAC Formula
Role assignment = security principal (who) + role definition (what) + scope (where).
[!WARNING] Gotcha: RBAC Propagation Delay After assigning an RBAC role, it can take up to 30 minutes for permissions to propagate across all Azure regions. If a user says “I still can’t access it,” they might just need to wait a few minutes and refresh their token (re-login).
Built-in Roles Deep Dive
Owner
- All actions: *
- Can grant access to others
- Can delete resources
Contributor
- All resource operations
- CANNOT assign roles
- CANNOT modify IAM
Reader
- Read all resources
- CANNOT modify anything
- CANNOT view secrets/keys
Specialized Roles
- Virtual Machine Contributor
- Virtual Machine Administrator Login
- Disk Snapshot Contributor
- Storage Blob Data Contributor
- Storage Blob Data Reader
- Storage Queue Data Contributor
- Network Contributor
- DNS Zone Contributor
- Key Vault Administrator
- Security Admin
- Security Reader
Scope Hierarchy & Inheritance
- Permissions accumulate downward
- Child inherits parent permissions
- Cannot remove inherited permissions (only add)
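- You cannot author an explicit Deny in a role assignment (unlike AWS IAM policies); deny assignments exist, but only Azure itself creates them (for example, for managed applications)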
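The inheritance rules above can be sketched as a prefix check on resource IDs: an assignment applies at its own scope and at everything nested beneath it, and the effective permissions are the union of all matching assignments. This is a simplified model, not Azure's evaluation engine; the class and function names are hypothetical, and it only covers the subscription, resource group, and resource levels that appear in a resource ID path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    principal: str
    role: str
    scope: str  # e.g. "/subscriptions/sub-1/resourceGroups/rg-app"


def scope_covers(assigned_scope: str, resource_scope: str) -> bool:
    """An assignment covers its own scope and every scope nested beneath it."""
    assigned = assigned_scope.rstrip("/").lower()
    target = resource_scope.rstrip("/").lower()
    return target == assigned or target.startswith(assigned + "/")


def effective_roles(assignments, principal: str, resource_scope: str) -> set[str]:
    """Union of roles from all assignments at or above the resource (additive model)."""
    return {
        a.role
        for a in assignments
        if a.principal == principal and scope_covers(a.scope, resource_scope)
    }


assignments = [
    RoleAssignment("alice", "Reader", "/subscriptions/sub-1"),
    RoleAssignment("alice", "Contributor", "/subscriptions/sub-1/resourceGroups/rg-app"),
]
vm = "/subscriptions/sub-1/resourceGroups/rg-app/providers/Microsoft.Compute/virtualMachines/vm1"
print(sorted(effective_roles(assignments, "alice", vm)))
print(sorted(effective_roles(assignments, "alice", "/subscriptions/sub-1/resourceGroups/rg-data")))
```

Note that the check compares whole path segments, so an assignment on rg-app does not leak onto a sibling named rg-app2.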
RBAC vs. ABAC: Moving Beyond Roles
While RBAC is great for 90% of scenarios, large-scale enterprises often hit a “Role Explosion” problem: creating hundreds of custom roles for every tiny variation. This is where Attribute-Based Access Control (ABAC) comes in.
| Feature | RBAC (Role-Based) | ABAC (Attribute-Based) |
|---|---|---|
| Decision Factor | Who you are (Role) | Key-Value pairs (Attributes/Tags) |
| Logic | “Bob is a Developer” | “User has Project=X” AND “Resource has Project=X” |
| Flexibility | Static | Dynamic |
| Use Case | Most Azure environments | Large-scale micro-segmentation |
How ABAC Works in Azure (Role Assignment Conditions)
In Azure, ABAC is implemented as conditions added to a standard Role Assignment.
Example Scenario: You want to allow developers to delete blobs in Storage, but ONLY if the blob has a tag Project=Phoenix.
RBAC Way: Create a custom role for Project Phoenix? No, that’s brittle.
ABAC Way:
- Assign Storage Blob Data Contributor to the Developer group.
- Add an ABAC Condition: (Target Resource Tag 'Project' EQUAL TO 'Phoenix')
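The decision logic in the ABAC example can be sketched as a role check followed by an attribute check on the target resource. This models only the reasoning; it is not Azure's condition syntax, and the function and parameter names are hypothetical.

```python
def can_delete_blob(
    principal_roles: set[str],
    blob_tags: dict[str, str],
    required_tag: tuple[str, str] = ("Project", "Phoenix"),
) -> bool:
    """Allow the delete only if the role is held AND the blob carries the required tag."""
    has_role = "Storage Blob Data Contributor" in principal_roles
    tag_key, tag_value = required_tag
    condition_met = blob_tags.get(tag_key) == tag_value
    return has_role and condition_met


developer_roles = {"Storage Blob Data Contributor"}
print(can_delete_blob(developer_roles, {"Project": "Phoenix"}))
print(can_delete_blob(developer_roles, {"Project": "Atlas"}))
```

The role assignment stays the same for every project; only the tag on the resource changes what the developer can touch, which is what keeps the number of roles from growing.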
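[!IMPORTANT] Pro Tip: ABAC for Dev/Test ABAC is extremely powerful for “Sandboxing.” The idea is to grant a user broad access but add a condition that they can only modify resources with a tag matching their own username, which creates a “dynamic sandbox” without managing thousands of separate resource groups. Check current support before designing around this: Azure role assignment conditions apply only to specific actions (chiefly Blob and Queue storage data actions), so a tag condition on a subscription-wide Contributor assignment may not be available.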
Custom RBAC Roles
When built-in roles don’t fit, create custom roles.
Example: Database Administrator Role
- Start with a built-in role, copy and modify
- Use least privilege (only required actions)
- Test in dev before production
- Document the purpose and use cases
- Review quarterly (remove unused permissions)
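As a rough illustration of the checklist above, a custom role is described by a JSON definition listing the allowed actions and the scopes where it can be assigned. The sketch below builds such a definition and flags wildcard actions for a least-privilege review. The role name, the action list, and the all-zero subscription ID are illustrative assumptions, not a recommended production role; verify the exact actions you need against the resource provider operations list.

```python
import json

# Illustrative definition; the name, actions, and subscription ID are placeholders.
role_definition = {
    "Name": "Database Administrator (custom)",
    "IsCustom": True,
    "Description": "Manage SQL databases without role-assignment rights.",
    "Actions": [
        "Microsoft.Sql/servers/read",
        "Microsoft.Sql/servers/databases/read",
        "Microsoft.Sql/servers/databases/write",
    ],
    "NotActions": [],
    "AssignableScopes": ["/subscriptions/00000000-0000-0000-0000-000000000000"],
}


def wildcard_actions(definition: dict) -> list[str]:
    """Flag wildcard actions that deserve a least-privilege review."""
    return [a for a in definition["Actions"] if a == "*" or a.endswith("/*")]


print(json.dumps(role_definition, indent=2))
print(wildcard_actions(role_definition))
```

Running a check like this during the quarterly review makes it harder for a broad wildcard to slip into a role that was meant to be narrow.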
5. Managed Identities
Managed Identities eliminate the need for credentials in code. This is huge for security.
The Problem Managed Identities Solve
❌ Without Managed Identity (✅ a Managed Identity removes each of these):
- ❌ Secret stored somewhere (code, config, Key Vault)
- ❌ Secret can be stolen (Git commits, logs)
- ❌ Must rotate secrets (manual process)
- ❌ Secrets expire (app breaks)
- ❌ Overhead managing secrets
Types of Managed Identities
System-Assigned
- Created when resource is created
- Deleted when resource is deleted
- One-to-one relationship
- Cannot be shared
User-Assigned
- Created separately
- Can be shared across resources
- Survives resource deletion
- Reusable
Hands-On: Enable Managed Identity
Scenario: VM needs to read from Key Vault
Step 1: Enable Managed Identity on VM
How Managed Identity Works Internally
- Special endpoint: http://169.254.169.254 (the Azure Instance Metadata Service, IMDS)
- Only accessible from within Azure resources
- Provides metadata + OAuth tokens
- No credentials required: the address is non-routable and reachable only from inside the resource, and every request must carry a Metadata: true header
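A sketch of what the token request from a VM looks like may help here. It builds the HTTP request against the metadata endpoint's identity token path with the required Metadata header; the helper names are hypothetical, the api-version shown is one commonly used value, and the actual network call only succeeds from inside an Azure VM that has a managed identity enabled. In application code, an Azure SDK credential class would normally do this for you.

```python
import json
import urllib.parse
import urllib.request

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource: str, api_version: str = "2018-02-01") -> urllib.request.Request:
    """Build (but do not send) the managed identity token request for a resource."""
    query = urllib.parse.urlencode({"api-version": api_version, "resource": resource})
    # The Metadata header is mandatory; requests without it are rejected.
    return urllib.request.Request(f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"})


def fetch_access_token(resource: str, timeout: float = 2.0) -> str:
    """Send the request; works only from inside an Azure VM with a managed identity."""
    with urllib.request.urlopen(build_token_request(resource), timeout=timeout) as resp:
        return json.load(resp)["access_token"]


request = build_token_request("https://vault.azure.net")
print(request.full_url)
print(request.get_header("Metadata"))
```

The returned access token is then sent as a Bearer token to Key Vault, so no secret ever appears in code or configuration.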
6. Conditional Access
Conditional Access is Azure AD’s policy engine for enforcing security controls based on conditions.
Conditional Access Components
Conditions (the signals evaluated before Controls are applied):
- User/Group: Specific users or groups
- Cloud App: Which app is being accessed
- Device Platform: Windows, iOS, Android, macOS
- Location: IP ranges, countries
- Client App: Browser, mobile app, legacy auth
- Sign-in Risk: Low, Medium, High (from Identity Protection)
- Device State: Compliant, domain-joined, hybrid-joined
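To show how these conditions combine into a decision, here is a toy evaluator. It is a simplified model with hypothetical names and a hard-coded rule set, not how Entra ID stores or evaluates policies; the one behavior it mirrors is that a block outcome takes precedence over any grant control.

```python
from dataclasses import dataclass


@dataclass
class SignIn:
    user_groups: set[str]
    app: str
    client_app: str        # "browser", "mobile", or "legacy"
    country: str
    sign_in_risk: str      # "low", "medium", or "high"
    device_compliant: bool


def evaluate(sign_in: SignIn, allowed_countries: set[str]) -> str:
    """Return "block", "require_mfa", or "allow" for a simplified policy set."""
    # Block rules are checked first so that they always win.
    if sign_in.client_app == "legacy":
        return "block"  # legacy protocols cannot perform MFA
    if sign_in.country not in allowed_countries or sign_in.sign_in_risk == "high":
        return "block"
    if "admins" in sign_in.user_groups and not sign_in.device_compliant:
        return "block"
    # Grant controls apply only if nothing blocked the sign-in.
    if sign_in.sign_in_risk == "medium" or sign_in.app == "azure-management":
        return "require_mfa"
    return "allow"


office_login = SignIn({"staff"}, "sharepoint", "browser", "DE", "low", True)
portal_login = SignIn({"staff"}, "azure-management", "browser", "DE", "low", True)
print(evaluate(office_login, {"DE", "US"}))
print(evaluate(portal_login, {"DE", "US"}))
```

Extending the model with a report-only flag that logs the decision instead of enforcing it is a natural next step when experimenting.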
Essential Conditional Access Policies
Policy 1: Require MFA for All Users
Policy 2: Block Legacy Authentication
Policy 3: Require Compliant Device for Admins
Policy 4: Block Access from Unknown Locations
Policy 5: Require MFA for Azure Management
Conditional Access Best Practices
Start with Report-Only Mode
Create Break-Glass Accounts
Layer Policies
7. Privileged Identity Management (PIM)
PIM provides just-in-time (JIT) privileged access. Admins are elevated only when needed, for a limited time.
The Problem PIM Solves
- ❌ Without PIM
- ✅ With PIM
PIM Configuration
Step 1: Make Users Eligible (Not Active)
PIM Access Reviews
Quarterly Access Review:
8. Azure AD Connect (Hybrid Identity)
Most enterprises have on-premises Active Directory. Azure AD Connect synchronizes identities to Azure AD.
Synchronization Methods
- Password Hash Sync
- Pass-through Authentication
- Federation (AD FS)
Password Hash Sync: how it works
- Azure AD Connect runs on-premises
- Reads user objects from AD DS
- Re-hashes the stored password hash before syncing (a hash of the hash; the plaintext password is never sent)
- Syncs to Azure AD
- Users can sign in to cloud apps with AD password
Password Hash Sync: benefits
- Simple to set up
- No additional infrastructure
- Works even if on-premises AD is down
- Supports leaked credential detection
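The "hash of a hash" step can be illustrated with a salted key-derivation function applied to the existing hash. The sketch below follows my reading of Microsoft's published description (hex-expand the 16-byte hash, encode it as UTF-16, then run salted PBKDF2 with HMAC-SHA256), but the parameters, the salt length, and the function name are assumptions for illustration, and the output is not byte-compatible with what the real sync agent produces.

```python
import hashlib
import os


def rehash_for_sync(ad_hash: bytes, salt: bytes, iterations: int = 1000) -> bytes:
    """Derive the value to sync from an existing 16-byte password hash (illustrative)."""
    # Expand the hash: 32-character hex string, then UTF-16 bytes (16 bytes -> 64 bytes).
    expanded = ad_hash.hex().encode("utf-16-le")
    # Salted, iterated hashing means the synced value cannot be replayed on-premises.
    return hashlib.pbkdf2_hmac("sha256", expanded, salt, iterations)


placeholder_hash = bytes(range(16))  # stand-in for a real AD hash; assumption for the demo
per_user_salt = os.urandom(10)       # assumed per-user random salt
synced_value = rehash_for_sync(placeholder_hash, per_user_salt)
print(len(synced_value))
```

The design point is that the value stored in the cloud is derived one-way from the on-premises hash, so a leak on the cloud side does not directly expose a hash that Active Directory would accept.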
Azure AD Connect Best Practices
High Availability
- Deploy Azure AD Connect in staging mode (second server)
- Failover is manual: if the primary fails, take the second server out of staging mode
- Regular backups of config
Selective Sync
- Don’t sync all AD objects
- Filter by OU (sync only users who need cloud access)
- Reduces Azure AD clutter
Monitoring
- Azure AD Connect Health (monitors sync)
- Alerts on sync errors
- Export sync logs to Log Analytics
Password Writeback
- Allow self-service password reset (SSPR)
- Changes in Azure AD written back to AD DS
- Requires P1 license
9. Hands-On Lab: Implement Zero-Trust Identity
Let’s implement a complete zero-trust identity architecture.
Lab Objectives
- Configure Azure AD with MFA
- Implement Conditional Access policies
- Configure PIM for just-in-time access
- Set up managed identity for app authentication
Step 1: Enable Security Defaults (Quick Win)
- Security Defaults: Free, basic protection, all-or-nothing
- Conditional Access: Premium P1, granular policies, recommended for production
Step 2: Create Conditional Access Policies
Policy 1: Require MFA for Azure Portal
Step 3: Configure PIM
Step 4: Create Managed Identity Application
Step 5: Test Managed Identity
Application Code (ASP.NET Core):
Step 6: Verify Zero-Trust Architecture
Test Managed Identity
Call the app’s /secret endpoint. Verify the app can read Key Vault without credentials in code.
Security Best Practices: The Principal Engineer’s Checklist
If you are designing high-stakes infrastructure, these are the “non-negotiable” rules of the road.
1. Zero Trust: “Never Trust, Always Verify”
In a Zero Trust architecture, we assume the internal network is already compromised.
- Micro-segmentation: Don’t just rely on one big VNet. Use Subnets and NSGs to isolate layers.
- Explicit Verification: Every access request must be authenticated, authorized, and validated for risk.
- Least Privilege: Grant the absolute minimum access required. If a developer only needs to READ logs, don’t give them Contributor.
2. Conditional Access “Gotchas”
- The Catch-22: Be careful not to create a policy that blocks YOU from the portal. Always exclude at least one “Emergency Access” account (more on this below).
- Report-Only Mode: Always test new policies in “Report-Only” mode for a week before enforcing them. Check the logs to see who would have been blocked.
- Location Spoofing: Remember that VPNs and Tor can be used to bypass location-based policies. Use “Sign-in Risk” as a secondary check.
3. Identity vs. Network Security
Identity is the new perimeter, but network security is still important.
- Private Link: Use Private Link to ensure your database is never exposed to the public internet, even if someone steals an identity token.
- Bastion: Don’t open RDP/SSH ports (3389/22) to the internet. Use Azure Bastion for secure management access.
Troubleshooting: “I’m Locked Out of My Own Subscription”
It happens to the best of us. A misconfigured Conditional Access policy or a lost MFA device can lock you out. Here is the pro’s recovery plan.
Scenario: The MFA Lockout
The Problem: You lost your phone, and you are the only Global Admin.
The Solution:
- The Break-Glass Account: Every tenant should have 1-2 “Emergency Access” accounts that are excluded from all Conditional Access policies and have a long, complex password stored in a physical safe.
- Microsoft Support: If you don’t have a break-glass account, you’ll need to call Microsoft. Prepare for a long identity verification process involving your corporate documents.
Scenario: The Subscription-Level Lockout (RBAC)
The Problem: Someone accidentally removed all “Owner” assignments from the subscription.
The Solution:
- An Account Admin (the person who pays the bill) can elevate themselves to access the subscription via the Azure Enterprise Portal or by contacting support to reset the “Elevate Access” toggle in Entra ID properties.
[!CAUTION] Warning: The “Elevate Access” Toggle A Global Admin can toggle a setting in Entra ID to grant themselves “User Access Administrator” at the Root (/) scope. This allows them to see and fix every subscription in the tenant. This is a massive security risk and should only be used in true emergencies, then immediately toggled off and audited.
10. Interview Questions
Beginner Level
Q1: What's the difference between Azure AD and Active Directory?
Azure AD:
- Protocol: OAuth 2.0, OIDC, SAML
- Structure: Flat (no OUs)
- Use: Cloud apps, SaaS, Azure resources
- Management: REST API, Graph API
Active Directory (AD DS):
- Protocol: Kerberos, LDAP
- Structure: Hierarchical (OUs, GPOs)
- Use: Domain services, on-prem apps
- Management: ADUC, Group Policy
Q2: Why use managed identities instead of service principals?
Managed identities:
- ✅ No credentials to manage
- ✅ Automatic rotation by Azure
- ✅ Cannot be stolen from code
- ✅ Lifecycle tied to resource
Service principals with secrets:
- ❌ Secret stored somewhere (Key Vault, config)
- ❌ Must manually rotate (every 90 days)
- ❌ Can be leaked (Git, logs)
- ❌ Overhead to manage
Q3: What is the purpose of Conditional Access?
- Better user experience (less MFA fatigue)
- Stronger security (context-aware)
- Granular control (per app, user, location)
Intermediate Level
Q4: Design a secure identity architecture for a healthcare company
Q5: How do you handle a compromised admin account?
Advanced Level
[!NOTE] Advanced interview questions for Identity Architecture are being updated to reflect the latest Zero Trust standards. Check back soon!
11. Key Takeaways
Identity is the Perimeter
Managed Identities
Least Privilege
Conditional Access
Monitor Everything
Zero Trust
Interview Deep-Dive
Your company has 500 engineers. Some need production access for on-call, but a developer accidentally deleted a production database. Design an access control strategy.
- Layer 1 — RBAC with least privilege: No engineer gets permanent Contributor or Owner on production resource groups. Everyone gets Reader by default on production so they can view dashboards and logs for debugging. Developer teams get Contributor only on their dev/staging subscriptions.
- Layer 2 — PIM for just-in-time access: On-call engineers activate a time-limited (4-hour maximum) Contributor role on production through Azure AD PIM. This requires MFA, a justification reason, and optionally manager approval. The activation is logged and auditable. When the window expires, access is automatically revoked.
- Layer 3 — Custom roles to prevent catastrophic actions: Instead of granting full Contributor, I would create a custom RBAC role called “Production Operator” that includes read/write permissions but explicitly excludes delete actions on databases and resource groups. Even with elevated access, an engineer cannot delete the database.
- Layer 4 — Azure Resource Locks: Put a CanNotDelete lock on every production database and resource group. Even an Owner cannot delete a locked resource without first removing the lock, which creates a separate audit trail.
- Real-world example: After implementing this pattern at a fintech company, accidental production deletions dropped from 3-4 per quarter to zero over 18 months. The PIM activation logs also satisfied SOC 2 Type II auditors.
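The just-in-time pattern in Layer 2 can be sketched as a small state model: a user must be eligible, must pass MFA and give a justification, and the resulting activation carries an expiry that ends access without anyone revoking it. The names and the in-memory eligibility set are hypothetical; real PIM adds approval workflows and audit logging on top.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_ACTIVATION = timedelta(hours=4)


@dataclass(frozen=True)
class Activation:
    user: str
    role: str
    justification: str
    expires_at: datetime


def activate(user, role, eligible, mfa_passed, justification, now,
             duration=MAX_ACTIVATION) -> Activation:
    """Grant a time-boxed role only if every precondition holds."""
    if (user, role) not in eligible:
        raise PermissionError("user is not eligible for this role")
    if not mfa_passed:
        raise PermissionError("MFA is required to activate")
    if not justification.strip():
        raise ValueError("a justification is required")
    if duration > MAX_ACTIVATION:
        raise ValueError("requested duration exceeds the 4-hour maximum")
    return Activation(user, role, justification, now + duration)


def is_active(activation: Activation, now: datetime) -> bool:
    """Access ends automatically once the window has passed."""
    return now < activation.expires_at


eligible_pairs = {("dana", "Contributor")}
start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
grant = activate("dana", "Contributor", eligible_pairs, True, "on-call incident", start)
print(is_active(grant, start + timedelta(hours=3)))
print(is_active(grant, start + timedelta(hours=5)))
```

Because expiry is part of the grant itself, forgetting to clean up cannot leave standing privileged access behind.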
Explain the difference between a Service Principal and a Managed Identity. When would you use one over the other?
- Service Principal: An identity you create manually in Entra ID for applications or CI/CD pipelines. It has a client ID and either a client secret or a certificate. You are responsible for rotating the secret before it expires. If the secret leaks, an attacker can impersonate that application.
- Managed Identity: An identity that Azure creates and manages automatically for Azure resources. There is no secret to manage — Azure handles credential rotation internally using the Instance Metadata Service (IMDS). The token is injected into the resource automatically.
- When to use Managed Identity (always prefer this): Any Azure resource talking to another Azure resource. App Service accessing Key Vault, AKS pods reading from Blob Storage, Azure Functions writing to Cosmos DB. I use system-assigned for single-purpose resources and user-assigned when multiple resources need the same identity.
- When Service Principal is unavoidable: External CI/CD systems (GitHub Actions outside Azure), third-party SaaS applications, on-premises applications connecting to Azure, and multi-tenant scenarios. The mitigation for GitHub is OIDC federation, which eliminates secrets entirely.
- The security gap people miss: Service Principal secrets stored in CI/CD pipeline variables are the number one credential leak vector. I have seen GitHub Actions workflows with Azure credentials committed to public repositories.
Design a Conditional Access policy strategy for a company with 2,000 employees across 5 countries, including contractors on personal devices.
- Baseline policy (applies to everyone): Require MFA for all users on all cloud applications. No exceptions. Microsoft’s research shows MFA blocks 99.9% of account compromise attacks. Use the Authenticator push notification which takes 2 seconds.
- Location-based policies: Create named locations for each office. From trusted office IPs, allow compliant devices with 12-hour session lifetime. From untrusted locations, require MFA on every sign-in with 1-hour sessions. Block sign-ins from countries where the company has no employees.
- Device-based policies for contractors: Contractors on unmanaged devices get: MFA required, app-enforced restrictions (read-only in SharePoint, no downloads), 4-hour session limit, and Conditional Access App Control for inline session monitoring.
- Risk-based policies: Enable Identity Protection. If Entra ID detects risky sign-in (impossible travel, anonymous IP, password spray), automatically require MFA re-authentication or block for high-risk signals. Force password reset if credentials appear in a known breach database.
- The cost consideration: Conditional Access requires Entra ID P1, which at roughly $6 per user per month comes to about $12,000/month for 2,000 users. Compared to an average breach cost of $4.45M, the ROI is clear.