Runbook: Identity Provider Setup
Overview
This runbook covers configuring identity providers for Cloud Aegis authentication, including:
- Okta OIDC setup and configuration
- Microsoft Entra ID setup and configuration
- JWT validation configuration
- Mock provider for development
- Troubleshooting authentication issues
Prerequisites
- Admin access to Okta Admin Console or Azure Portal (Entra ID)
- kubectl access to the Cloud Aegis cluster
- Cloud Aegis configuration access (configmap or environment variables)
Architecture
Cloud Aegis uses config-driven identity provider selection:
OKTA_DOMAIN set --> Okta provider activated
ENTRA_TENANT_ID set --> Entra ID provider activated
Neither set --> Mock provider (development mode)
The server stores active providers in Server.identityProviders (map[string]identity.Provider). The JWT auth middleware validates tokens against the configured provider's JWKS endpoint.
Relevant code:
- cmd/server/main.go: Provider initialization
- internal/identity/provider.go: Provider interface
- internal/identity/okta.go: Okta implementation
- internal/identity/entra_id.go: Entra ID implementation
- internal/identity/mock.go: Mock provider for development
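The selection rule above can be sketched as a small shell function, which may be useful as a pre-deploy sanity check. This only illustrates the documented rule and is not the server's Go implementation. The name `active_providers` is hypothetical, and the behavior when both variables are set (both providers active) is an assumption inferred from Server.identityProviders being a map.

```shell
#!/usr/bin/env bash
# Sketch of the config-driven provider selection rule.
# active_providers is a hypothetical helper, not part of Cloud Aegis.
# Usage: active_providers "$OKTA_DOMAIN" "$ENTRA_TENANT_ID"
active_providers() {
  local okta_domain="$1" entra_tenant_id="$2"
  local found=0

  if [ -n "$okta_domain" ]; then
    echo "okta"
    found=1
  fi
  # Assumption: both providers can be active at the same time.
  if [ -n "$entra_tenant_id" ]; then
    echo "entra_id"
    found=1
  fi
  # Neither variable set: mock provider (development mode).
  if [ "$found" -eq 0 ]; then
    echo "mock"
  fi
}
```

For example, `active_providers "${OKTA_DOMAIN:-}" "${ENTRA_TENANT_ID:-}"` printing `mock` for a production environment would indicate a misconfiguration.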
Okta OIDC Setup
Step 1: Create Okta Application
- Go to Okta Admin Console > Applications > Create App Integration
- Select: OIDC - OpenID Connect
- Application type: Web Application
- Settings:
  - App name: Cloud Aegis
  - Grant type: Authorization Code
  - Sign-in redirect URIs: https://app.aegis.io/callback
  - Sign-out redirect URIs: https://app.aegis.io
  - Controlled access: Limit to specific groups
- Note the following values:
  - Client ID
  - Client Secret
  - Okta Domain (e.g., dev-12345.okta.com)
Step 2: Configure Groups and Roles
Map Okta groups to Cloud Aegis roles:
| Okta Group | Cloud Aegis Role |
|---|---|
| aegis-admin | admin |
| aegis-operator | operator |
| aegis-requester | requester |
Configure group claim in Okta:
- Security > API > Authorization Servers > default
- Claims > Add Claim:
  - Name: groups
  - Include in token type: ID Token (Always)
  - Value type: Groups
  - Filter: Starts with aegis-
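To make the group-to-role mapping concrete, here is a minimal sketch of how the groups claim could resolve to a Cloud Aegis role. The function name `role_for_groups` is hypothetical. The precedence rule (the most privileged role wins when a user belongs to several aegis- groups) is an assumption, since the runbook does not state how multiple groups are resolved.

```shell
#!/usr/bin/env bash
# role_for_groups is a hypothetical helper illustrating the mapping table.
# It prints the Cloud Aegis role for a list of Okta group names.
# Assumption: admin > operator > requester when several groups match.
# Returns 1 (and prints nothing) when no aegis-* group is present.
role_for_groups() {
  local role="" g
  for g in "$@"; do
    case "$g" in
      aegis-admin)
        role="admin"
        break
        ;;
      aegis-operator)
        role="operator"
        ;;
      aegis-requester)
        # Only take requester if nothing more privileged was seen.
        if [ -z "$role" ]; then
          role="requester"
        fi
        ;;
    esac
  done
  if [ -z "$role" ]; then
    return 1
  fi
  echo "$role"
}
```

A user whose token carries no aegis- group resolves to no role, which matches the 403 symptom described under Troubleshooting.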
Step 3: Configure Cloud Aegis
# Set environment variables
kubectl set env deployment/aegis-api -n aegis \
OKTA_DOMAIN="dev-12345.okta.com" \
OKTA_CLIENT_ID="0oa..." \
OKTA_CLIENT_SECRET="xxx"
# Or update configmap
kubectl edit configmap aegis-config -n aegis
# Add:
#   identity:
#     okta:
#       domain: "dev-12345.okta.com"
#       client_id: "0oa..."
#       client_secret: "xxx"
# Restart to pick up changes
kubectl rollout restart deployment/aegis-api -n aegis
kubectl rollout status deployment/aegis-api -n aegis
Step 4: Verify
# Check health endpoint
curl -sf https://api.aegis.io/health | jq '.components.identity_provider'
# Expected: {"okta": "ok"}
# Test token validation
curl -sf https://api.aegis.io/api/v1/findings \
-H "Authorization: Bearer $OKTA_TOKEN" | jq '.total'
Microsoft Entra ID Setup
Step 1: Register Application
- Azure Portal > Entra ID > App registrations > New registration
- Settings:
  - Name: Cloud Aegis
  - Supported account types: Single tenant (or multi-tenant for MSP)
  - Redirect URI: Web > https://app.aegis.io/callback
- Note: Application (client) ID, Directory (tenant) ID
Step 2: Configure Authentication
- Authentication > Add platform > Web
  - Redirect URIs: https://app.aegis.io/callback
  - ID tokens: Check
  - Access tokens: Check
- Certificates & secrets > New client secret
  - Note the secret value (store in Key Vault)
Step 3: Configure Token Claims
- Token configuration > Add groups claim
  - Group types: Security groups
  - Customize token properties: Group ID
- App roles > Create app role:
  - Display name: Cloud Aegis Admin
  - Value: admin
  - Allowed member types: Users/Groups
- Repeat the app role for operator and requester.
Step 4: Configure Cloud Aegis
# Set environment variables
kubectl set env deployment/aegis-api -n aegis \
ENTRA_TENANT_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" \
ENTRA_CLIENT_ID="yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy" \
ENTRA_CLIENT_SECRET="zzz"
# Restart
kubectl rollout restart deployment/aegis-api -n aegis
kubectl rollout status deployment/aegis-api -n aegis
Step 5: Verify
# Check health endpoint
curl -sf https://api.aegis.io/health | jq '.components.identity_provider'
# Expected: {"entra_id": "ok"}
Development Mode (Mock Provider)
When neither OKTA_DOMAIN nor ENTRA_TENANT_ID is set, Cloud Aegis falls back to the mock provider.
In development mode:
- AuthProvider component auto-authenticates as admin
- ProtectedRoute skips auth checks when import.meta.env.DEV is true
- The dev JWT token is stored in frontend/.env.development (gitignored)
- The JWT signing secret is sourced from 1Password (aegis-dev-jwt-secret)
# Verify mock mode
curl -sf http://localhost:8080/health | jq '.components.identity_provider'
# Expected: {"mock": "ok"}
# Use dev header override for role testing
curl -sf http://localhost:8080/api/v1/findings \
-H "X-Aegis-Role: operator"
JWT Validation Configuration
| Parameter | Description | Default |
|---|---|---|
| JWT_SIGNING_KEY | HS256 symmetric key | Required (no default) |
| JWT_ISSUER | Expected iss claim | aegis |
| JWT_AUDIENCE | Expected aud claim | aegis-api |
| JWKS_URL | JWKS endpoint for RS256 | Auto-configured from IdP |
| JWKS_CACHE_TTL | JWKS cache duration | 1 hour |
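When checking what "Auto-configured from IdP" is likely to resolve to, it helps to know where each IdP publishes its signing keys. The sketch below builds the standard JWKS URLs for Okta's default authorization server and for the Entra ID v2.0 endpoint. Whether Cloud Aegis derives exactly these URLs is an assumption, and the function name `jwks_url_for` is hypothetical.

```shell
#!/usr/bin/env bash
# jwks_url_for is a hypothetical helper; it prints the JWKS URL an IdP
# publishes by default. Cloud Aegis may derive the URL differently.
# Usage: jwks_url_for okta "$OKTA_DOMAIN"
#        jwks_url_for entra_id "$ENTRA_TENANT_ID"
jwks_url_for() {
  local provider="$1" id="$2"
  case "$provider" in
    okta)
      # Okta "default" custom authorization server.
      echo "https://${id}/oauth2/default/v1/keys"
      ;;
    entra_id)
      # Microsoft identity platform v2.0 keys endpoint for a tenant.
      echo "https://login.microsoftonline.com/${id}/discovery/v2.0/keys"
      ;;
    *)
      echo "unknown provider: ${provider}" >&2
      return 1
      ;;
  esac
}
```

For example, `curl -sf "$(jwks_url_for okta "$OKTA_DOMAIN")" | jq '.keys[].kid'` lists the key IDs the IdP currently publishes, which also confirms that the endpoint is reachable from where the command runs.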
Troubleshooting
Token Validation Fails
Symptoms: 401 Unauthorized on all API calls
Diagnosis:
kubectl logs -n aegis -l app=aegis-api --tail=100 | grep -i "jwt\|auth\|token"
Common causes:
- Expired token: Check exp claim with jwt.io
- Wrong issuer: Token iss doesn't match configured issuer
- Wrong audience: Token aud doesn't match configured audience
- JWKS unreachable: Network issue reaching IdP's JWKS endpoint
- Clock skew: Server time differs from IdP time by more than allowed leeway
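The first three causes can be checked locally without pasting the token into a website. The sketch below decodes the token payload and tests expiry with a leeway. Both helper names (`jwt_payload`, `jwt_expired`) are hypothetical, and the leeway value to pass is an assumption, since the runbook does not say what leeway Cloud Aegis allows.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a JWT locally.

# Print the decoded payload (second segment) of a JWT as JSON.
# JWT segments are base64url without padding, so translate the
# alphabet and restore padding before decoding.
jwt_payload() {
  local seg
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
}

# Succeed (exit 0) when the token is expired, allowing LEEWAY seconds.
# Usage: jwt_expired EXP NOW [LEEWAY]   (all values in epoch seconds)
jwt_expired() {
  local exp="$1" now="$2" leeway="${3:-0}"
  [ "$now" -ge $(( exp + leeway )) ]
}
```

To compare against the configured issuer and audience, run `jwt_payload "$TOKEN" | jq '{iss, aud, exp}'`. To test expiry, run `jwt_expired "$(jwt_payload "$TOKEN" | jq -r '.exp')" "$(date +%s)" 60`, where 60 is an illustrative leeway.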
Groups Claim Missing
Symptoms: Authenticated but 403 Forbidden (role not assigned)
Diagnosis:
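# Decode token and check groups claim (JWT segments are unpadded base64url)
p=$(echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+')
while [ $(( ${#p} % 4 )) -ne 0 ]; do p="$p="; done
echo "$p" | base64 -d | jq '.groups'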
Resolution:
- Verify groups claim is configured in IdP (see setup steps above)
- Verify user is assigned to the correct group/role in IdP
- Check that group names match expected patterns (aegis-admin, etc.)
JWKS Cache Stale
Symptoms: Previously issued tokens validate, but newly issued tokens fail (typically after the IdP rotates its signing keys)
Diagnosis:
kubectl logs -n aegis -l app=aegis-api | grep -i "jwks\|key rotation"
Resolution:
# Force JWKS cache refresh by restarting pods
kubectl rollout restart deployment/aegis-api -n aegis
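Before restarting, it can be worth confirming that a stale cache is the likely cause. The sketch below reads the key ID (kid) from a failing token's header and checks whether the IdP currently publishes that key. If it does and Cloud Aegis still rejects the token, the cached JWKS is probably out of date. The helper names are hypothetical. They use sed and grep rather than jq to stay dependency-free, which assumes the kid contains no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for diagnosing a stale JWKS cache.

# Print the kid from a JWT header (first segment, unpadded base64url).
jwt_kid() {
  local seg
  seg=$(printf '%s' "$1" | cut -d. -f1 | tr '_-' '/+')
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d |
    sed -n 's/.*"kid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed (exit 0) when KID appears in a JWKS JSON document.
# Usage: kid_in_jwks KID JWKS_JSON
kid_in_jwks() {
  printf '%s' "$2" | grep -q "\"kid\"[[:space:]]*:[[:space:]]*\"$1\""
}
```

For example, `kid_in_jwks "$(jwt_kid "$TOKEN")" "$(curl -sf "$JWKS_URL")" && echo "key is published by the IdP"`, where `$JWKS_URL` is the IdP's JWKS endpoint.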
Escalation
| Condition | Action |
|---|---|
| All auth failing after IdP change | Check IdP status page, verify JWKS endpoint |
| Token validation intermittent | Check JWKS cache, verify clock sync |
| Role mapping incorrect | Review group claim configuration in IdP |
| Mock provider active in production | Immediately set OKTA_DOMAIN or ENTRA_TENANT_ID |
Contact Information
- On-Call: PagerDuty
- Identity Team: #identity-platform (Slack)
- Security Team: #security-ops (Slack)