
AWS SAP-C02 Drill: Hybrid Workload Migration - The SLA-Cost-Availability Trade-off Analysis

Jeff Taakey
21+ Year Enterprise Architect | Multi-Cloud Architect & Strategist.

While preparing for the AWS SAP-C02, many candidates conflate workload criticality with instance purchasing models. In practice, the decision is about protecting SLAs while optimizing cost through capacity planning. Let’s drill into a simulated scenario.

The Scenario

GlobalAnalytics Inc. operates a business intelligence platform currently hosted in their corporate data center using 12 fully-redundant servers configured for high availability. The system executes two distinct workload types:

  • Scheduled Analytics Jobs: Hourly and daily batch processes consuming 65% of total compute capacity. These jobs have strict SLA requirements and typical execution windows ranging from 20 minutes to 2 hours.
  • Ad-Hoc User Queries: On-demand analytical requests initiated by business analysts, consuming 35% of capacity. These tasks typically complete within 5 minutes and have no SLA guarantees.

During infrastructure failures, the business mandate is clear: scheduled jobs must continue meeting SLA commitments, while user-initiated tasks may experience delays.
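
The failure-mode mandate above amounts to strict priority dispatch: when capacity shrinks, scheduled jobs take the surviving slots first and ad-hoc queries wait. A minimal sketch, assuming a hypothetical `dispatch` helper and made-up job names (neither comes from the scenario):

```python
import heapq

# Lower number = higher priority. These labels are illustrative assumptions.
PRIORITY = {"scheduled": 0, "ad_hoc": 1}

def dispatch(jobs, free_slots):
    """Return (running, waiting) for a list of (name, kind) jobs."""
    # The index keeps submission order stable among jobs of the same kind.
    heap = [(PRIORITY[kind], i, name) for i, (name, kind) in enumerate(jobs)]
    heapq.heapify(heap)
    running = [heapq.heappop(heap)[2] for _ in range(min(free_slots, len(heap)))]
    waiting = [name for _, _, name in sorted(heap)]
    return running, waiting

jobs = [("query-1", "ad_hoc"), ("hourly-etl", "scheduled"),
        ("query-2", "ad_hoc"), ("daily-rollup", "scheduled")]
running, waiting = dispatch(jobs, free_slots=2)
print(running)  # ['hourly-etl', 'daily-rollup']
print(waiting)  # ['query-1', 'query-2']
```

With only two slots left, both scheduled jobs run and both user queries are delayed, which is the behaviour the business mandate allows.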

The CTO has mandated migration to AWS EC2 with the following non-negotiable constraints:

  1. Pay-as-you-go pricing with no long-term commitments
  2. Maintain high availability equivalent to current on-premises setup
  3. Preserve SLA compliance for scheduled workloads
  4. Minimize total cost of ownership

Key Requirements

Design an EC2 deployment architecture that balances workload criticality, availability requirements, and cost optimization using appropriate instance purchasing models and multi-AZ distribution.

The Options

  • A) Deploy 12 instances across two Availability Zones; each AZ runs 2 On-Demand instances with Capacity Reservations + 4 Spot instances.
  • B) Deploy 12 instances across three Availability Zones; one AZ runs 4 On-Demand instances with Capacity Reservations, remaining instances are Spot.
  • C) Deploy 12 instances across three Availability Zones; each AZ runs 2 On-Demand instances with Savings Plans + 2 Spot instances.
  • D) Deploy 12 instances across three Availability Zones; each AZ runs 3 On-Demand instances with Capacity Reservations + 1 Spot instance.
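
To compare the options at a glance, it can help to tabulate each fleet mix. The sketch below only transcribes the counts from the list above; the dictionary layout and the `on_demand_share` helper are illustrative assumptions.

```python
# Fleet composition per option, transcribed from the options list.
OPTIONS = {
    "A": {"azs": 2, "on_demand": 4, "spot": 8},
    "B": {"azs": 3, "on_demand": 4, "spot": 8},
    "C": {"azs": 3, "on_demand": 6, "spot": 6},
    "D": {"azs": 3, "on_demand": 9, "spot": 3},
}

def on_demand_share(option):
    """Fraction of the fleet that is On-Demand."""
    total = option["on_demand"] + option["spot"]
    return option["on_demand"] / total

for name, opt in OPTIONS.items():
    print(f"{name}: {opt['azs']} AZs, {opt['on_demand']} OD + {opt['spot']} Spot "
          f"({on_demand_share(opt):.0%} On-Demand)")
```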

Correct Answer

Option A.


The Architect’s Analysis
#

Correct Answer

Option A — Deploy across two Availability Zones with 2 On-Demand + 4 Spot instances per AZ.

Step-by-Step Winning Logic

This solution demonstrates professional-grade capacity planning by aligning purchasing models with workload characteristics:

1. Workload-to-Purchasing Model Mapping

  • 4 On-Demand instances with Capacity Reservations (33% of fleet) provide guaranteed capacity for the 65% workload share through resource overcommitment — acceptable because:
    • Scheduled jobs have predictable execution windows
    • Not all 12 instances run at 100% simultaneously
    • SLA jobs get priority scheduling on guaranteed capacity
  • 8 Spot instances (67% of fleet) handle the 35% best-effort workload at 70-90% cost savings
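
The overcommitment mentioned above can be quantified with simple arithmetic. This is a rough sketch that treats the 65% share as instance-equivalents of the 12-server fleet, which is an assumption; the scenario does not state actual utilisation.

```python
FLEET_SIZE = 12          # servers in the scenario
SCHEDULED_SHARE = 0.65   # scheduled jobs' share of total compute
GUARANTEED = 4           # On-Demand instances with Capacity Reservations in Option A

# Instance-equivalents the scheduled workload would need if the fleet ran flat out.
scheduled_demand = FLEET_SIZE * SCHEDULED_SHARE
overcommit_ratio = scheduled_demand / GUARANTEED

print(f"Scheduled demand: {scheduled_demand:.1f} instance-equivalents")
print(f"Overcommit ratio on guaranteed capacity: {overcommit_ratio:.2f}x")
```

A ratio close to 2x means the scheduled jobs fit on reserved capacity alone only while fleet utilisation stays near half of peak; beyond that they spill onto Spot.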

2. Multi-AZ Strategy Aligned with HA Requirements

  • Two AZs provide sufficient fault tolerance for the stated “high availability” requirement without over-engineering
  • Each AZ maintains identical capacity mix (2 OD + 4 Spot), enabling active-active workload distribution
  • AZ failure scenario: Remaining AZ’s 2 On-Demand instances can absorb critical SLA workload while Spot handles overflow
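
The AZ failure scenario can be checked by counting what survives when one zone disappears. A minimal sketch; `surviving_capacity` is a hypothetical helper, and the even Spot split for Option B is an assumption since the option text does not specify it.

```python
def surviving_capacity(od_per_az, spot_per_az, failed_az):
    """Return (on_demand, spot) counts left after one AZ fails.

    od_per_az and spot_per_az hold one entry per AZ. The Spot figure is an
    upper bound, because surviving Spot instances can still be reclaimed.
    """
    od = sum(n for i, n in enumerate(od_per_az) if i != failed_az)
    spot = sum(n for i, n in enumerate(spot_per_az) if i != failed_az)
    return od, spot

# Option A: two AZs, each with 2 On-Demand + 4 Spot.
print(surviving_capacity([2, 2], [4, 4], failed_az=0))        # (2, 4)
# Option B: all 4 On-Demand in the first AZ, Spot assumed split over the others.
print(surviving_capacity([4, 0, 0], [0, 4, 4], failed_az=0))  # (0, 8)
```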

3. FinOps Compliance with “No Long-Term Commitment”

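  • Capacity Reservations are pay-as-you-go: billed at the On-Demand rate whether or not instances are running in them, but with no 1- or 3-year term, and they can be cancelled at any time
  • Avoids Savings Plans (Option C), which require a 1- or 3-year spend commitment despite their flexibility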
  • Spot instances maintain cost discipline on non-critical workload
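
As an illustration of the no-term nature of Capacity Reservations, the sketch below builds request parameters for the EC2 `CreateCapacityReservation` API. The `capacity_reservation_request` helper, the AZ names, and the Linux platform are assumptions for illustration; the instance type is taken from the cost assumptions later in this article.

```python
def capacity_reservation_request(az, count, instance_type="m5.2xlarge"):
    """Build keyword arguments for EC2 CreateCapacityReservation."""
    return {
        "InstanceType": instance_type,
        "InstancePlatform": "Linux/UNIX",   # assumed platform
        "AvailabilityZone": az,
        "InstanceCount": count,
        "EndDateType": "unlimited",         # no fixed term; cancel when no longer needed
        "InstanceMatchCriteria": "open",    # matching On-Demand launches use it automatically
    }

# Two reserved On-Demand instances in each of two AZs (AZ names are assumed).
requests = [capacity_reservation_request(az, 2) for az in ("us-east-1a", "us-east-1b")]

# With boto3 installed and credentials configured, each dict could be sent as:
#   boto3.client("ec2").create_capacity_reservation(**request)
print(sum(r["InstanceCount"] for r in requests))  # 4
```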

The Traps (Distractor Analysis)

Why not Option B?
#

  • Single point of concentration risk: Placing all 4 On-Demand instances in one AZ creates an availability bottleneck
  • Imbalanced failure resilience: Loss of the primary AZ immediately violates SLA (only Spot remains in other AZs)
  • Over-reliance on Spot for critical workload: 8 Spot instances across 2 AZs means SLA jobs compete with best-effort tasks during Spot reclamation events

Why not Option C?
#

  • Violates “no long-term commitment” constraint: Savings Plans require a 1- or 3-year commitment
  • No capacity guarantee: Savings Plans are a billing discount and do not reserve capacity, so the 6 On-Demand instances (50% of fleet) are not guaranteed to launch during a capacity shortage
  • Three AZs unnecessary: Adds cross-AZ data transfer cost and operational complexity without a proportional availability gain for this workload

Why not Option D?

  • Over-provisioning of guaranteed capacity: 9 On-Demand instances (75% of fleet) for a 65% SLA workload means paying for reserved capacity that sits idle
  • Insufficient Spot allocation: Only 3 Spot instances (25% of fleet) for 35% best-effort workload forces costly On-Demand usage for user queries
  • Poor cost optimization: Fails to maximize Spot savings opportunity
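
The distractor analysis mixes hard constraint violations with cost arguments. The sketch below separates the two by checking only the hard constraints; the rule set and the `hard_constraint_violations` helper are an illustrative reading of this article's reasoning, not an official scoring method.

```python
# On-Demand placement per AZ and how that On-Demand capacity is purchased.
OPTIONS = {
    "A": {"od_per_az": [2, 2], "od_model": "capacity_reservation"},
    "B": {"od_per_az": [4, 0, 0], "od_model": "capacity_reservation"},
    "C": {"od_per_az": [2, 2, 2], "od_model": "savings_plan"},
    "D": {"od_per_az": [3, 3, 3], "od_model": "capacity_reservation"},
}

def hard_constraint_violations(option):
    """List the non-negotiable constraints an option breaks."""
    issues = []
    if option["od_model"] == "savings_plan":
        issues.append("requires a 1- or 3-year commitment")
    if sum(1 for n in option["od_per_az"] if n > 0) < 2:
        issues.append("all guaranteed capacity sits in one AZ")
    return issues

for name, opt in OPTIONS.items():
    print(name, hard_constraint_violations(opt) or "no hard violations")
```

Options A and D both pass these checks, so the case against D in this analysis rests on cost rather than on a broken constraint.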

The Architect Blueprint

graph TB
    subgraph "Region: us-east-1"
        subgraph "AZ-1a"
            OD1a[On-Demand + CapRes<br/>Instance 1]
            OD2a[On-Demand + CapRes<br/>Instance 2]
            SP1a[Spot Instance 3]
            SP2a[Spot Instance 4]
            SP3a[Spot Instance 5]
            SP4a[Spot Instance 6]
        end
        subgraph "AZ-1b"
            OD1b[On-Demand + CapRes<br/>Instance 7]
            OD2b[On-Demand + CapRes<br/>Instance 8]
            SP1b[Spot Instance 9]
            SP2b[Spot Instance 10]
            SP3b[Spot Instance 11]
            SP4b[Spot Instance 12]
        end
    end
    Scheduler[Job Scheduler] -->|SLA Jobs<br/>65% workload| OD1a
    Scheduler -->|SLA Jobs| OD2a
    Scheduler -->|SLA Jobs| OD1b
    Scheduler -->|SLA Jobs| OD2b
    Users[Business Analysts] -->|Ad-Hoc Queries<br/>35% workload| SP1a
    Users --> SP2a
    Users --> SP1b
    Users --> SP2b
    Scheduler -.->|Overflow during peak| SP3a
    Scheduler -.->|Overflow during peak| SP4a
    style OD1a fill:#ff9900,stroke:#232f3e,stroke-width:3px,color:#fff
    style OD2a fill:#ff9900,stroke:#232f3e,stroke-width:3px,color:#fff
    style OD1b fill:#ff9900,stroke:#232f3e,stroke-width:3px,color:#fff
    style OD2b fill:#ff9900,stroke:#232f3e,stroke-width:3px,color:#fff
    style SP1a fill:#69ae34,stroke:#232f3e,stroke-width:2px
    style SP2a fill:#69ae34,stroke:#232f3e,stroke-width:2px
    style SP1b fill:#69ae34,stroke:#232f3e,stroke-width:2px
    style SP2b fill:#69ae34,stroke:#232f3e,stroke-width:2px

Diagram Note: SLA-critical scheduled jobs route primarily to Capacity-Reserved On-Demand instances (orange) with Spot overflow capacity, while best-effort user queries leverage cost-optimized Spot instances (green) across two Availability Zones.
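
The routing in the diagram can be expressed as a small placement rule. This is a sketch under the assumption that the scheduler can see free slots per capacity pool; the `route` function and the pool names are hypothetical.

```python
def route(job_kind, free_on_demand, free_spot):
    """Pick a capacity pool for a job, mirroring the arrows in the diagram."""
    if job_kind == "scheduled":
        if free_on_demand > 0:
            return "on_demand"
        if free_spot > 0:
            return "spot"    # the dotted "overflow during peak" path
        return "queue"
    # Ad-hoc queries never consume reserved On-Demand capacity.
    return "spot" if free_spot > 0 else "queue"

print(route("scheduled", free_on_demand=1, free_spot=3))  # on_demand
print(route("scheduled", free_on_demand=0, free_spot=3))  # spot
print(route("ad_hoc", free_on_demand=2, free_spot=0))     # queue
```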

The Decision Matrix

| Option | Est. Complexity | Est. Monthly Cost | Pros | Cons |
|---|---|---|---|---|
| A (Correct) | Medium | ~$2,193 (4 OD m5.2xlarge CapRes @ $0.384/hr = $1,121; 8 Spot @ $0.115/hr = $672; data transfer ~$400) | ✅ Perfect workload-to-cost alignment; ✅ SLA protection via guaranteed capacity; ✅ ~70% Spot cost savings on 67% of fleet; ✅ Balanced multi-AZ resilience | ⚠️ Requires intelligent job scheduler; ⚠️ Spot interruption handling needed |
| B | Low | ~$2,143 (4 OD CapRes = $1,121; 8 Spot = $672; 3 AZ transfer = $350) | ✅ Lowest operational complexity; ✅ Three-AZ distribution | ❌ Single AZ concentration risk; ❌ SLA violation on primary AZ failure; ❌ Imbalanced capacity distribution |
| C | High | ~$2,167 (6 OD with 1yr SP @ $0.277/hr = $1,213; 6 Spot = $504; 3 AZ transfer = $450) | ✅ Three-AZ distribution; ✅ Lower OD hourly rate with SP | ❌ Violates “no commitment” requirement; ❌ Insufficient guaranteed capacity (50%); ❌ No meaningful saving over Option A despite SP discount; ❌ 3-AZ overhead unnecessary |
| D | Medium | ~$3,225 (9 OD CapRes = $2,523; 3 Spot = $252; 3 AZ transfer = $450) | ✅ Maximum SLA safety margin; ✅ Three-AZ resilience | ❌ 47% cost premium over Option A; ❌ Over-provisioned guaranteed capacity; ❌ Underutilized Spot savings opportunity; ❌ Poor FinOps efficiency |

Cost Assumptions: m5.2xlarge instances (8 vCPU, 32 GB RAM) in us-east-1; On-Demand $0.384/hr, Spot average $0.115/hr (~70% savings); 730 hours/month; data transfer estimated at 5TB/month cross-AZ.
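
The compute portion of each estimate follows directly from these assumptions. A minimal sketch using the stated rates and hours; the `monthly_compute_cost` helper is hypothetical and leaves out data transfer, which the article only estimates.

```python
HOURS_PER_MONTH = 730
ON_DEMAND_RATE = 0.384   # USD/hr, m5.2xlarge On-Demand (from the assumptions above)
SPOT_RATE = 0.115        # USD/hr, assumed Spot average (from the assumptions above)

def monthly_compute_cost(on_demand, spot, od_rate=ON_DEMAND_RATE):
    """Monthly compute cost in USD for a fleet mix, excluding data transfer."""
    return HOURS_PER_MONTH * (on_demand * od_rate + spot * SPOT_RATE)

spot_discount = 1 - SPOT_RATE / ON_DEMAND_RATE
print(f"Spot discount vs On-Demand: {spot_discount:.0%}")
print(f"Option A compute: ${monthly_compute_cost(4, 8):,.2f}")
print(f"Option D compute: ${monthly_compute_cost(9, 3):,.2f}")
```

Passing a different `od_rate` models a discounted On-Demand rate, for example the $0.277/hr Savings Plan figure used for Option C.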

Real-World Practitioner Insight

Exam Rule

For SAP-C02, when you see “mixed workload criticality + no long-term commitment + cost optimization”, the answer pattern is:

  • Match purchasing model to workload SLA (On-Demand with Capacity Reservations for critical, Spot for best-effort; Reserved Instances only when commitments are allowed)
  • Capacity Reservations ≠ long-term commitment (they’re pay-as-you-go)
  • Avoid over-engineering multi-AZ (2 AZs often sufficient unless explicitly stated otherwise)
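
The exam rule reduces to a two-question decision. The function below is one way to write it down; `purchasing_model` and its return labels are illustrative assumptions, not AWS terminology for an API.

```python
def purchasing_model(has_sla, long_term_commitment_allowed):
    """Map a workload to a purchasing model following the exam rule above."""
    if not has_sla:
        return "Spot"
    if long_term_commitment_allowed:
        return "On-Demand + Savings Plan or Reserved Instances"
    return "On-Demand + Capacity Reservation"

# The two workloads in this scenario, under the "no long-term commitment" mandate.
print(purchasing_model(has_sla=True, long_term_commitment_allowed=False))
print(purchasing_model(has_sla=False, long_term_commitment_allowed=False))
```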

Real World

In production, I would enhance Option A with:

  1. Spot Fleet diversification: Use multiple instance types (m5, m5a, m5n) across diversified Spot pools to reduce interruption probability from 5-10% to <2%
  2. Auto Scaling + Predictive Scaling: Adjust On-Demand baseline during known peak windows (month-end reporting) using CloudWatch metrics
  3. Savings Plans consideration after 3 months: Once workload patterns stabilize, evaluate Compute Savings Plans (no instance family lock-in) to reduce the On-Demand portion by additional 15-20% while maintaining flexibility
  4. Hybrid Reserved Instances: For the absolute baseline (e.g., 2 instances always running), consider Convertible RIs after 6 months of stable operations
  5. Third AZ for true mission-critical: If SLA penalties exceed $10K/hour, the additional ~$400/month for three-AZ deployment becomes insurance, not overhead
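
One way to approach item 1 is an EC2 Auto Scaling group with a mixed instances policy rather than a Spot Fleet; that swap is a choice made for this sketch. It builds the `MixedInstancesPolicy` structure accepted by the Auto Scaling `CreateAutoScalingGroup` API. The launch template name is hypothetical, and tying the On-Demand base to Capacity Reservations would need extra launch template settings not shown here.

```python
def mixed_instances_policy(launch_template_name, on_demand_base, instance_types):
    """Build a MixedInstancesPolicy dict for an EC2 Auto Scaling group."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # One override per instance type widens the set of Spot pools.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": 0,  # everything above the base is Spot
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }

policy = mixed_instances_policy(
    "bi-platform-lt",  # hypothetical launch template name
    on_demand_base=4,
    instance_types=["m5.2xlarge", "m5a.2xlarge", "m5n.2xlarge"],
)
print([o["InstanceType"] for o in policy["LaunchTemplate"]["Overrides"]])
```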

The exam tests purchasing model understanding; reality requires continuous FinOps optimization based on observed workload telemetry.
