While preparing for the AWS SAA-C03, many candidates get confused by AWS Systems Manager’s patch management suite. In the real world, this is fundamentally a decision about urgency vs. automation governance. Let’s drill into a simulated scenario.
The Scenario #
GlobalFinTech Corp operates a payment processing platform running on 1,000 Amazon EC2 Linux instances distributed across multiple Availability Zones. The platform uses third-party middleware software licensed from a security vendor.
On Monday morning, the vendor published a critical security advisory (CVSSv3 score: 9.8) affecting all versions of their software, with active exploits detected in the wild. The security team has classified this as a P0 incident requiring patching within 4 hours to meet regulatory compliance (PCI-DSS).
The patches are available via standard package managers (yum/apt), and all instances have the AWS Systems Manager Agent (SSM Agent) already installed and functioning.
Key Requirements #
Deploy the critical security patch to all 1,000 EC2 instances as quickly as possible while maintaining audit trails for compliance reporting.
The Options #
- A) Create an AWS Lambda function that uses the EC2 API to SSH into each instance sequentially and execute the patch command.
- B) Configure AWS Systems Manager Patch Manager with a new patch baseline and wait for the next automated scan cycle to apply patches.
- C) Schedule an AWS Systems Manager Maintenance Window to execute during the next planned maintenance period (48 hours away).
- D) Use AWS Systems Manager Run Command to execute a custom patching script across all 1,000 instances immediately.
Correct Answer #
Option D.
The Architect’s Analysis #
Step-by-Step Winning Logic #
This scenario contains a critical time constraint: “as quickly as possible” to address a P0 security incident. Let’s break down why Run Command wins:
1. Immediate Execution (Speed) #
- Run Command executes on demand against instances selected by tag, resource group, or instance ID
- No waiting for scheduled windows or automated cycles
- Parallel execution across 1,000 instances (default concurrency: 50 targets at a time, configurable as a count or a percentage up to 100%)
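To make the effect of the concurrency setting concrete, here is a rough sketch of the arithmetic. It assumes the 2-minute per-instance patch time used later in this article and treats the rollout as fixed-size batches, which is a simplification of how Run Command actually schedules targets; the function name is hypothetical.

```python
import math


def estimate_rollout_minutes(fleet_size: int, max_concurrency: int,
                             minutes_per_instance: float) -> float:
    """Rough estimate: the fleet is patched in batches of max_concurrency."""
    if max_concurrency <= 0:
        raise ValueError("max_concurrency must be positive")
    if fleet_size <= 0:
        return 0.0
    batches = math.ceil(fleet_size / max_concurrency)
    return batches * minutes_per_instance


FLEET = 1000
PATCH_MINUTES = 2  # assumed per-instance patch time

for label, concurrency in [("sequential", 1), ("default (50)", 50), ("100%", FLEET)]:
    minutes = estimate_rollout_minutes(FLEET, concurrency, PATCH_MINUTES)
    print(f"{label:>13}: {minutes:6.0f} min ({minutes / 60:.1f} h)")
```

Under these assumptions the default setting already fits comfortably inside a 4-hour window, and raising concurrency shortens the rollout further at the cost of touching more of the fleet at once.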
2. Native SSM Agent Integration #
- The scenario explicitly states SSM Agent is already installed
- No additional infrastructure setup required (unlike Lambda-based solutions)
- Built-in rate controls prevent overwhelming the fleet
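As an illustration of how the rate controls are expressed, the sketch below builds the keyword arguments for the SSM `send_command` API. The package name `vendor-middleware`, the error threshold, the timeouts, and the helper name are assumptions made for this example, not values from the scenario; the `PatchGroup:Critical` tag is the one suggested later in this article.

```python
def build_patch_request(package: str, tag_key: str, tag_value: str,
                        max_concurrency: str = "100%",
                        max_errors: str = "5%") -> dict:
    """Build keyword arguments for ssm.send_command (nothing is sent here)."""
    # Works on both yum-based and apt-based distributions.
    script = (
        "if command -v yum >/dev/null 2>&1; then "
        f"yum update -y {package}; "
        "else "
        f"apt-get update && apt-get install --only-upgrade -y {package}; "
        "fi"
    )
    return {
        "DocumentName": "AWS-RunShellScript",
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {
            "commands": [script],
            "executionTimeout": ["900"],  # assumed: seconds allowed per instance
        },
        "MaxConcurrency": max_concurrency,  # count or percentage of targets
        "MaxErrors": max_errors,            # stop dispatching past this many failures
        "TimeoutSeconds": 600,              # assumed: delivery timeout
        "Comment": f"P0 hotfix: {package}",
    }


request = build_patch_request("vendor-middleware", "PatchGroup", "Critical")
print(request["Targets"], request["MaxConcurrency"], request["MaxErrors"])

# With boto3 installed and AWS credentials configured, it could be submitted as:
#   import boto3
#   response = boto3.client("ssm").send_command(**request)
```

Keeping the request as a plain dictionary makes it easy to review or attach to a change ticket before anything is dispatched.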
3. Compliance & Auditability #
- Automatic logging to CloudWatch Logs and S3 (if configured)
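- Command history available in Systems Manager for up to 30 days; for longer retention, send output to S3 or CloudWatch Logs and rely on CloudTrail for the API audit trail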
- Output from every instance captured for validation
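A compliance report usually needs two things: output stored somewhere durable, and a per-instance status summary. The sketch below shows one possible shape for both. The bucket, prefix, and log group names are placeholders, the helper names are hypothetical, and the sample records only mimic the `InstanceId` and `Status` fields returned by `list_command_invocations`.

```python
from collections import Counter


def audit_output_options(bucket: str, prefix: str, log_group: str) -> dict:
    """Extra send_command arguments that route command output to S3 and CloudWatch Logs."""
    return {
        "OutputS3BucketName": bucket,
        "OutputS3KeyPrefix": prefix,
        "CloudWatchOutputConfig": {
            "CloudWatchLogGroupName": log_group,
            "CloudWatchOutputEnabled": True,
        },
    }


def summarize_invocations(invocations: list[dict]) -> dict:
    """Count statuses and list every instance that has not reported Success."""
    counts = Counter(inv["Status"] for inv in invocations)
    not_succeeded = sorted(
        inv["InstanceId"] for inv in invocations if inv["Status"] != "Success"
    )
    return {"counts": dict(counts), "not_succeeded": not_succeeded}


# Sample data standing in for the CommandInvocations list of an API response.
sample = [
    {"InstanceId": "i-0aaa", "Status": "Success"},
    {"InstanceId": "i-0bbb", "Status": "Failed"},
    {"InstanceId": "i-0ccc", "Status": "TimedOut"},
    {"InstanceId": "i-0ddd", "Status": "Success"},
]
report = summarize_invocations(sample)
print(report)
```

The `not_succeeded` list is the part auditors tend to care about: it names the instances that still need follow-up before the incident can be closed.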
4. Cost Efficiency #
- No charge for Run Command service usage
- Only pay for underlying EC2 compute time (already running)
- No Lambda invocation costs or custom orchestration overhead
The Traps (Distractor Analysis) #
Why not Option A (Lambda + SSH)? #
Technical Deficiencies:
- The option specifies sequential execution: assuming about 2 minutes per instance, 1,000 instances × 2 minutes ≈ 33 hours (violates the 4-hour SLA)
- Lambda has a 15-minute timeout limit, so a single invocation cannot run a long sequential loop
- Requires managing SSH keys securely (IAM roles, Secrets Manager overhead)
- No native inventory or compliance reporting
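The timeout problem can be quantified with a quick back-of-the-envelope calculation. It reuses the assumed 2 minutes per instance from the list above and ignores connection setup and retries, so real numbers would be worse.

```python
LAMBDA_TIMEOUT_MINUTES = 15  # Lambda's maximum execution time
PATCH_MINUTES = 2            # assumed per-instance patch time
FLEET = 1000

# Instances one invocation can patch sequentially before it times out.
per_invocation = LAMBDA_TIMEOUT_MINUTES // PATCH_MINUTES

# Invocations needed to cover the fleet (ceiling division).
invocations_needed = -(-FLEET // per_invocation)

print(f"{per_invocation} instances per invocation, "
      f"{invocations_needed} invocations to cover {FLEET} instances")
```

Covering the fleet would therefore require chaining or fanning out many invocations and tracking which instances each one handled, which is exactly the orchestration that Run Command already provides.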
Cost Impact:
- Lambda invocations: 1,000 × $0.0000002 ≈ $0.0002 (negligible)
- But requires custom error handling, retry logic, and state management (development cost: $5,000–$15,000)
Exam Trap: AWS deliberately includes “Lambda” to test if you recognize when serverless isn’t appropriate for fleet management.
Why not Option B (Patch Manager)? #
Timing Mismatch:
- Patch Manager is designed for scheduled, policy-driven patching
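- As written, the option requires creating a patch baseline, associating it with instances, and waiting for the next scan-and-patch cycle (typically 30-minute intervals at best)
- Total time to first patch application: 45–90 minutes minimum, which fits inside 4 hours on paper but is far slower than Run Command and leaves little margin for retries during a P0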
When to Use Instead:
- Ongoing compliance (e.g., monthly Windows Update cycles)
- Automated patching during approved maintenance windows
- Organizations requiring Change Advisory Board (CAB) approvals
FinOps Note: Patch Manager adds no service charge, but neither does Run Command, so waiting for the scan cycle saves nothing and only adds risk.
Why not Option C (Maintenance Window)? #
Fatal Flaw:
- The scenario states the window is 48 hours away
- This violates the 4-hour P0 requirement explicitly
- Regulatory frameworks (PCI-DSS, HIPAA) impose strict remediation timelines
When to Use Instead:
- Non-critical patches (e.g., feature updates)
- Planned downtime for stateful applications
- Coordinating patching with database maintenance or blue/green deployments
Exam Hint: Any time you see “immediately,” “urgent,” or “critical security vulnerability,” eliminate options with scheduling delays.
The Architect Blueprint #
Diagram Note: Run Command uses SSM Agent on each instance to execute the patch script in parallel, with all outputs centralized in CloudWatch Logs and optionally archived to S3 for long-term compliance retention.
The Decision Matrix #
| Option | Est. Time to Patch | Est. Cost | Pros | Cons |
|---|---|---|---|---|
| A) Lambda + SSH | 33+ hours (sequential) | $5,000–$15,000 (dev cost) | Familiar to DevOps teams | Too slow, timeout risks, key management |
| B) Patch Manager | 45–90 minutes | $0 (service) | Automated compliance tracking | Requires baseline setup, scan delay |
| C) Maintenance Window | 48 hours | $0 (service) | Change-controlled, auditable | Misses P0 deadline, regulatory risk |
| D) Run Command ✅ | 5–10 minutes | $0.67 (compute only) | Immediate, parallel, auditable, free | Requires SSM Agent pre-installed |
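FinOps Insight: Weighing the estimated $0.67 execution cost against a potential $50K+ regulatory fine gives a ratio of roughly 75,000:1 (about 7,500,000%), so the cost of acting immediately is negligible next to the cost of delay.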
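The selection logic behind the matrix can be sketched in a few lines: first discard anything that misses the 4-hour deadline, then take the fastest of what remains, since the requirement is "as quickly as possible". The minute values are rough readings of the estimates in this article (upper bounds where a range is given), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    label: str
    name: str
    est_minutes: int  # estimated time to patch the whole fleet


OPTIONS = [
    Option("A", "Lambda + SSH", 33 * 60),
    Option("B", "Patch Manager", 90),
    Option("C", "Maintenance Window", 48 * 60),
    Option("D", "Run Command", 10),
]
DEADLINE_MINUTES = 4 * 60


def meets_deadline(options, deadline_minutes):
    """Options whose estimated time fits inside the deadline."""
    return [o for o in options if o.est_minutes <= deadline_minutes]


def pick_fastest(options, deadline_minutes):
    """Fastest option among those that meet the deadline."""
    eligible = meets_deadline(options, deadline_minutes)
    if not eligible:
        raise ValueError("no option meets the deadline")
    return min(eligible, key=lambda o: o.est_minutes)


eligible = meets_deadline(OPTIONS, DEADLINE_MINUTES)
winner = pick_fastest(OPTIONS, DEADLINE_MINUTES)
print([o.label for o in eligible], "->", winner.label)
```

Note that the deadline alone only eliminates A and C; it is the "as quickly as possible" requirement that separates Run Command from Patch Manager.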
Real-World Practitioner Insight #
Exam Rule #
“For AWS SAA-C03, when you see ‘immediately,’ ‘urgent,’ or ‘critical security’ combined with ‘EC2 fleet management’, always select Run Command over scheduled solutions (Patch Manager/Maintenance Windows).”
Real World #
In production environments, we’d typically use a hybrid approach:
- Immediate Response (This Scenario):
  - Use Run Command for P0 hotfixes
  - Target instances using dynamic tags (e.g., `PatchGroup:Critical`)
- Long-Term Governance:
  - Implement Patch Manager with baseline policies
  - Configure Maintenance Windows for routine updates
  - Use AWS Config Rules to verify compliance (e.g., `ec2-managedinstance-patch-compliance-status-check`)
- Enhancements Not in the Exam:
  - Pre-patching snapshots using AWS Backup or EBS snapshots
  - Canary deployments: Patch 10% of fleet first, monitor for 15 minutes, then proceed
  - Integration with ServiceNow for change management tickets
  - Automated rollback if health checks fail (using CloudWatch Alarms + Lambda)
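The canary step can be sketched as a simple, deterministic split of the fleet. This is only one way to do it: the instance IDs below are fabricated for the example, the function name is hypothetical, and in practice the two groups would each be targeted by a separate Run Command invocation with a monitoring pause in between.

```python
def split_canary(instance_ids: list[str], canary_percent: int = 10):
    """Split a fleet into a canary group and the remainder.

    Sorting first keeps the split deterministic across runs.
    """
    if not 0 < canary_percent <= 100:
        raise ValueError("canary_percent must be between 1 and 100")
    ordered = sorted(instance_ids)
    # Ceiling division in integer arithmetic, so a small fleet still gets a canary.
    canary_size = -(-len(ordered) * canary_percent // 100)
    return ordered[:canary_size], ordered[canary_size:]


fleet = [f"i-{n:04d}" for n in range(1000)]  # placeholder instance IDs
canary, remainder = split_canary(fleet)
print(len(canary), "canary instances,", len(remainder), "remaining")
```

A deterministic split makes the rollout reproducible, which helps when the audit trail has to show which instances were patched in which wave.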
Real-World Cost Consideration: If this were Windows Server with required reboots, we’d factor in:
- Application downtime during restarts
- Auto Scaling to provision replacement capacity temporarily
- Blue/green deployment for zero-downtime patching (adds ~$200/hour for duplicate fleet during transition)