AWS SAA-C03 Drill: Scalable File Storage for Variable Workloads - The Shared Storage Trade-off

Jeff Taakey
21+ Year Enterprise Architect | Multi-Cloud Architect & Strategist.
Jeff's Architecture Insights
Go beyond static exam dumps. Jeff’s Insights is engineered to cultivate the mindset of a Production-Ready Architect. We move past ‘correct answers’ to dissect the strategic trade-offs and multi-cloud patterns required to balance reliability, security, and TCO in mission-critical environments.

While preparing for the AWS SAA-C03, many candidates get storage service selection wrong because they focus only on capacity. In the real world, this is fundamentally a decision about shared access patterns vs. operational complexity. Let’s drill into a simulated scenario.

The Scenario
#

TechMedia Solutions, a digital content processing company, is migrating its on-premises video rendering application to AWS. The application generates output ranging from 50 GB (compressed video clips) to 300 TB (feature-length film projects). The resulting files must be accessible through standard POSIX file system operations (read/write/append), because the rendering engine expects traditional file paths like /mnt/renders/project-X/scene-042.mp4.

The business expects traffic to spike unpredictably—during film festival season, render jobs increase 10x within hours. The infrastructure team has only two engineers and cannot afford complex storage management overhead.

Key Requirements
#

Design a storage solution that:

  1. Supports standard file system structure (not object key-value)
  2. Scales automatically from gigabytes to hundreds of terabytes
  3. Provides high availability across multiple data centers
  4. Minimizes operational overhead (no manual volume resizing or replication management)

The Options
#

  • A) Containerize the application on Amazon ECS and use Amazon S3 for storage
  • B) Containerize the application on Amazon EKS and use Amazon EBS for storage
  • C) Deploy the application on EC2 instances in a Multi-AZ Auto Scaling Group and use Amazon EFS for storage
  • D) Deploy the application on EC2 instances in a Multi-AZ Auto Scaling Group and use Amazon EBS for storage

Correct Answer
#

Option C - EC2 Auto Scaling Group with Amazon EFS.

The Architect’s Analysis
#

Step-by-Step Winning Logic
#

This solution achieves all four requirements through strategic service pairing:

  1. File System Requirement → EFS: Amazon EFS provides NFSv4 protocol support, allowing applications to use standard POSIX file operations (open(), write(), mkdir()) without code modification. The application sees /mnt/efs/renders/ exactly like local storage (see the sketch after this list).

  2. Automatic Scaling → EFS Elastic Storage: Unlike EBS (which requires pre-provisioning and manual resizing), EFS automatically grows and shrinks based on actual file storage. A 50 GB project pays for 50 GB; a 300 TB project scales transparently without intervention.

  3. High Availability → EFS Multi-AZ: EFS automatically replicates data across multiple Availability Zones within a region. Combined with EC2 Auto Scaling across AZs, this creates infrastructure-level fault tolerance.

  4. Minimal Operations → Managed Service: No volume snapshots to schedule, no replication scripts to maintain, no capacity alerts to monitor—EFS handles durability and scaling as a fully managed service.
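
To make step 1 concrete, here is a minimal sketch of what the rendering code's storage layer can look like against EFS. It assumes the file system is already NFS-mounted at the illustrative path /mnt/efs/renders; the function name and path are mine, not part of the scenario:

```python
import os

# Illustrative mount point; assumes an EFS file system is already
# NFS-mounted here (e.g. via the amazon-efs-utils mount helper).
RENDER_ROOT = "/mnt/efs/renders"

def write_render_output(project: str, scene: str, data: bytes) -> str:
    """Persist a rendered scene using ordinary POSIX file operations."""
    project_dir = os.path.join(RENDER_ROOT, project)
    os.makedirs(project_dir, exist_ok=True)   # plain mkdir() semantics

    path = os.path.join(project_dir, scene)
    with open(path, "wb") as f:               # open()/write(): no SDK, no HTTP API
        f.write(data)

    with open(path, "ab") as f:               # true append, unlike object-store emulations
        f.write(b"\x00" * 16)                 # e.g. trailing padding

    return path

if __name__ == "__main__":
    print(write_render_output("project-X", "scene-042.mp4", b"fake video bytes"))
```

Nothing in this code knows it is talking to EFS, which is exactly what "without code modification" means here.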

The Traps (Distractor Analysis)
#

Why not Option A (ECS + S3)?

  • Storage Incompatibility: S3 is an object store accessed over HTTP APIs, not a file system. While tools like s3fs-fuse can mount S3 as a file system, they introduce significant latency (every file operation becomes one or more API calls) and break POSIX semantics (no true append operations, no reliable file locking, non-atomic renames).
  • Operational Overhead: Requires application refactoring or maintaining FUSE mount drivers—contradicting the “minimal overhead” requirement.

Why not Option B (EKS + EBS)?

  • Shared Access Failure: EBS volumes attach to only one EC2 instance at a time (Multi-Attach exists for io1/io2 only, with significant limitations). In an auto-scaling environment with multiple render nodes, you cannot share the same EBS volume across instances (see the sketch after this analysis).
  • Kubernetes Complexity: EKS adds orchestration overhead (cluster management, pod scheduling, persistent volume claims) without solving the core storage sharing problem.

Why not Option D (EC2 + EBS)?

  • Manual Scaling Required: EBS volumes must be pre-sized and manually grown as data accumulates. A single gp3 volume tops out at 16 TiB, so reaching 50 TB (let alone 300 TB) means striping multiple volumes together and managing that layout yourself.
  • No Cross-Instance Sharing: Same issue as Option B—each EC2 instance would need its own EBS volume, requiring complex file synchronization mechanisms between nodes.
  • Availability Risk: EBS volumes exist in a single AZ. Achieving Multi-AZ requires custom replication (e.g., DRBD, application-level sync), drastically increasing operational complexity.
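
The shared-access limitation called out for Options B and D is enforced by the EBS API itself: a gp3 volume lives in exactly one AZ and attaches to one instance at a time, and Multi-Attach has to be requested at creation time on io1/io2 volumes. A minimal sketch with placeholder instance IDs:

```python
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2")

AZ = "us-east-1a"                       # a gp3 volume exists in exactly one AZ
RENDER_NODE_1 = "i-0aaaaaaaaaaaaaaaa"   # placeholder instance IDs
RENDER_NODE_2 = "i-0bbbbbbbbbbbbbbbb"

vol_id = ec2.create_volume(AvailabilityZone=AZ, Size=1000, VolumeType="gp3")["VolumeId"]
ec2.get_waiter("volume_available").wait(VolumeIds=[vol_id])

ec2.attach_volume(VolumeId=vol_id, InstanceId=RENDER_NODE_1, Device="/dev/sdf")

try:
    # A second render node cannot attach the same gp3 volume.
    ec2.attach_volume(VolumeId=vol_id, InstanceId=RENDER_NODE_2, Device="/dev/sdf")
except ClientError as err:
    print("Second attach rejected:", err.response["Error"]["Code"])

# Multi-Attach is opt-in and io1/io2 only, e.g.:
#   ec2.create_volume(..., VolumeType="io2", Iops=10000, MultiAttachEnabled=True)
# and even then the file system on top must be cluster-aware.
```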

The Architect Blueprint
#

graph TB
  subgraph "Multi-AZ Auto Scaling Group"
    EC2_1[EC2 Instance - AZ-1a<br/>Render Node]
    EC2_2[EC2 Instance - AZ-1b<br/>Render Node]
    EC2_3[EC2 Instance - AZ-1c<br/>Render Node]
  end
  subgraph "Amazon EFS (Regional Service)"
    EFS[(EFS File System<br/>Elastic Storage<br/>0 GB → 300 TB)]
    MT_1[Mount Target - AZ-1a]
    MT_2[Mount Target - AZ-1b]
    MT_3[Mount Target - AZ-1c]
  end
  ELB[Application Load Balancer] --> EC2_1
  ELB --> EC2_2
  ELB --> EC2_3
  EC2_1 -.NFS Mount.-> MT_1
  EC2_2 -.NFS Mount.-> MT_2
  EC2_3 -.NFS Mount.-> MT_3
  MT_1 --> EFS
  MT_2 --> EFS
  MT_3 --> EFS
  ASG[Auto Scaling Policy<br/>CPU > 70% → Scale Out] -.Controls.-> EC2_1
  ASG -.Controls.-> EC2_2
  ASG -.Controls.-> EC2_3
  style EFS fill:#FF9900,stroke:#232F3E,color:#FFFFFF
  style EC2_1 fill:#527FFF,stroke:#232F3E,color:#FFFFFF
  style EC2_2 fill:#527FFF,stroke:#232F3E,color:#FFFFFF
  style EC2_3 fill:#527FFF,stroke:#232F3E,color:#FFFFFF

Diagram Note: All EC2 instances across availability zones mount the same EFS file system concurrently via local mount targets, creating a shared storage layer that scales elastically with zero operational intervention.
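
As a rough provisioning sketch of the blueprint, the boto3 calls below create the shared file system and one mount target per Availability Zone. The region, subnet IDs, and security group ID are placeholders I have invented; substitute your own VPC resources:

```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")   # region is an assumption

# Placeholder network identifiers -- not values from the scenario.
SUBNETS = {"AZ-1a": "subnet-aaa", "AZ-1b": "subnet-bbb", "AZ-1c": "subnet-ccc"}
NFS_SECURITY_GROUP = "sg-0123456789abcdef0"          # must allow inbound TCP 2049

# One regional, elastically scaling file system for all render nodes.
fs = efs.create_file_system(
    CreationToken="techmedia-renders",                # idempotency token
    PerformanceMode="generalPurpose",
    ThroughputMode="bursting",                        # start with bursting, per the analysis
    Encrypted=True,
    Tags=[{"Key": "Name", "Value": "render-output"}],
)
fs_id = fs["FileSystemId"]

# In practice, poll describe_file_systems until LifeCycleState == "available"
# before creating mount targets; omitted here for brevity.
for az, subnet_id in SUBNETS.items():
    efs.create_mount_target(
        FileSystemId=fs_id,
        SubnetId=subnet_id,
        SecurityGroups=[NFS_SECURITY_GROUP],
    )

print(f"{fs_id}: {len(SUBNETS)} mount targets requested")
```

The Auto Scaling group itself would be defined separately, with launch template user data mounting the file system on boot so every new render node lands on the same shared namespace.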

The Decision Matrix
#

| Option | Est. Complexity | Est. Monthly Cost (100 TB Scenario) | Pros | Cons |
| --- | --- | --- | --- | --- |
| A) ECS + S3 | Medium | $2,300 (S3 Standard: $0.023/GB × 100,000 GB) | ✅ Lowest storage cost<br/>✅ Infinite scalability<br/>✅ High durability (11 nines) | ❌ Not a true file system<br/>❌ Requires app refactoring<br/>❌ FUSE mounts add latency |
| B) EKS + EBS | Very High | $10,000+ (EBS gp3: $8,000 + EKS: $73/cluster + data transfer) | ✅ High IOPS potential<br/>✅ Kubernetes ecosystem | ❌ EBS can’t share across instances<br/>❌ Complex PV/PVC management<br/>❌ Kubernetes overhead |
| C) EC2 + EFS | Low | $30,000 (EFS Standard: $0.30/GB × 100,000 GB) | ✅ True POSIX file system<br/>✅ Auto-scales elastically<br/>✅ Multi-AZ built-in<br/>✅ Zero resize operations | ⚠️ Higher cost than S3<br/>⚠️ Performance depends on throughput mode |
| D) EC2 + EBS | High | $8,000 (EBS gp3: $0.08/GB × 100,000 GB) | ✅ Lower cost than EFS<br/>✅ Predictable performance | ❌ Manual volume resizing<br/>❌ Single-AZ durability<br/>❌ No shared access |

FinOps Insight: While EFS costs 13x more than S3 for raw storage, it eliminates:

  • Development costs for S3 API integration (~$50K engineering time)
  • Ongoing FUSE mount troubleshooting (~20 hrs/month ops time)
  • Kubernetes licensing/training for EKS (~$15K/year)

For this use case, the $30K/month EFS cost is justified by operational savings and meeting native file system requirements.
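
The storage line items in the matrix are straightforward per-GB arithmetic. A quick sanity check using the rates implied above (storage only; requests, throughput, and data transfer are ignored):

```python
# Monthly storage-only cost for the 100 TB (100,000 GB) scenario,
# using the per-GB rates implied by the matrix above.
GB = 100_000

rates_per_gb = {
    "S3 Standard":  0.023,
    "EBS gp3":      0.08,
    "EFS Standard": 0.30,
}

for service, rate in rates_per_gb.items():
    print(f"{service:12s} ${rate * GB:>9,.0f}/month")

# S3 Standard  $    2,300/month
# EBS gp3      $    8,000/month
# EFS Standard $   30,000/month   (~13x the S3 figure, as noted above)
```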

Real-World Practitioner Insight
#

Exam Rule
#

“For the AWS SAA-C03, when you see ‘standard file system structure’ + ‘auto-scaling’ + ‘high availability’, immediately think Amazon EFS. If the question mentions ‘object storage API’ or ‘static website,’ choose S3. If it specifies ‘single instance’ and ‘maximum IOPS,’ pick EBS.”

Real World
#

In production, we’d add nuance:

  1. Hybrid Tiering: Use EFS Intelligent-Tiering to automatically move infrequently accessed files (old renders) to the IA storage class, reducing costs by ~92% ($0.025/GB vs. $0.30/GB); see the configuration sketch after this list.

  2. Performance Modes: For write-heavy rendering, configure EFS with Max I/O performance mode instead of General Purpose, trading slight latency for higher aggregate throughput.

  3. Cost Optimization: For completed projects (read-only access), consider archiving to S3 Glacier Deep Archive ($0.00099/GB/month) and keeping only active projects on EFS.

  4. Burst vs. Provisioned: Start with Bursting Throughput mode (included in storage price) and monitor CloudWatch metrics. Only switch to Provisioned Throughput if consistent high-speed access is needed (adds ~$6/MB/s/month).

  5. Access Patterns: If 90% of files are accessed once and archived, reconsider the architecture entirely—maybe a Lambda + S3 pipeline with EFS only for active rendering would cut costs by 70%.
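
As a sketch of items 1 and 4 above, the snippet below applies a lifecycle policy that moves cold files to EFS IA and reads the BurstCreditBalance metric that should drive any provisioned-throughput decision. The file system ID is a placeholder:

```python
from datetime import datetime, timedelta, timezone
import boto3

FS_ID = "fs-0123456789abcdef0"   # placeholder file system ID

# Item 1: move files not accessed for 30 days to the IA class,
# and bring them back to Standard on first access.
efs = boto3.client("efs")
efs.put_lifecycle_configuration(
    FileSystemId=FS_ID,
    LifecyclePolicies=[
        {"TransitionToIA": "AFTER_30_DAYS"},
        {"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"},
    ],
)

# Item 4: stay on bursting throughput and watch the burst credit balance;
# only consider provisioned throughput if the minimum trends toward zero.
cw = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
stats = cw.get_metric_statistics(
    Namespace="AWS/EFS",
    MetricName="BurstCreditBalance",
    Dimensions=[{"Name": "FileSystemId", "Value": FS_ID}],
    StartTime=end - timedelta(days=7),
    EndTime=end,
    Period=3600,
    Statistics=["Minimum"],
)
lowest = min((dp["Minimum"] for dp in stats["Datapoints"]), default=None)
print("Lowest burst credit balance over the last 7 days:", lowest)
```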

The exam wants you to recognize the pattern; reality demands you quantify the exceptions.

Weekly AWS SAA-C03 Drills: Think Like a CTO

Get 3-5 high-frequency scenarios every week. No brain-dumping, just pure architectural trade-offs.