
AWS SAA-C03 Drill: IoT Telemetry Ingestion - The Serverless vs. Managed Infrastructure Trade-off

Author: Jeff Taakey
21+ Year Enterprise Architect | Multi-Cloud Architect & Strategist.

Jeff's Architecture Insights: Go beyond static exam dumps. Jeff’s Insights is engineered to cultivate the mindset of a Production-Ready Architect. We move past ‘correct answers’ to dissect the strategic trade-offs and multi-cloud patterns required to balance reliability, security, and TCO in mission-critical environments.

While preparing for the AWS SAA-C03, many candidates get confused by IoT ingestion patterns and storage tiering strategies. In the real world, this is fundamentally a decision about Operational Overhead vs. Total Cost of Ownership (TCO). Let’s drill into a simulated scenario.

The Scenario

GlobalSense Industries operates a fleet of thousands of environmental monitoring devices deployed across remote agricultural regions. These IoT sensors continuously transmit status alerts and environmental readings. The aggregate data volume reaches 1 TB per day, with each individual alert payload averaging 2 KB.

The Data Engineering team needs to design a cloud-native ingestion and storage solution with the following constraints:

  • High availability is mandatory (sensors operate 24/7)
  • Minimal operational burden (the team is small and focused on analytics, not infrastructure)
  • Cost optimization is a priority
  • Data retention policy: Keep the most recent 14 days readily accessible for real-time dashboards and ad-hoc queries; archive older data for compliance but with infrequent access expected

Key Requirements

Design the most operationally efficient solution that ingests high-volume IoT telemetry, enforces automatic time-based storage tiering, and minimizes infrastructure management overhead.

The Options

  • A) Create an Amazon Kinesis Data Firehose delivery stream to ingest alerts. Configure the stream to deliver data to an Amazon S3 bucket. Set an S3 Lifecycle policy to transition objects to Amazon S3 Glacier after 14 days.

  • B) Launch Amazon EC2 instances across two Availability Zones behind an Elastic Load Balancer to receive alerts. Deploy a custom script on each EC2 instance to write alerts to an Amazon S3 bucket. Set an S3 Lifecycle policy to transition objects to Amazon S3 Glacier after 14 days.

  • C) Create an Amazon Kinesis Data Firehose delivery stream to ingest alerts. Configure the stream to deliver data to an Amazon OpenSearch Service cluster. Set up daily manual snapshots in OpenSearch and implement a process to manually delete data older than 14 days from the cluster.

  • D) Create an Amazon SQS Standard Queue to receive alerts. Set the message retention period to 14 days. Configure consumer applications to poll the queue, analyze message age, process recent data, and copy messages older than 14 days to an Amazon S3 bucket before deleting them from the queue.

Correct Answer

Option A.


The Architect’s Analysis

Option A: Kinesis Data Firehose → S3 → S3 Lifecycle (Glacier transition)

Step-by-Step Winning Logic

This solution achieves maximum operational efficiency through the following (a configuration sketch follows the list):

  1. Serverless Ingestion: Kinesis Firehose automatically scales to handle 1 TB/day without provisioning servers or managing scaling policies.
  2. Zero Infrastructure Management: No EC2 patching, no load balancer tuning, no custom consumer code maintenance.
  3. Automated Storage Tiering: S3 Lifecycle policies transition data to Glacier after 14 days automatically—no manual intervention, no cron jobs, no operational risk.
  4. High Availability by Default: Kinesis Firehose and S3 are multi-AZ services with built-in redundancy.
  5. Cost-Optimized Storage: S3 Standard for hot data (14 days), Glacier for cold archives (compliance, audit trails).
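
To ground the pattern, here is a minimal boto3 sketch of Option A's ingestion side. The stream, bucket, and role names are placeholders invented for illustration, and the buffering and compression settings are typical choices for batching millions of small 2 KB alerts into larger S3 objects; none of these values come from the scenario itself.

```python
import boto3

firehose = boto3.client("firehose")

# Hypothetical resource names -- replace with your own.
firehose.create_delivery_stream(
    DeliveryStreamName="globalsense-telemetry",  # assumed name
    DeliveryStreamType="DirectPut",              # producers call PutRecord directly
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",  # assumed role
        "BucketARN": "arn:aws:s3:::globalsense-telemetry-archive",           # assumed bucket
        # Batch small 2 KB alerts into larger S3 objects to cut request costs.
        "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 300},
        "CompressionFormat": "GZIP",
        "Prefix": "telemetry/",  # keeps hot data under one prefix for the Lifecycle rule
    },
)

# Once the stream is ACTIVE, producers ingest alerts with a simple PutRecord call:
firehose.put_record(
    DeliveryStreamName="globalsense-telemetry",
    Record={"Data": b'{"device_id": "sensor-042", "temp_c": 21.7}\n'},
)
```

Direct PUT keeps the producer side trivial: Firehose handles buffering, retries, and delivery, so there is no consumer fleet to deploy or scale.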

The Traps (Distractor Analysis)

Why not Option B (EC2 + ELB + Custom Scripts)?

  • Operational Overhead: Requires managing EC2 instances (patching, scaling, monitoring), ELB configuration, and custom script maintenance.
  • Higher TCO: EC2 compute costs + ELB costs exceed Firehose pricing for this workload.
  • Complexity: Custom code introduces failure points (script bugs, deployment issues).

Why not Option C (Firehose → OpenSearch + Manual Snapshots)?

  • Manual Processes: Daily manual snapshots and manual data deletion violate the “minimal operational burden” requirement.
  • Cost Inefficiency: Running an OpenSearch cluster 24/7 for simple storage/archival is overkill; OpenSearch is designed for search/analytics workloads, not cost-optimized archival.
  • Operational Risk: Manual processes = human error potential.

Why not Option D (SQS + Custom Consumers)?

  • Architectural Mismatch: SQS is a message queue, not a streaming data ingestion service. Using it for persistent storage and manual age-based archival is a misuse of the service.
  • Complex Logic: Custom consumer code to check message age, copy to S3, and delete from SQS adds unnecessary complexity.
  • SQS Retention Limits: SQS max retention is 14 days—this barely meets the requirement and offers no archival capability beyond that window.
  • Higher Operational Load: Requires deploying, monitoring, and scaling consumer applications.

The Architect Blueprint

```mermaid
graph TD
    IoT[Thousands of IoT Devices<br/>1TB/day, 2KB alerts] -->|HTTPS/TLS| Firehose[Amazon Kinesis<br/>Data Firehose<br/>Auto-scaling, Serverless]
    Firehose -->|Batching & Compression| S3Hot[Amazon S3 Bucket<br/>S3 Standard Storage<br/>0-14 days data]
    S3Hot -->|S3 Lifecycle Policy<br/>After 14 days| Glacier[Amazon S3 Glacier<br/>Long-term Archive<br/>Cost-optimized cold storage]
    style Firehose fill:#FF9900,stroke:#232F3E,stroke-width:2px,color:#fff
    style S3Hot fill:#569A31,stroke:#232F3E,stroke-width:2px,color:#fff
    style Glacier fill:#5294CF,stroke:#232F3E,stroke-width:2px,color:#fff
```

Diagram Note: IoT devices stream alerts to Kinesis Firehose (serverless ingestion), which delivers batched data to S3 Standard; an automated Lifecycle policy transitions data to Glacier after 14 days—zero manual intervention required.
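
The 14-day tiering rule in the diagram is a one-time bucket configuration. A minimal sketch, reusing the assumed bucket and prefix names from the ingestion example above:

```python
import boto3

s3 = boto3.client("s3")

# One-time setup: transition objects to Glacier 14 days after creation.
s3.put_bucket_lifecycle_configuration(
    Bucket="globalsense-telemetry-archive",  # assumed bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-after-14-days",
                "Status": "Enabled",
                "Filter": {"Prefix": "telemetry/"},  # matches the Firehose prefix
                "Transitions": [
                    # GLACIER = S3 Glacier Flexible Retrieval; use GLACIER_IR instead
                    # if occasional millisecond access to archives is needed.
                    {"Days": 14, "StorageClass": "GLACIER"}
                ],
            }
        ]
    },
)
```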

The Decision Matrix

| Option | Est. Complexity | Est. Monthly Cost (1 TB/day) | Pros | Cons |
| --- | --- | --- | --- | --- |
| A: Firehose + S3 + Lifecycle | Low | ~$450-600<br/>(Firehose: $280, S3 Standard 14d: $70, Glacier: $30, Data Transfer: $50) | ✅ Fully serverless<br/>✅ Auto-scaling<br/>✅ Automated archival<br/>✅ High availability built-in | ⚠️ Slightly higher ingestion cost vs. raw EC2 (at massive scale) |
| B: EC2 + ELB + S3 | High | ~$800-1,200<br/>(EC2: $300-500, ELB: $150, S3: $100, Ops time: $200-400) | ✅ Full control over logic | ❌ EC2 management overhead<br/>❌ Custom code maintenance<br/>❌ Higher TCO<br/>❌ Scaling complexity |
| C: Firehose + OpenSearch | Very High | ~$1,500-2,500<br/>(Firehose: $280, OpenSearch cluster: $1,000-2,000, Snapshots: $100-200) | ✅ Real-time search capability | ❌ Manual snapshot management<br/>❌ Manual data deletion<br/>❌ Expensive for archival use case<br/>❌ Operational complexity |
| D: SQS + Custom Consumers | High | ~$650-900<br/>(SQS: $100-150, EC2 consumers: $300-500, S3: $100, Dev/Ops: $150-250) | ✅ Decoupled architecture | ❌ SQS not designed for storage<br/>❌ Custom age-checking logic<br/>❌ Complex consumer management<br/>❌ No native archival |

FinOps Insight: Option A delivers the lowest TCO once engineering time is priced in alongside raw infrastructure spend. The slightly higher Firehose ingestion cost is more than offset by zero server management and automated lifecycle transitions.

Real-World Practitioner Insight

Exam Rule

For the SAA-C03 exam, when you see “high-volume streaming data ingestion + minimal operational overhead + automated storage tiering”, always favor Kinesis Firehose + S3 + Lifecycle policies. This is the canonical AWS serverless ingestion pattern.

Real World

In production, we’d likely enhance Option A with:

  • Data transformation in Firehose (e.g., a Lambda function to filter/enrich records before they land in S3; see the transform sketch after this list)
  • Partitioning strategy in S3 (by device ID, timestamp) for efficient Athena queries
  • S3 Intelligent-Tiering instead of fixed Lifecycle rules if access patterns are unpredictable
  • CloudWatch alarms on Firehose delivery failures
  • AWS Glue Data Catalog + Athena for SQL queries on the 14-day “hot” data
  • Consideration of S3 Glacier Instant Retrieval (vs. Glacier Flexible Retrieval) if occasional fast access to archived data is needed
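
As referenced in the first bullet, a Firehose transformation Lambda follows a fixed event contract: records arrive base64-encoded, and each must be returned with a recordId, a result (Ok, Dropped, or ProcessingFailed), and re-encoded data. The heartbeat filter and enrichment field below are invented examples, not part of the scenario:

```python
import base64
import json

def handler(event, context):
    """Firehose data-transformation Lambda: drop heartbeat alerts, enrich the rest."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        if payload.get("type") == "heartbeat":  # hypothetical filter rule
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue

        payload["ingest_source"] = "firehose"   # hypothetical enrichment
        data = (json.dumps(payload) + "\n").encode()
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode(),
        })
    return {"records": output}
```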

We’d also evaluate AWS IoT Core as the ingestion layer (instead of raw HTTPS to Firehose) if devices need any of the following; a routing-rule sketch follows the list:

  • Device shadows
  • Rules engine for routing
  • Device registry/authentication
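
If IoT Core fronts the fleet, its rules engine can forward matching MQTT messages into the same Firehose stream with no device-side changes. A minimal sketch, reusing the hypothetical stream name from earlier; the rule name, topic filter, and role ARN are likewise assumptions:

```python
import boto3

iot = boto3.client("iot")

# Route every message published to sensors/<id>/telemetry into Firehose.
iot.create_topic_rule(
    ruleName="telemetry_to_firehose",  # assumed rule name
    topicRulePayload={
        "sql": "SELECT * FROM 'sensors/+/telemetry'",
        "awsIotSqlVersion": "2016-03-23",
        "actions": [
            {
                "firehose": {
                    "roleArn": "arn:aws:iam::123456789012:role/iot-firehose-role",  # assumed
                    "deliveryStreamName": "globalsense-telemetry",
                    "separator": "\n",  # newline-delimit records for Athena-friendly objects
                }
            }
        ],
    },
)
```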

However, for the exam scenario as stated, Option A is the textbook answer.

Weekly AWS SAA-C03 Drills: Think Like a CTO

Get 3-5 high-frequency scenarios every week. No brain-dumping, just pure architectural trade-offs.