GCP PCA Drill: Compute Engine Zonal Resiliency - The Multi-Zone Disk Trade-off

While preparing for the GCP Professional Cloud Architect (PCA) exam, many candidates get confused by high availability design patterns in Compute Engine. In the real world, this is fundamentally a decision about cross-zone data durability versus operational complexity and cost overhead. Let’s drill into a simulated scenario.

The Scenario
#

HyperArc Gaming, a global multiplayer game studio, hosts its real-time game servers on Google Compute Engine VM instances. The studio faces strict SLA requirements: whenever a zonal outage disrupts one availability zone, the game server and its application data must be restored instantly in another zone, minimizing player downtime and data loss. HyperArc wants the fastest failover with the latest persisted game state, while also controlling operational and storage costs.

Requirements
#

Design a highly available Compute Engine architecture so that if a zone experiences an outage, the game server app is restored quickly in another zone with access to the latest application data.

The Options
#

A) Create a snapshot schedule for the disk containing game state data. If a zonal outage occurs, restore the disk from the latest snapshot in the same zone.
B) Use an instance template for game servers and attach a regional persistent disk for data storage. On zonal failure, spin up a new instance in another zone within the same region, using the regional disk.
C) Create a snapshot schedule for the disk containing game state data. On zonal failure, restore the disk from the latest snapshot in another zone within the same region.
D) Use an instance template for game servers with a regional persistent disk, but on failure spin up a new instance in another region, re-attaching the disk.

Correct Answer
#

Option B.

The Architect’s Analysis
#

Step-by-Step Winning Logic
#

Using an instance template makes VM provisioning scalable and repeatable. Attaching a regional persistent disk ensures that the application data is synchronously replicated across two zones in the same region. When one zone fails, a new instance created from the template in the surviving zone can have the disk force-attached to it, so failover needs no restore step and the latest game state stays intact. That matches the requirement for fastest recovery and minimal data loss. This approach embodies Google’s SRE principle of reducing toil by using managed services that handle replication automatically while supporting “cattle not pets” VM management.
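The failover flow described above can be sketched as a small Python helper that assembles the relevant gcloud commands. This is a minimal sketch, not HyperArc's actual runbook: the resource names (`game-state-disk`, `game-server-template`, `game-server-standby`), the region, and the disk size are hypothetical, and the flags shown should be checked against the current gcloud reference before use. The helper only builds the command lists; it does not run them.

```python
from typing import List


def zone_in_region(zone: str, region: str) -> bool:
    """Zones are named <region>-<letter>, e.g. us-central1-a is in us-central1."""
    return zone.rsplit("-", 1)[0] == region


def regional_disk_create_cmd(disk: str, region: str,
                             replica_zones: List[str], size_gb: int) -> List[str]:
    """Build the command that creates a disk replicated across two zones."""
    if len(set(replica_zones)) != 2:
        raise ValueError("a regional persistent disk needs two distinct replica zones")
    if not all(zone_in_region(z, region) for z in replica_zones):
        raise ValueError("both replica zones must belong to the disk's region")
    return [
        "gcloud", "compute", "disks", "create", disk,
        f"--region={region}",
        f"--replica-zones={','.join(replica_zones)}",
        f"--size={size_gb}GB",
    ]


def failover_cmds(template: str, standby_vm: str, disk: str,
                  region: str, surviving_zone: str) -> List[List[str]]:
    """Build the two failover steps: create a VM from the template in the
    surviving zone, then force-attach the regional disk to it."""
    if not zone_in_region(surviving_zone, region):
        # Mirrors why option D fails: the disk cannot follow a VM to another region.
        raise ValueError("regional disks can only be attached within their own region")
    create_vm = [
        "gcloud", "compute", "instances", "create", standby_vm,
        f"--source-instance-template={template}",
        f"--zone={surviving_zone}",
    ]
    attach_disk = [
        "gcloud", "compute", "instances", "attach-disk", standby_vm,
        f"--disk={disk}",
        "--disk-scope=regional",
        "--force-attach",  # the original VM's zone is down, so it cannot detach cleanly
        f"--zone={surviving_zone}",
    ]
    return [create_vm, attach_disk]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    setup = regional_disk_create_cmd(
        "game-state-disk", "us-central1", ["us-central1-a", "us-central1-b"], 500)
    print(" ".join(setup))
    for cmd in failover_cmds("game-server-template", "game-server-standby",
                             "game-state-disk", "us-central1", "us-central1-b"):
        print(" ".join(cmd))
```

Keeping the commands as lists rather than shell strings makes them easy to pass to `subprocess.run` later without quoting problems, and the region check fails early if someone tries to fail over outside the disk's region.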

The Traps (Distractor Analysis)
#

Why not A?
Restoring snapshots in the same zone fails to address zonal outages, providing no high availability. Also, snapshot restores are not instantaneous, increasing recovery time.
Why not C?
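Restoring snapshots in another zone is possible but slower than using regional persistent disks. Any writes made after the most recent snapshot are lost, so the restored disk does not hold the latest game state. The restore itself also adds manual steps and delay that violate the fast recovery requirement.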
Why not D?
Spinning up instances in a different region introduces cross-region latency, potential data residency complications, and extra operational complexity. More importantly, regional persistent disks are scoped to a region and cannot be attached across regions, so the solution is technically invalid.
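The distractor reasoning above can be condensed into a toy model. The sketch below is a simplification under stated assumptions: it treats the worst-case data loss of a snapshot restore as one full snapshot interval, treats synchronous replication as zero loss of acknowledged writes, and uses a hypothetical 60-minute schedule. The class and function names are illustrative and do not correspond to any Google Cloud API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RecoveryOption:
    name: str
    recovery_scope: str  # "same_zone", "other_zone", or "other_region"
    data_source: str     # "snapshot" or "regional_disk"


def evaluate(option: RecoveryOption,
             snapshot_interval_min: int) -> Tuple[bool, Optional[int], str]:
    """Return (survives zonal outage, worst-case data loss in minutes, reason)."""
    if option.recovery_scope == "same_zone":
        return False, None, "the failed zone is where the restore would happen"
    if option.data_source == "regional_disk":
        if option.recovery_scope == "other_region":
            return False, None, "a regional disk cannot be attached in another region"
        return True, 0, "synchronous replica already exists in the surviving zone"
    # Snapshot restored outside the failed zone: writes since the last snapshot are gone.
    return True, snapshot_interval_min, "restore works but loses writes since last snapshot"


OPTIONS = [
    RecoveryOption("A", "same_zone", "snapshot"),
    RecoveryOption("B", "other_zone", "regional_disk"),
    RecoveryOption("C", "other_zone", "snapshot"),
    RecoveryOption("D", "other_region", "regional_disk"),
]

if __name__ == "__main__":
    for opt in OPTIONS:
        ok, loss, reason = evaluate(opt, snapshot_interval_min=60)  # assumed hourly schedule
        print(opt.name, ok, loss, reason)
```

Under these assumptions only B and C survive the outage, and only B does so without losing recent writes, which is why the requirement for the latest application data separates the two.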

The Architect Blueprint
#

Mermaid Diagram illustrating the failover flow:

graph TD
    User([Players]) --> LoadBalancer[Global Game Load Balancer]
    LoadBalancer --> GCE_VM_1[Compute Engine VM - Zone A]
    LoadBalancer --> GCE_VM_2[Compute Engine VM - Zone B]
    GCE_VM_1 -->|Attached to| RegionalDisk[Regional Persistent Disk]
    GCE_VM_2 -->|Attached to| RegionalDisk
    style LoadBalancer fill:#4285F4,stroke:#333,color:#fff
    style RegionalDisk fill:#f4a261,stroke:#333,color:#000

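Diagram Note: The regional persistent disk synchronously replicates data across two zones, enabling VM failover without data loss or manual disk restore. The diagram links both VMs to the disk for simplicity; in a typical setup the disk is attached to the active VM and is force-attached to the VM in the surviving zone only during failover.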

The Decision Matrix
#

Option	Est. Complexity	Est. Monthly Cost	Pros	Cons
A	Low	Low (snapshot storage, no replication)	Low cost snapshot backups	Restore only in same zone; no cross-zone HA, slow recovery
B	Medium	Medium-High (regional disk premium)	Near-instant failover; synchronous data replication; managed	Higher disk cost; slight complexity in setup
C	Medium	Low-Medium (snapshot storage)	Allows cross-zone restore	Slow restore process; increased recovery time; manual steps
D	High	Very High (cross-region resources)	Potential for disaster recovery across regions	Regional disks can’t attach cross-region; complexity; latency
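To make the cost column concrete, here is a rough arithmetic sketch comparing the disk bill for option B against option C. Every number in it is a placeholder assumption rather than a quoted price: the per-GB rates are invented for illustration, and the regional multiplier of 2 simply reflects that a regional disk stores two replicas. Check the current Compute Engine pricing page for real figures. Amounts are kept in integer cents to avoid floating-point rounding.

```python
# Placeholder prices in cents per GB-month; NOT real Google Cloud prices.
ZONAL_DISK_CENTS_PER_GB = 10
SNAPSHOT_CENTS_PER_GB = 5
REGIONAL_MULTIPLIER = 2  # assumption: two replicas cost roughly twice one


def regional_disk_monthly_cents(provisioned_gb: int) -> int:
    """Option B: one regional disk, billed on provisioned size."""
    return provisioned_gb * ZONAL_DISK_CENTS_PER_GB * REGIONAL_MULTIPLIER


def zonal_disk_with_snapshots_monthly_cents(provisioned_gb: int,
                                            snapshot_stored_gb: int) -> int:
    """Option C: one zonal disk plus the snapshot data kept for restores."""
    return (provisioned_gb * ZONAL_DISK_CENTS_PER_GB
            + snapshot_stored_gb * SNAPSHOT_CENTS_PER_GB)


if __name__ == "__main__":
    # Hypothetical workload: 500 GB disk, 200 GB of retained snapshot data.
    option_b = regional_disk_monthly_cents(500)
    option_c = zonal_disk_with_snapshots_monthly_cents(500, 200)
    print(f"Option B: ${option_b / 100:.2f} per month")
    print(f"Option C: ${option_c / 100:.2f} per month")
    print(f"Premium for zero-loss failover: ${(option_b - option_c) / 100:.2f}")
```

The point of the exercise is the shape of the trade-off rather than the totals: the regional disk premium scales with provisioned capacity, while snapshot cost scales with how much changed data is retained.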

Real-World Practitioner Insight
#

Exam Rule
#

For the exam, always pick Regional Persistent Disks + Instance Templates when you see zonal failover with latest data requirements.

Real World
#

Often, snapshots are used for backups, but not as a primary HA mechanism due to restore latency. For ultra-low RTO, regional persistent disks are essential. Cross-region failover typically requires more complex solutions like managed databases with cross-region replication or multi-region storage buckets.
