💾 Storing and Managing Data - PDE Practice Questions

Choose the right storage: BigQuery, Cloud Storage, Cloud SQL, Spanner, Bigtable, Firestore based on requirements.

6 Questions Available
1 Exam Domain

PDE Storing & Managing Question Bank (6 Questions)

Browse all 6 practice questions covering Storing and Managing Data for the PDE certification exam. Each question includes the full answer and a detailed explanation to help you understand the concepts.

  1. Question 1: Ingesting and Processing Data

    You need to replicate changes from a Cloud SQL PostgreSQL database to BigQuery in near real-time. What approach should you use?

    A. Export and import daily
    B. Datastream: a serverless change data capture (CDC) service that continuously replicates database changes to BigQuery with low latency
    C. Write triggers in PostgreSQL
    D. Use a cron job to query changes
    Correct Answer: B
    Explanation:

    Datastream is a serverless CDC service. Sources: MySQL, PostgreSQL, Oracle, SQL Server, AlloyDB. Destinations: BigQuery and Cloud Storage. Features: continuous replication, propagation of schema changes, minimal impact on the source. Setup: create connection profiles (source and destination), then create a stream and select the tables to replicate. Near real-time means seconds to minutes of latency.
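
To make the CDC idea concrete, here is a minimal Python sketch of how an ordered stream of change events can be applied to a replica keyed by primary key. It does not call Datastream or any Google Cloud API; the event shape (`op`, `key`, `row`) and the `apply_changes` helper are hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical event shape (an assumption, not a Datastream format):
# {"op": "insert" | "update" | "delete", "key": <primary key>, "row": {...}}
ChangeEvent = dict[str, Any]


def apply_changes(replica: dict[Any, dict], events: list[ChangeEvent]) -> dict[Any, dict]:
    """Apply ordered change events to an in-memory replica keyed by primary key."""
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            # Upsert: merge the changed columns into the existing replica row.
            replica[key] = {**replica.get(key, {}), **event["row"]}
        elif op == "delete":
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return replica


events = [
    {"op": "insert", "key": 1, "row": {"id": 1, "status": "new"}},
    {"op": "update", "key": 1, "row": {"status": "paid"}},
    {"op": "insert", "key": 2, "row": {"id": 2, "status": "new"}},
    {"op": "delete", "key": 2},
]
replica = apply_changes({}, events)
```

The point of the sketch is that only the changes travel to the destination, which is why CDC can stay close to real time where a daily export or a polling cron job cannot.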

  2. Question 2: Storing and Managing Data

    What file format should you use when storing data in Cloud Storage for BigQuery external tables?

    A. CSV always
    B. Parquet or ORC: columnar formats that support predicate pushdown, compression, and schema evolution, significantly reducing data scanned by BigQuery
    C. JSON always
    D. XML
    Correct Answer: B
    Explanation:

    Parquet/ORC: columnar (read only the needed columns), compressed (smaller storage), schema embedded in the file, predicate pushdown (BigQuery can skip irrelevant row groups). By contrast, CSV and JSON are row-based (every column must be scanned), offer no predicate pushdown, and produce larger files. Parquet is the usual default on Google Cloud. BigQuery native tables use Capacitor (Google's columnar format). For external tables: Parquet for query performance, Avro when schema evolution matters most.
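
As an illustration of the external-table setup described above, the following Python sketch builds a BigQuery `CREATE EXTERNAL TABLE` statement for Parquet files in Cloud Storage. The helper name `external_table_ddl`, the dataset and table name, and the bucket path are all hypothetical; the sketch only assembles the SQL text and does not validate or escape its inputs.

```python
def external_table_ddl(table: str, uris: list[str], fmt: str = "PARQUET") -> str:
    """Build a CREATE EXTERNAL TABLE statement for files stored in Cloud Storage."""
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (\n"
        f"  format = '{fmt}',\n"
        f"  uris = [{uri_list}]\n"
        f");"
    )


# Hypothetical dataset, table, and bucket names.
ddl = external_table_ddl(
    "my_dataset.events_ext",
    ["gs://example-bucket/events/*.parquet"],
)
print(ddl)
```

Because Parquet files carry their own schema, no column list is needed in the statement; a CSV external table would typically need either an explicit schema or schema auto-detection.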

  3. Question 3: Ingesting and Processing Data

    Which approach efficiently loads large CSV files from Cloud Storage into BigQuery?

    A. INSERT statements via API
    B. BigQuery load job with auto-detect schema
    C. Streaming inserts
    D. BigQuery COPY command
    Correct Answer: B
    Explanation:

    BigQuery load jobs are the most efficient way to bulk-load data from Cloud Storage. They offer automatic schema detection, support formats such as CSV, JSON, Avro, Parquet, and ORC, and incur no charge when they run on the shared slot pool.
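
One way to start such a load job is the `bq` command-line tool. The Python sketch below only assembles the argument list for `bq load` with schema auto-detection; it does not run the command. The helper name `bq_load_command`, the table name, and the bucket path are hypothetical.

```python
import shlex


def bq_load_command(table: str, uri: str, skip_header_rows: int = 1) -> list[str]:
    """Build the argv for a `bq load` of CSV files with schema auto-detection."""
    return [
        "bq", "load",
        "--source_format=CSV",
        "--autodetect",
        f"--skip_leading_rows={skip_header_rows}",
        table,  # destination table as dataset.table
        uri,    # Cloud Storage source; wildcards select multiple files
    ]


# Hypothetical dataset, table, and bucket names.
cmd = bq_load_command("my_dataset.sales", "gs://example-bucket/sales/*.csv")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would submit the job, assuming the Google Cloud CLI is installed and authenticated.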

  4. Question 4: Storing and Managing Data

    When should you use Firestore vs. Bigtable for NoSQL storage?

    A. Always use Firestore
    B. Firestore for document-oriented data with rich queries and real-time sync; Bigtable for high-throughput, low-latency time-series/IoT data
    C. Always use Bigtable
    D. They are the same
    Correct Answer: B
    Explanation:

    Firestore: document model, rich queries, real-time sync for mobile/web. Bigtable: wide-column, millisecond latency at petabyte scale, optimized for high-throughput sequential reads/writes (IoT, time-series).

  5. Question 5: Storing the Data

    When should you use Cloud Spanner vs Cloud SQL?

    A. Always use Spanner
    B. Spanner for global scale, strong consistency, and horizontal scaling. Cloud SQL for regional workloads, standard PostgreSQL/MySQL compatibility, and lower cost.
    C. Always use Cloud SQL
    D. Same performance
    Correct Answer: B
    Explanation:

    Spanner: globally distributed, strongly consistent, horizontal scaling, up to 99.999% availability SLA (multi-region configurations). Cost: higher. Cloud SQL: regional, vertical scaling (up to 128 vCPU), standard MySQL/PostgreSQL/SQL Server. Choose Spanner for: global apps, financial transactions. Choose Cloud SQL for: regional, standard workloads, budget constraints.
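
The rule of thumb in this explanation can be written down as a small decision helper. This is a deliberately simplified sketch: the `Workload` fields and the `choose_relational_store` function are hypothetical, and a real decision would also weigh cost, engine compatibility, and migration effort.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of the requirements stated in an exam scenario."""
    global_users: bool              # users read and write from multiple regions
    needs_horizontal_scaling: bool  # write load exceeds what one instance can take


def choose_relational_store(w: Workload) -> str:
    """Pick Spanner for global or horizontally scaled workloads, else Cloud SQL."""
    if w.global_users or w.needs_horizontal_scaling:
        return "Cloud Spanner"
    return "Cloud SQL"


regional_app = Workload(global_users=False, needs_horizontal_scaling=False)
global_ledger = Workload(global_users=True, needs_horizontal_scaling=True)
print(choose_relational_store(regional_app))
print(choose_relational_store(global_ledger))
```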

  6. Question 6: Storing Data

    When should you use Bigtable vs BigQuery?

    A. Same use case
    B. Bigtable: millisecond-latency reads/writes for time-series, IoT, and real-time analytics workloads. BigQuery: serverless analytical queries (seconds) for OLAP, reporting, and ad-hoc SQL analysis on large datasets.
    C. Bigtable for SQL queries
    D. BigQuery for real-time writes
    Correct Answer: B
    Explanation:

    Bigtable: NoSQL wide-column, <10ms reads/writes, TB-PB scale, row-key design critical, no SQL (but integrates with BigQuery for analytics). Use for: time-series (IoT, financial), real-time serving, and high-throughput workloads. BigQuery: columnar SQL analytics, seconds per query, serverless, automatic optimization. Use for: data warehouse, BI/reporting, ML (BQML), and log analytics.
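
Because Bigtable stores rows sorted by key, the remark that row-key design is critical deserves a concrete case. The sketch below shows one common time-series pattern: lead with the device ID so writes spread across devices, then append a reversed, zero-padded timestamp so the newest reading for a device sorts first. The key layout, the `MAX_TS_MILLIS` bound, and the `row_key` helper are assumptions for illustration, not a prescribed schema.

```python
# Largest 13-digit value; an assumed upper bound for epoch timestamps in milliseconds.
MAX_TS_MILLIS = 9_999_999_999_999


def row_key(device_id: str, ts_millis: int) -> str:
    """Build a key of the form <device_id>#<reversed timestamp, zero-padded>."""
    if not 0 <= ts_millis <= MAX_TS_MILLIS:
        raise ValueError("timestamp out of supported range")
    reversed_ts = MAX_TS_MILLIS - ts_millis
    # Fixed width keeps lexicographic order consistent with numeric order.
    return f"{device_id}#{reversed_ts:013d}"


older = row_key("sensor-42", 1_700_000_000_000)
newer = row_key("sensor-42", 1_700_000_060_000)

# Rows sort lexicographically, so the newer reading comes first in a scan.
print(sorted([older, newer]))
```

A key that starts with the timestamp instead would send all current writes to the same part of the key space, which is the hotspotting problem this layout tries to avoid.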

Key Storing & Managing Concepts for PDE

BigQuery, Cloud Storage, Cloud SQL, Spanner, Bigtable, Firestore, data lake

PDE Storing & Managing Exam Tips

Storing and Managing Data questions in PDE are typically scenario-based. Focus on service-level decision making aligned to the official exam objectives. Priority concepts: BigQuery, Cloud Storage, Cloud SQL, Spanner, Bigtable, Firestore.

What PDE Expects

  • Select the most practical, secure, and scalable answer for the stated scenario.
  • Storing & Managing scenarios for PDE are frequently mapped to Domain 3 (~20%), so read the objective carefully before picking controls or architecture.
  • Expect multi-service scenarios where Storing & Managing interacts with IAM, networking, storage, or observability patterns rather than appearing as an isolated service question.
  • When two options are both technically valid, prefer the choice that best aligns with the exam's operational scope (Professional) and managed-service best practices.

High-Value Storing & Managing Concepts

  • Know the core Storing & Managing building blocks cold: BigQuery, Cloud Storage, Cloud SQL, Spanner.
  • Review the edge-case features and limits for Bigtable and Firestore; these details are commonly used to differentiate answer choices.
  • Practice service-integration reasoning: how Storing & Managing pairs with Data Processing, Ingesting & Processing in real deployment patterns.
  • For PDE, explain why the chosen Storing & Managing design meets reliability, security, and cost expectations better than the alternatives.

Common PDE Traps

  • Watch for answers that partially solve the requirement but miss operational constraints.
  • Questions in Storing and Managing Data often include distractors that look correct for Storing & Managing but violate least-privilege, durability, or availability requirements.
  • Avoid picking options purely by feature name; validate data path, failure handling, and governance impact before answering.
  • If the prompt hints at automation or repeatability, eliminate manual-only operational answers first.

Fast Review Checklist

  • Can you compare at least two Storing & Managing implementation paths and justify which one best fits the scenario?
  • Can you map the chosen answer back to Storing and Managing Data (~20%) outcomes for PDE?
  • Can you explain security and access boundaries for Storing & Managing without relying on default-open assumptions?
  • Can you describe how Storing & Managing integrates with Data Processing and Ingesting & Processing during failure, scaling, and monitoring events?
