📋 Redshift Cheat Sheet

Amazon Redshift is the primary data warehouse service on the DEA-C01 exam, which covers loading, distribution, optimization, and Spectrum queries.

Architecture

  • Leader node parses queries and distributes work; compute nodes store data and execute queries.
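  • Distribution styles are AUTO, KEY, EVEN, and ALL; choose based on join patterns and table size. KEY co-locates rows that share a join column value, ALL copies a small table to every node, and EVEN spreads rows round-robin.
  • Sort keys store rows in sorted order, so zone maps (per-block min/max values) let range-restricted queries skip blocks that cannot match.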
  • Redshift Serverless provides on-demand capacity without cluster management.
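
The distribution and sort-key bullets above can be illustrated with a toy model. This is a sketch for intuition only, not how Redshift is implemented: the helper names `distribute` and `blocks_scanned` are hypothetical, CRC32 stands in for Redshift's internal hash, each slice is treated as its own node, a "block" is 100 values rather than a real 1 MB block, and AUTO is not modeled.

```python
import zlib
from collections import defaultdict


def distribute(rows, style, num_slices, key=None):
    """Toy placement of rows onto slices for EVEN, KEY, and ALL."""
    slices = defaultdict(list)
    for i, row in enumerate(rows):
        if style == "EVEN":
            # Round-robin, regardless of the row's values.
            slices[i % num_slices].append(row)
        elif style == "KEY":
            # Stand-in hash (CRC32); Redshift's real hash function differs.
            h = zlib.crc32(str(row[key]).encode())
            slices[h % num_slices].append(row)
        elif style == "ALL":
            # Full copy everywhere (the toy treats each slice as its own node).
            for s in range(num_slices):
                slices[s].append(row)
        else:
            raise ValueError(f"unsupported style: {style}")
    return dict(slices)


def blocks_scanned(values, block_size, lo, hi):
    """Count blocks whose min/max range overlaps the filter [lo, hi]."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return sum(1 for b in blocks if min(b) <= hi and max(b) >= lo)


orders = [{"customer_id": i % 4, "order_id": i} for i in range(12)]
by_key = distribute(orders, "KEY", num_slices=3, key="customer_id")
# Each customer_id lands on exactly one slice, so a join on it needs no shuffle.
for slice_id, rows in sorted(by_key.items()):
    print(slice_id, sorted({r["customer_id"] for r in rows}))

sorted_col = list(range(1000))                        # column stored in sort-key order
unsorted_col = [(i * 7) % 1000 for i in range(1000)]  # same values, interleaved
print(blocks_scanned(sorted_col, 100, 250, 349))      # 2 of 10 blocks
print(blocks_scanned(unsorted_col, 100, 250, 349))    # all 10 blocks
```

With the column in sort order, the filter `250 <= x <= 349` touches only the two blocks whose min/max range overlaps it. With the same values interleaved, every block's range overlaps the filter, so nothing can be skipped.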

Loading and Querying

  • The COPY command is the fastest way to load data from S3, DynamoDB, or EMR because it loads in parallel across the compute nodes.
  • UNLOAD exports query results to S3 in parallel.
  • Redshift Spectrum queries data directly in S3 without loading it into Redshift tables.
  • Materialized views store precomputed results for frequently executed queries.
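
As a rough sketch of what the statements behind these bullets look like, the hypothetical helpers below assemble COPY, UNLOAD, and CREATE EXTERNAL SCHEMA text as plain strings. The table, bucket, database, and role names are placeholders invented for illustration, and the string formatting is not safe for untrusted input; treat it as a memory aid for the clause order rather than production code.

```python
def build_copy(table, s3_path, iam_role, fmt="PARQUET"):
    """Return a COPY statement that loads `table` from an S3 prefix."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS {fmt};"
    )


def build_unload(query, s3_prefix, iam_role, fmt="PARQUET"):
    """Return an UNLOAD statement; single quotes in the query are doubled."""
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS {fmt};"
    )


def build_external_schema(schema, catalog_db, iam_role):
    """Return DDL exposing a Glue Data Catalog database to Spectrum queries."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role}';"
    )


# Placeholder values, assumed for illustration only.
ROLE = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"

print(build_copy("sales", "s3://example-bucket/sales/", ROLE))
print(build_unload("SELECT * FROM sales WHERE region = 'EU'",
                   "s3://example-bucket/exports/sales_", ROLE))
print(build_external_schema("spectrum_demo", "demo_db", ROLE))
```

Once an external schema exists, tables registered in the catalog database can be queried as `spectrum_demo.<table>` alongside local Redshift tables, while the data itself stays in S3.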

Exam Cues

  • Need to query S3 data without loading it: Redshift Spectrum.
  • Need the fastest S3-to-Redshift ingestion: COPY command.
  • Need to prioritize critical queries: workload management (WLM) queues.
  • Need to reduce storage cost for infrequently accessed data: keep it in S3 and query via Spectrum.
