📋 Athena Cheat Sheet

On the DEA-C01 exam, Athena questions cover serverless SQL analytics over S3 data lakes, cost optimization, and federated queries.

Query Optimization

  • Partition data in S3 to enable partition pruning and reduce scanned data.
  • Use columnar formats (Parquet, ORC) to scan only needed columns.
  • Compress data (Snappy, ZSTD, GZIP) to reduce scan volume and cost.
  • CTAS (CREATE TABLE AS SELECT) writes query results to a new table, which can convert raw data into a partitioned, compressed, columnar layout.
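
The optimization points above can be combined in one CTAS statement. The following is a minimal sketch that only builds the SQL text. The table names, column names, and bucket path are hypothetical, and the helper `build_ctas` is an illustration, not part of any AWS SDK. It assumes an Athena engine version that accepts the `write_compression` table property.

```python
def build_ctas(target_table, source_table, columns, partition_columns, location):
    """Return an Athena CTAS statement that writes partitioned, Snappy-compressed Parquet."""
    # Athena expects partition columns to come last in the SELECT list.
    select_list = ", ".join(list(columns) + list(partition_columns))
    partitions = ", ".join(f"'{c}'" for c in partition_columns)
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        "  write_compression = 'SNAPPY',\n"
        f"  external_location = '{location}',\n"
        f"  partitioned_by = ARRAY[{partitions}]\n"
        ")\n"
        f"AS SELECT {select_list}\n"
        f"FROM {source_table}"
    )


# Hypothetical names, used only to show the shape of the statement.
sql = build_ctas(
    target_table="analytics.events_parquet",
    source_table="raw.events_csv",
    columns=["event_id", "user_id", "event_type"],
    partition_columns=["event_date"],
    location="s3://example-bucket/events_parquet/",
)
print(sql)
```

The helper does no escaping or validation, so it is only suitable for trusted, hard-coded identifiers.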

Management

  • Workgroups separate users, enforce cost limits, and track query usage.
  • Federated queries use data source connectors to query RDS, DynamoDB, and other sources.
  • Athena uses the Glue Data Catalog as its default metastore.
  • Pricing is based on the amount of data scanned, so scanning less data reduces both cost and latency.
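
Because pricing follows bytes scanned, the effect of pruning can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: a rate of 5 USD per TB and a 10 MB per-query minimum (check the current pricing page for your region), with 1 TB taken as 1024^4 bytes. The scan fractions in the scenarios are made-up illustrations, not measurements.

```python
BYTES_PER_TB = 1024 ** 4                 # assumption: binary terabyte
MIN_BYTES_PER_QUERY = 10 * 1024 ** 2     # assumption: 10 MB minimum charge per query


def estimate_query_cost(bytes_scanned, usd_per_tb=5.0):
    """Estimate the cost of one query from the bytes it scans (assumed rate)."""
    billable = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable / BYTES_PER_TB * usd_per_tb


# Hypothetical scenarios for a 1 TB raw dataset.
scenarios = {
    "full scan of raw CSV": 1.0,
    "one partition out of 100": 0.01,
    "one partition, Parquet, 2 of 10 columns": 0.002,
}

for label, fraction in scenarios.items():
    cost = estimate_query_cost(int(BYTES_PER_TB * fraction))
    print(f"{label}: about {cost:.4f} USD")
```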

Exam Cues

  • Need serverless SQL on S3: Athena.
  • Need to reduce Athena costs: partition, use Parquet, and compress.
  • Need to query external databases from Athena: federated queries.
  • Need to limit per-team query spend: Athena workgroups with data scan limits.
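
For the last cue, a per-query scan limit is set in the workgroup configuration. The sketch below only assembles the request parameters, so it runs without AWS credentials. The workgroup name and bucket are hypothetical, `workgroup_request` is an illustrative helper, and the 10,000,000-byte floor reflects my understanding of the API minimum for `BytesScannedCutoffPerQuery`.

```python
MIN_CUTOFF_BYTES = 10_000_000  # assumed API minimum for the per-query cutoff


def workgroup_request(name, output_location, max_bytes_per_query):
    """Build parameters for an Athena workgroup with a per-query scan limit."""
    if max_bytes_per_query < MIN_CUTOFF_BYTES:
        raise ValueError("per-query cutoff is below the assumed API minimum")
    return {
        "Name": name,
        "Configuration": {
            "ResultConfiguration": {"OutputLocation": output_location},
            # Prevent clients from overriding the workgroup settings.
            "EnforceWorkGroupConfiguration": True,
            # Publish metrics so per-team usage can be tracked.
            "PublishCloudWatchMetricsEnabled": True,
            # Queries that scan more than this many bytes are cancelled.
            "BytesScannedCutoffPerQuery": max_bytes_per_query,
        },
    }


# Hypothetical team workgroup capped at 100 GiB per query.
params = workgroup_request(
    name="team-analytics",
    output_location="s3://example-bucket/athena-results/team-analytics/",
    max_bytes_per_query=100 * 1024 ** 3,
)
print(params)
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("athena").create_work_group(**params)`. Workgroup-wide (aggregate) limits are configured separately from this per-query cutoff.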
