📋 Athena Cheat Sheet

The DEA-C01 exam tests Athena on serverless SQL analytics over S3 data lakes, cost optimization, and federated queries.

Why This Cheat Sheet Matters for DEA-C01

This cheat sheet covers the most important Amazon Athena concepts tested on the DEA-C01 (AWS Data Engineer Associate) certification exam. Amazon Athena is a serverless interactive query service for running SQL directly on data in S3. The sheet has 3 sections with 12 key points to memorize before exam day, covering partitioning, columnar formats (Parquet, ORC), workgroups, CTAS, federated queries, and cost optimization. Use it as a quick-reference guide during your final review sessions.

Query Optimization

  • Partition data in S3 to enable partition pruning and reduce scanned data.
  • Use columnar formats (Parquet, ORC) to scan only needed columns.
  • Compress data (Snappy, ZSTD, GZIP) to reduce scan volume and cost.
  • CTAS (CREATE TABLE AS SELECT) creates optimized tables from query results.
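
The four points above can be combined in a single CTAS statement. The following is a minimal sketch that only builds the SQL text and does not run it. The table names, column names, and bucket path are hypothetical, and the `WITH` properties shown (`format`, `write_compression`, `external_location`, `partitioned_by`) should be checked against the Athena engine version in use.

```python
def build_ctas(target, source, columns, partition_cols, location,
               fmt="PARQUET", compression="SNAPPY"):
    """Return an Athena CTAS statement as a string (nothing is executed)."""
    # Athena expects partition columns to come last in the SELECT list.
    data_cols = [c for c in columns if c not in partition_cols]
    select_list = ", ".join(data_cols + list(partition_cols))
    partitions = ", ".join(f"'{c}'" for c in partition_cols)
    return (
        f"CREATE TABLE {target}\n"
        f"WITH (\n"
        f"  format = '{fmt}',\n"
        f"  write_compression = '{compression}',\n"
        f"  external_location = '{location}',\n"
        f"  partitioned_by = ARRAY[{partitions}]\n"
        f") AS\n"
        f"SELECT {select_list}\n"
        f"FROM {source}"
    )


# Hypothetical names, for illustration only.
ctas_sql = build_ctas(
    target="events_parquet",
    source="events_raw",
    columns=["dt", "user_id", "event_type"],
    partition_cols=["dt"],
    location="s3://example-bucket/events_parquet/",
)
print(ctas_sql)
```

A query against the resulting table that filters on `dt` can skip every other partition, and one that selects only `user_id` reads just that column from the Parquet files.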

Management

  • Workgroups separate users, enforce cost limits, and track query usage.
  • Federated queries use data source connectors to query RDS, DynamoDB, and other sources.
  • Athena uses the Glue Data Catalog as its default metastore.
  • Pricing is based on the amount of data scanned, so optimization reduces both cost and latency.
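
To see why scan volume drives cost, here is a rough worked example. The price of 5.00 USD per TB, the 1024-based TB size, the 2 TB table, and the reduction factors are all assumptions for illustration; actual pricing varies by region and should be taken from the current Athena pricing page.

```python
PRICE_PER_TB_USD = 5.00    # assumed price; varies by region
BYTES_PER_TB = 1024 ** 4   # assumed definition of a terabyte


def scan_cost_usd(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Estimate the cost of one query from the bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# Hypothetical table: 2 TB of raw data covering 30 days.
full_scan = 2 * BYTES_PER_TB
# Partition pruning: the query filters on a single day out of 30.
pruned = full_scan // 30
# Columnar format: assume the query reads about a tenth of the columns.
columnar = pruned // 10

for label, size in [("full scan", full_scan),
                    ("one partition", pruned),
                    ("one partition, few columns", columnar)]:
    print(f"{label}: {scan_cost_usd(size):.4f} USD")
```

Each optimization shrinks the bytes scanned, and the estimated cost falls in proportion.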

Exam Cues

  • Need serverless SQL on S3: Athena.
  • Need to reduce Athena costs: partition, use Parquet, and compress.
  • Need to query external databases from Athena: federated queries.
  • Need to limit per-team query spend: Athena workgroups with data scan limits.
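
As a sketch of the last cue, the snippet below builds the request for a workgroup with a per-query scan limit. It only constructs a dictionary; the workgroup name and the 100 GB limit are hypothetical, and the field names are assumed to match the Athena `CreateWorkGroup` API as exposed by boto3.

```python
def workgroup_request(name, per_query_limit_bytes):
    """Build keyword arguments for an Athena create_work_group call."""
    return {
        "Name": name,
        "Configuration": {
            # Make workgroup settings override client-side settings.
            "EnforceWorkGroupConfiguration": True,
            # Publish query metrics so usage can be tracked per workgroup.
            "PublishCloudWatchMetricsEnabled": True,
            # Cancel any query that scans more than this many bytes.
            "BytesScannedCutoffPerQuery": per_query_limit_bytes,
        },
    }


# Hypothetical team workgroup with a 100 GB per-query limit.
request = workgroup_request("team-analytics", 100 * 1024 ** 3)
print(request)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("athena").create_work_group(**request)
```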
