Why This Cheat Sheet Matters for DEA-C01
This cheat sheet covers the most important Amazon S3 concepts tested on the DEA-C01 (AWS Data Engineer Associate) certification exam. It contains 3 sections with 12 key points that you should memorize before exam day. S3 is the foundation of data lake storage on AWS. Master storage classes, lifecycle policies, partitioning strategies, S3 event notifications, S3 Select, and data organization for analytics. Use this as a quick-reference guide during your final review sessions.
Data Organization
- Use partitioned prefix structures (e.g., year/month/day) to enable partition pruning in Athena and Spectrum.
- Store data in columnar formats (Parquet, ORC) for faster queries and lower costs.
- Use compression (Snappy, GZIP, ZSTD) to reduce storage and scan costs.
- S3 Select and Glacier Select use SQL expressions to return only a subset of an object's data instead of downloading the full object.
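
As a rough illustration of the partitioned prefix layout described above, the sketch below builds an object key with date partitions. The `partitioned_key` helper, the `sales` table name, and the file name are hypothetical, and the Hive-style `year=.../month=.../day=...` naming is an assumption; the list above only specifies a year/month/day structure.

```python
from datetime import date


def partitioned_key(table: str, day: date, filename: str) -> str:
    """Build an S3 object key with Hive-style date partitions (assumed layout)."""
    return (
        f"{table}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


# Hypothetical table and file names, for illustration only.
key = partitioned_key("sales", date(2024, 3, 7), "part-0000.snappy.parquet")
print(key)  # sales/year=2024/month=03/day=07/part-0000.snappy.parquet
```

With keys shaped like this, a query that filters on year, month, and day only needs to read the objects under the matching prefixes.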
Lifecycle and Security
- Lifecycle policies transition objects between storage classes based on age.
- S3 Intelligent-Tiering automatically moves objects between access tiers when access patterns are unknown or unpredictable.
- Bucket policies and IAM policies control access; Lake Formation adds fine-grained column-level and row-level access control.
- S3 event notifications trigger Lambda, SQS, or SNS on object creation or deletion.
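
To make the age-based lifecycle transition concrete, here is a minimal sketch of a lifecycle configuration in the dictionary shape accepted by boto3's `put_bucket_lifecycle_configuration`. The `lifecycle_rule` helper, the `raw/` prefix, the day thresholds, and the bucket name are all assumptions chosen for illustration, not recommended values.

```python
def lifecycle_rule(prefix: str, ia_days: int, glacier_days: int, expire_days: int) -> dict:
    """Return one lifecycle rule that tiers objects down by age, then expires them."""
    return {
        "ID": f"tier-{prefix.strip('/')}",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_days},
    }


# Hypothetical prefix and thresholds: 30 days to Standard-IA, 90 to Glacier, delete at 365.
config = {"Rules": [lifecycle_rule("raw/", 30, 90, 365)]}

# Applying it requires boto3 and AWS credentials, so the call is left commented out:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket",  # placeholder bucket name
#     LifecycleConfiguration=config,
# )
```

Lifecycle rules suit data whose access pattern is predictable by age; when the pattern is not known in advance, Intelligent-Tiering is the usual alternative.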
Exam Cues
- Need to reduce Athena scan cost: partition data and use Parquet.
- Need to automate tiering: S3 Intelligent-Tiering or lifecycle policies.
- Need to trigger ETL on new data arrival: S3 event notification to Lambda, which can in turn start a Glue job.
- Need fine-grained data lake access: Lake Formation over S3 bucket policies.
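
The "trigger ETL on new data arrival" cue can be sketched as a Lambda handler that pulls the bucket and key out of an S3 event notification. The `handler` name, the sample event, and the bucket name are illustrative assumptions; the step that would actually start the ETL job is left out.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Collect (bucket, key) pairs for newly created objects in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        # Ignore anything that is not an object-creation event (e.g. deletions).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event payload.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    # A real handler would start the downstream ETL step here.
    return objects


# Trimmed, hypothetical event with only the fields the handler reads.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "sales/year%3D2024/month%3D03/data+file.parquet"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "sales/old.parquet"},
            },
        },
    ]
}

new_objects = handler(sample_event)
```

Decoding the key matters for partitioned layouts, because characters such as `=` and spaces are encoded in the notification.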