All AWS Glue Flashcards
Q: What does a Glue crawler do?
A: Scans data stores, infers their schemas, and populates the Glue Data Catalog with table definitions.
Q: What is the Glue Data Catalog?
A: A centralized metadata repository that stores database and table definitions, used by Athena, EMR, and Redshift Spectrum.
Q: How do Glue job bookmarks help?
A: They track previously processed data so incremental ETL jobs process only new or changed data.
Q: What is Glue DataBrew?
A: A visual data preparation tool for cleaning and normalizing data without writing code.
Q: What execution engine do Glue ETL jobs use?
A: Apache Spark (PySpark or Scala), or Python Shell for lightweight jobs.
Q: What is a DynamicFrame?
A: A Glue abstraction similar to a Spark DataFrame in which each record is self-describing, so it handles schema inconsistencies and provides built-in transforms.
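
Example: the idea behind job bookmarks can be sketched in plain Python. This is only an illustration of incremental processing, not Glue's API. `Bookmark` and `run_incremental` are hypothetical names, and tracking a last-modified timestamp is an assumption made for the sketch. In a real Glue job, bookmarks are enabled as a job option rather than written by hand.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    """Hypothetical stand-in for the state a job bookmark keeps between runs."""
    last_modified: float = 0.0  # newest modification time already processed


def run_incremental(objects: dict[str, float], bookmark: Bookmark) -> list[str]:
    """Return the keys modified after the bookmark, then advance the bookmark.

    `objects` maps an object key to its last-modified timestamp.
    """
    new_keys = sorted(k for k, ts in objects.items() if ts > bookmark.last_modified)
    if new_keys:
        # Remember the newest timestamp seen so the next run skips these keys.
        bookmark.last_modified = max(objects[k] for k in new_keys)
    return new_keys


bookmark = Bookmark()
store = {"a.csv": 100.0, "b.csv": 200.0}
first_run = run_incremental(store, bookmark)   # both files are new

store["c.csv"] = 300.0                         # a new file arrives
second_run = run_incremental(store, bookmark)  # only the new file is returned
```

The first run returns every key because nothing has been processed yet. The second run returns only the file added afterwards, which is the behavior the bookmark flashcard describes.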