Glue Crawler

Schedule-driven schema discovery that writes table definitions into the AWS Glue Data Catalog.

Category: analytics
Settings: 15
Inputs: 2
Outputs: 1

Configuration

Setting	Type	Required	Default
Crawler name	Text	Yes	`—`
Description	Text	—	`—`
Target database	Text	Yes	`—`
Cron schedule	Text	—	`—`
IAM role ARN	Text	—	`—`
Table prefix	Text	—	`—`
S3 paths	List	—	`—`
JDBC connections	List	—	`—`
DynamoDB tables	List	—	`—`
Recrawl policy (options: Crawl everything, Crawl new folders only, Event-driven)	Choice	—	`CRAWL_EVERYTHING`
Schema change update behavior (options: Update in database, Log only)	Choice	—	`UPDATE_IN_DATABASE`
Schema change delete behavior (options: Log only, Delete from database, Deprecate in database)	Choice	—	`DEPRECATE_IN_DATABASE`
Custom classifiers	List	—	`—`
Configuration (JSON)	Text	—	`—`
Tags	Key–value	—	`—`
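
To illustrate how these settings might translate into an AWS Glue `CreateCrawler` request, here is a minimal Python sketch. It is an assumption about the mapping, not this component's actual implementation: the function name `build_crawler_request`, its parameters, and the label-to-enum dictionaries are hypothetical. The request field names and enum values follow the AWS Glue API. The sketch takes the role ARN as a required argument because the Glue API requires a role, even though the table above lists the setting as optional (the role can also arrive through the IAM role input).

```python
import json

# Choice labels from the table above, mapped to AWS Glue API enum values.
RECRAWL = {
    "Crawl everything": "CRAWL_EVERYTHING",
    "Crawl new folders only": "CRAWL_NEW_FOLDERS_ONLY",
    "Event-driven": "CRAWL_EVENT_MODE",
}
UPDATE = {"Update in database": "UPDATE_IN_DATABASE", "Log only": "LOG"}
DELETE = {
    "Log only": "LOG",
    "Delete from database": "DELETE_FROM_DATABASE",
    "Deprecate in database": "DEPRECATE_IN_DATABASE",
}


def build_crawler_request(name, database, role_arn, *, s3_paths=(),
                          dynamodb_tables=(), schedule=None, table_prefix=None,
                          recrawl="Crawl everything",
                          on_update="Update in database",
                          on_delete="Deprecate in database",
                          configuration=None, tags=None):
    """Build a dict shaped like a Glue CreateCrawler request (hypothetical helper)."""
    if not name or not database:
        raise ValueError("Crawler name and Target database are required")

    request = {
        "Name": name,
        "DatabaseName": database,
        "Role": role_arn,
        "Targets": {
            "S3Targets": [{"Path": p} for p in s3_paths],
            "DynamoDBTargets": [{"Path": t} for t in dynamodb_tables],
        },
        "RecrawlPolicy": {"RecrawlBehavior": RECRAWL[recrawl]},
        "SchemaChangePolicy": {
            "UpdateBehavior": UPDATE[on_update],
            "DeleteBehavior": DELETE[on_delete],
        },
    }
    # Optional settings are only sent when they have a value.
    if schedule:
        request["Schedule"] = schedule  # Glue cron syntax, e.g. "cron(0 2 * * ? *)"
    if table_prefix:
        request["TablePrefix"] = table_prefix
    if configuration:
        request["Configuration"] = json.dumps(configuration)  # API expects a JSON string
    if tags:
        request["Tags"] = dict(tags)
    return request


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    request = build_crawler_request(
        "sales-raw",
        "analytics",
        "arn:aws:iam::123456789012:role/GlueCrawlerRole",
        s3_paths=["s3://example-bucket/sales/"],
        schedule="cron(0 2 * * ? *)",
    )
    print(json.dumps(request, indent=2))
```

With `boto3` installed and credentials configured, a dict like this could be passed as `boto3.client("glue").create_crawler(**request)`. The sketch leaves out JDBC connections and custom classifiers to stay short.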

Connections

Socket	Direction	Accepts	Terraform arg
Source	Input	any	—
IAM role	Input	aws.iam-role	`role`
Data Catalog	Output	any	—