Glue Crawler
Schedule-driven schema discovery into the AWS Glue Data Catalog.
Configuration
| Setting | Type | Required | Default |
|---|---|---|---|
| Crawler name | Text | Yes | — |
| Description | Text | — | — |
| Target database | Text | Yes | — |
| Cron schedule | Text | — | — |
| IAM role ARN | Text | — | — |
| Table prefix | Text | — | — |
| S3 paths | List | — | — |
| JDBC connections | List | — | — |
| DynamoDB tables | List | — | — |
| Recrawl policy (options: Crawl everything, Crawl new folders only, Event-driven) | Choice | — | CRAWL_EVERYTHING |
| Schema change update behavior (options: Update in database, Log only) | Choice | — | UPDATE_IN_DATABASE |
| Schema change delete behavior (options: Log only, Delete from database, Deprecate in database) | Choice | — | DEPRECATE_IN_DATABASE |
| Custom classifiers | List | — | — |
| Configuration (JSON) | Text | — | — |
| Tags | Key–value | — | — |
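
As a rough illustration of how the settings above might translate into a request, the sketch below builds the keyword arguments for the AWS Glue `CreateCrawler` API (as exposed by boto3's `glue.create_crawler`). The settings dictionary, its key names, and the `build_create_crawler_args` helper are hypothetical and not part of this tool; the defaults mirror the table. JDBC connections are left out for brevity, and the `cron(...)` wrapper is the form the Glue API expects, which may differ from what the Cron schedule field accepts here.

```python
import json

# Hypothetical mapping from optional setting keys to CreateCrawler parameter names.
OPTIONAL_FIELDS = {
    "role_arn": "Role",
    "description": "Description",
    "schedule": "Schedule",
    "table_prefix": "TablePrefix",
    "classifiers": "Classifiers",
    "tags": "Tags",
}


def build_create_crawler_args(settings: dict) -> dict:
    """Build kwargs for glue.create_crawler from a settings dict (assumed shape)."""
    targets = {}
    if settings.get("s3_paths"):
        targets["S3Targets"] = [{"Path": p} for p in settings["s3_paths"]]
    if settings.get("dynamodb_tables"):
        targets["DynamoDBTargets"] = [{"Path": t} for t in settings["dynamodb_tables"]]

    args = {
        # Required settings per the table: crawler name and target database.
        "Name": settings["name"],
        "DatabaseName": settings["database"],
        "Targets": targets,
        # Defaults taken from the Default column of the table.
        "RecrawlPolicy": {
            "RecrawlBehavior": settings.get("recrawl_policy", "CRAWL_EVERYTHING"),
        },
        "SchemaChangePolicy": {
            "UpdateBehavior": settings.get("update_behavior", "UPDATE_IN_DATABASE"),
            "DeleteBehavior": settings.get("delete_behavior", "DEPRECATE_IN_DATABASE"),
        },
    }

    # Copy optional settings only when they are set.
    for key, api_name in OPTIONAL_FIELDS.items():
        if settings.get(key):
            args[api_name] = settings[key]

    # The Configuration setting is a JSON string; parse it once to reject bad input early.
    if settings.get("configuration"):
        json.loads(settings["configuration"])
        args["Configuration"] = settings["configuration"]

    return args


example = build_create_crawler_args({
    "name": "orders-crawler",
    "database": "analytics",
    "s3_paths": ["s3://example-bucket/orders/"],
    "schedule": "cron(0 2 * * ? *)",
    "table_prefix": "raw_",
})
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("glue").create_crawler(**example)`. Note that the Glue API itself requires a role, so a real call would need `role_arn` set or the role supplied some other way.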
Connections
| Socket | Direction | Accepts | Terraform arg |
|---|---|---|---|
| Source | Input | any | — |
| IAM role | Input | aws.iam-role | role |
| Data Catalog | Output | any | — |