Airflow DAG Generator

Schoolytics runs scheduled data syncs for every district: grades from PowerSchool overnight, assignment data from Google Classroom hourly, state assessment results weekly, and so on. Hand-writing one DAG per district × integration pair doesn't scale, and every district's timezone, schedule, and integration set is different. This is the meta-DAG that generates the fleet.

How it works

A single meta-DAG runs nightly. It queries Cloud Spanner for the current set of active customers and their integration configs, then emits per-customer DAG Python files directly to the Composer GCS bucket.
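
The rendering step can be sketched roughly as follows. This is a minimal illustration, not the production generator: the `IntegrationConfig` fields, the `dags/generated/` prefix, the DAG ID format, and the placeholder `start_date` are all assumptions, and the Spanner query and GCS upload are left out. The generated file assumes an Airflow version that accepts the `schedule` argument (2.4 or later).

```python
from dataclasses import dataclass
from string import Template


# Hypothetical shape of one integration config row; the real Spanner
# schema is not described here.
@dataclass(frozen=True)
class IntegrationConfig:
    customer_id: str
    integration: str  # e.g. "powerschool_grades"
    cron: str         # cron expression, interpreted in `timezone`
    timezone: str     # IANA name, e.g. "America/Chicago"


# Template for one generated DAG file. Task definitions are omitted;
# a real file would declare the sync tasks inside the `with` block.
# The start_date is a placeholder, not a value from the source system.
DAG_TEMPLATE = Template('''\
import pendulum
from airflow import DAG

with DAG(
    dag_id=$dag_id,
    schedule=$cron,
    start_date=pendulum.datetime(2024, 1, 1, tz=$timezone),
    catchup=False,
) as dag:
    pass
''')


def render_dag_file(cfg: IntegrationConfig) -> tuple[str, str]:
    """Return (object path, file contents) for one generated DAG."""
    dag_id = f"{cfg.customer_id}__{cfg.integration}"
    # repr() quotes each value so it lands in the file as a string literal.
    source = DAG_TEMPLATE.substitute(
        dag_id=repr(dag_id),
        cron=repr(cfg.cron),
        timezone=repr(cfg.timezone),
    )
    # Syntax-check the generated file before it is written anywhere.
    compile(source, dag_id, "exec")
    return f"dags/generated/{dag_id}.py", source


cfg = IntegrationConfig(
    "district_42", "powerschool_grades", "0 2 * * *", "America/Chicago"
)
path, source = render_dag_file(cfg)
print(path)
```

Compiling the rendered source before upload catches a broken template early, before the scheduler would hit the syntax error while parsing the fleet.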

Why a meta-DAG

Hierarchical orgs

Customers can have child orgs with roll-up behavior: a parent district's warehouse is populated by queries that union IDs across all children. The generator knows the parent-child relationships and produces a parent DAG that includes child customers in its aggregations.
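
One way to resolve the roll-up sets is to walk the hierarchy once per parent. The sketch below assumes the hierarchy is available as a mapping from customer ID to parent ID (a hypothetical shape), and that a parent's roll-up includes its own ID along with every descendant; neither detail is confirmed by the description above.

```python
from collections import defaultdict
from typing import Optional


def rollup_ids(parent_of: dict[str, Optional[str]]) -> dict[str, list[str]]:
    """Map each customer that has children to itself plus all descendants."""
    children = defaultdict(list)
    for customer, parent in parent_of.items():
        if parent is not None:
            children[parent].append(customer)

    def descendants(root: str) -> list[str]:
        found, stack, seen = [], [root], {root}
        while stack:
            node = stack.pop()
            found.append(node)
            for child in children.get(node, []):
                # Each org has a single parent, so reaching one twice
                # means the hierarchy loops back on itself.
                if child in seen:
                    raise ValueError(f"cycle in org hierarchy at {child}")
                seen.add(child)
                stack.append(child)
        return sorted(found)

    return {parent: descendants(parent) for parent in list(children)}


# Hypothetical hierarchy: one district with two schools, one of which
# has its own child campus, plus a standalone charter customer.
orgs = {
    "district": None,
    "school_a": "district",
    "school_b": "district",
    "campus_b1": "school_b",
    "charter": None,
}
print(rollup_ids(orgs))
```

Customers without children do not appear in the result, so only true parents get an aggregation DAG in this sketch.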

Stack