Build Daily

Tinley Park · May 29, 2026

Apache Airflow — what it is and how we use it on paiddaily.io

Apache Airflow is the batch scheduler behind paiddaily.io. It runs 44 DAGs on cron schedules — price refreshes, pool syncs, market snapshots, ticker research, scoring jobs — so the API stays focused on serving reads.

What Airflow is

Airflow is an open-source platform for scheduling and monitoring workflows. You define a workflow as a Python DAG (directed acyclic graph), set a cron schedule, and Airflow runs it. Each task in the DAG runs in its own process. Retries, dependencies, failure callbacks, backfills — all built in.
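To make the "directed acyclic graph" part concrete, here is a small sketch that uses Python's standard-library graphlib instead of Airflow itself. The task names fetch, transform, and load are made up for illustration. The only point is that declared dependencies determine a valid run order, which is the ordering guarantee a scheduler like Airflow builds on.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (hypothetical task names).
dependencies = {
    "transform": {"fetch"},
    "load": {"transform"},
}

# static_order() yields tasks so that every dependency comes before its dependents.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

If the dependencies contained a cycle, TopologicalSorter would raise graphlib.CycleError, which is the "acyclic" requirement in practice.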

It's not a streaming engine or a task queue. It's a scheduler that knows about order, timing, and failure. It is released under the Apache 2.0 license, and we self-host it on Docker Compose. There are managed offerings (Astronomer, Google Cloud Composer), but we run it ourselves alongside the rest of the stack.

How paiddaily.io uses it

paiddaily.io is a consumer SaaS for full-time income traders. The dashboard shows DeFi positions, yield opportunities, market snapshots, and scoring analytics across Aerodrome, Pendle, and Boros. All of that data needs to stay fresh — some on 5-minute intervals, some daily.

Airflow runs 44 DAGs that keep the data current. The API (FastAPI) reads from Postgres. Airflow writes to Postgres. Clean split.

The DAGs

A sample of what's running:

  • boros_snapshot_markets — every 5 minutes, fetches market snapshots from the Boros API, computes forecasts with catalyst awareness, archives to bronze, upserts the processed snapshot
  • call_of_the_day — daily at 08:30 ET, gathers candidates from Pendle and Boros, scores them with a multi-factor model, records the winner and runners-up
  • pendle_market_research — runs a DSPy module against Pendle markets, persists the research output for the Edge opportunity surface
  • ticker_research_worker — enriches tickers with Pendle appearances, market cap, and sector data
  • aero_pool_state_sync — syncs on-chain pool state from Aerodrome's contracts on Base
  • tastytrade_positions_sync — pulls positions from the TastyTrade brokerage API

Each DAG has retry logic, execution timeouts, and a Telegram failure callback so nothing fails silently.
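As a rough sketch of what that per-DAG wiring might look like: retries, retry_delay, execution_timeout, and on_failure_callback are standard Airflow task arguments, but the values below are assumptions, and the body of telegram_on_failure is a hypothetical stand-in (the real callback is not shown in this post). The sketch only builds the alert text and leaves the actual Telegram call out.

```python
from datetime import timedelta
from types import SimpleNamespace


def format_failure_message(context: dict) -> str:
    """Build alert text from the context dict Airflow passes to failure callbacks."""
    ti = context["task_instance"]
    error = context.get("exception")
    return (
        f"DAG {ti.dag_id} failed\n"
        f"task: {ti.task_id} (try {ti.try_number})\n"
        f"error: {error!r}"
    )


def telegram_on_failure(context: dict) -> None:
    """Hypothetical callback body: a real one would send the text to Telegram."""
    print(format_failure_message(context))


# Assumed values; passed as default_args so every task in a DAG inherits them.
DEFAULT_ARGS = {
    "retries": 2,
    "retry_delay": timedelta(minutes=1),
    "execution_timeout": timedelta(minutes=10),
    "on_failure_callback": telegram_on_failure,
}

# A fake context, shaped like the one Airflow supplies when a task fails.
sample_context = {
    "task_instance": SimpleNamespace(
        dag_id="call_of_the_day", task_id="run_picker", try_number=3
    ),
    "exception": TimeoutError("upstream API timed out"),
}
telegram_on_failure(sample_context)
```

Keeping the message formatting separate from the sending step makes the alert text easy to check without a bot token.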

Two patterns

The DAGs follow two main patterns.

Direct domain calls

Sixteen DAGs call the same async domain functions the API uses, through a sync wrapper called run_domain:

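from airflow.decorators import task

from gl_paiddaily_common.domain_runner import run_domain


@task
def run_picker() -> dict:
    # Same async entry point the API uses, executed synchronously inside the task.
    return run_domain(
        "features.common.jobs.call_of_the_day",
        "run",
    )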

run_domain initializes an asyncpg connection pool, calls the async function, and tears the pool down. Same business logic, separate process. This is the default pattern when the domain code is already well-factored.
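The real run_domain is not shown here, so the following is only a minimal sketch of how such a sync wrapper could work, assuming it takes a dotted module path and a function name. The name run_domain_sketch and the db_pool helper are hypothetical, and db_pool is a placeholder where a real wrapper would create and close an asyncpg pool.

```python
import asyncio
import importlib
from contextlib import asynccontextmanager
from typing import Any


@asynccontextmanager
async def db_pool():
    """Hypothetical stand-in for pool setup and teardown."""
    pool = object()  # placeholder; a real wrapper would open connections here
    try:
        yield pool
    finally:
        pass  # a real wrapper would close the pool here


def run_domain_sketch(module_path: str, func_name: str, *args: Any, **kwargs: Any) -> Any:
    """Import an async function by dotted path and run it to completion."""
    func = getattr(importlib.import_module(module_path), func_name)

    async def _runner() -> Any:
        # Set up resources, await the domain function, then tear down.
        async with db_pool():
            return await func(*args, **kwargs)

    # asyncio.run creates a fresh event loop and closes it when the call finishes.
    return asyncio.run(_runner())


# Usage with a standard-library coroutine function, just to show the call shape.
result = run_domain_sketch("asyncio", "sleep", 0, "done")
print(result)
```

Because asyncio.run owns the event loop for the duration of the call, the wrapper fits naturally inside a synchronous Airflow task.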

Inline pipelines

Eleven DAGs define the full pipeline in the DAG file itself — fetch from an external API, transform, write to Postgres. The Boros snapshot DAG is a good example: it reads active markets, fetches from the Boros API, computes forecasts, writes raw data to a bronze archive, and upserts the snapshot. The logic is pipeline-shaped rather than domain-shaped, so it lives in the DAG.
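The shape of such a pipeline can be sketched in a few lines. This is not the Boros DAG: the table names, the sample payload, and the basis-point transform are all invented for illustration, and sqlite3 stands in for Postgres so the example runs anywhere (Postgres supports a similar INSERT ... ON CONFLICT ... DO UPDATE upsert).

```python
import json
import sqlite3


def fetch_markets() -> list[dict]:
    """Stand-in for the external API call; returns made-up sample data."""
    return [{"market": "market-a", "rate": 0.051}, {"market": "market-b", "rate": 0.034}]


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical bronze and snapshot tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bronze_raw (fetched_at TEXT, payload TEXT)")
    conn.execute(
        "CREATE TABLE market_snapshot ("
        "market TEXT PRIMARY KEY, rate_bps INTEGER, updated_at TEXT)"
    )
    return conn


def run_pipeline(conn: sqlite3.Connection, fetched_at: str) -> int:
    raw = fetch_markets()
    # Bronze: archive the raw payload untouched so it can be reprocessed later.
    conn.execute(
        "INSERT INTO bronze_raw (fetched_at, payload) VALUES (?, ?)",
        (fetched_at, json.dumps(raw)),
    )
    # Transform: convert the fractional rate to whole basis points.
    rows = [(m["market"], round(m["rate"] * 10_000), fetched_at) for m in raw]
    # Upsert: keep exactly one current row per market.
    conn.executemany(
        """
        INSERT INTO market_snapshot (market, rate_bps, updated_at)
        VALUES (?, ?, ?)
        ON CONFLICT(market) DO UPDATE SET
            rate_bps = excluded.rate_bps,
            updated_at = excluded.updated_at
        """,
        rows,
    )
    conn.commit()
    return len(rows)


conn = make_db()
run_pipeline(conn, "2026-05-29T12:00:00Z")
run_pipeline(conn, "2026-05-29T12:05:00Z")  # second run appends bronze, updates snapshot
```

Running it twice shows the difference between the two writes: the bronze table grows with every run, while the snapshot table stays at one row per market.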

Both patterns end the same way: data lands in Postgres, the API reads it.

When Airflow earns its keep

Airflow is real infrastructure: a scheduler process, a webserver, a metadata database. At three to five jobs, a systemd timer or a cron entry plus a shell script is less to maintain.

The decision point: when the number of scheduled jobs exceeds what you can hold in your head, and when per-job failure alerting and retry logic actually matter. Below that line, cron is fine. Above it, Airflow's UI, backfill support, and per-task monitoring are worth the operational surface.

For paiddaily.io, the line was crossed somewhere around job 15. By job 44, it's the obvious choice.

Adding a new job

  1. Write the domain function in features/. Pure async Python, returns a result.
  2. Create a DAG in airflow/dags/. Use @dag and @task decorators. Call run_domain() from the task.
  3. Set the cron schedule.
  4. Wire failure alerting with on_failure_callback=telegram_on_failure.

Add a file, set a cron, done.

  • #airflow
  • #fastapi
  • #architecture
  • #paiddaily
  • #building-in-public
