✅ Requirements

What you need depends on how you install ETLX:

Minimum

  • Linux, macOS, or Windows
  • DuckDB-compatible environment

Optional (for building from source)

  • Go ≥ 1.21
  • git

📦 Installation

Choose one of the following options.


Option 1: Download a Prebuilt Binary

Download the latest release for your OS from:

👉 https://github.com/realdatadriven/etlx/releases

Make it executable (on Linux/macOS) and verify it runs:

```sh
chmod +x etlx
./etlx --help
```

🪟 Windows & DuckDB Extensions (Important Note)

Some DuckDB extensions do not support MinGW on Windows. For this reason, ETLX provides two Windows binaries:

  1. Statically linked DuckDB (default)
  2. Dynamically linked DuckDB (recommended when you need additional extensions such as postgres)

If you download the dynamically linked ETLX binary, the DuckDB shared library must also be available to the executable, for example placed alongside the binary or on your PATH. Otherwise, ETLX will not be able to load DuckDB or its extensions.
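A minimal sketch of what that can look like (paths are placeholders, and the library file name, duckdb.dll here, depends on the DuckDB build you download):

```sh
# Illustrative only; adjust names and paths to your setup (e.g. from Git Bash on Windows)
cp /path/to/duckdb/duckdb.dll /path/to/etlx/   # keep the DuckDB library next to etlx.exe
export PATH="/path/to/duckdb:$PATH"            # ...or expose its folder on PATH instead
./etlx.exe --help                              # ETLX should now load DuckDB and its extensions
```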

💡 This approach allows ETLX to support a wider set of DuckDB extensions on Windows, while keeping the runtime flexible and lightweight.


When should I use the dynamic DuckDB binary?

Use the dynamic DuckDB build if you:

  • Are on Windows
  • Rely on DuckDB extensions not available for MinGW
  • Want closer compatibility with upstream DuckDB releases

For Linux and macOS users, the default precompiled binary usually works without additional setup.


Option 2: Install via Go

If you want ETLX as a Go dependency or to build it yourself:

```sh
go get github.com/realdatadriven/etlx
```

Option 3: Clone the Repository

```sh
git clone https://github.com/realdatadriven/etlx.git
cd etlx
```

Run directly:

```sh
go run cmd/main.go --config pipeline.md
```

⚠️ Windows note: If you encounter DuckDB build issues, link against the official DuckDB library and build with:

```sh
CGO_ENABLED=1 CGO_LDFLAGS="-L/path/to/libs" \
  go run -tags=duckdb_use_lib cmd/main.go --config pipeline.md
```
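If you prefer a reusable binary over go run, a sketch along these lines should work (the ./cmd package path is inferred from the go run example above, and the output name etlx is just a choice):

```sh
# Build a local binary from the cloned repository; CGO_ENABLED=1 mirrors the Windows note above
CGO_ENABLED=1 go build -o etlx ./cmd
./etlx --config pipeline.md
```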
  

🧱 Your First Pipeline

ETLX pipelines are defined using structured Markdown.

Create a file named pipeline.md:

````markdown
# INPUTS
```yaml
name: INPUTS
description: Extracts data from the source and loads it into the target
runs_as: ETL
active: true
```

## INPUT_1
```yaml
name: INPUT_1
description: Input 1 from an ODBC Source
table: INPUT_1 # Destination Table
load_conn: "duckdb:"
load_before_sql:
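  # @DL_DSN_URL and @OLTP_DSN_URL below refer to the variables defined in .env
  # (see the Environment Variables section)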
  - "ATTACH 'ducklake:@DL_DSN_URL' AS DL (DATA_PATH 's3://dl-bucket...')"
  - "ATTACH '@OLTP_DSN_URL' AS PG (TYPE POSTGRES)"
load_sql: load_input_in_dl
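# if load_sql fails with an error matching the pattern below, the fallback SQL
# block named in load_on_err_match_sql runs (here it creates the missing table)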
load_on_err_match_patt: '(?i)table.+with.+name.+(\w+).+does.+not.+exist'
load_on_err_match_sql: create_input_in_dl
load_after_sql:
  - DETACH DL
  - DETACH PG
active: true
```

```sql
-- load_input_in_dl
INSERT INTO DL.INPUT_1 BY NAME
SELECT * FROM PG.INPUT_1
```

```sql
-- create_input_in_dl
CREATE TABLE DL.INPUT_1 AS
SELECT * FROM PG.INPUT_1
```
...
````

▶️ Run the Pipeline

```sh
etlx --config pipeline.md
```

That’s it.

ETLX will:

  • Parse the configuration
  • Resolve dependencies
  • Execute steps deterministically
  • Capture execution metadata automatically

⚙️ Common CLI Flags

| Flag | Description |
| --- | --- |
| `--config` | Path to the pipeline file (default: `config.md`) |
| `--date` | Reference date (`YYYY-MM-DD`) |
| `--only` | Run only specific keys |
| `--skip` | Skip specific keys |
| `--steps` | Run specific steps (`extract`, `transform`, `load`) |
| `--clean` | Run `clean_sql` blocks |
| `--drop` | Run `drop_sql` blocks |
| `--rows` | Show row counts |

Example:

```sh
etlx --config pipeline.md --only sales --steps extract,load
```
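Another hypothetical invocation, combining the reference-date and skip flags (sales is again just an illustrative key name):

```sh
etlx --config pipeline.md --date 2025-01-31 --skip sales
```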
  

🔐 Environment Variables

ETLX supports environment-based configuration.

Example .env file:

```
DL_DSN_URL=mysql:db=ducklake_catalog host=localhost
OLTP_DSN_URL=postgres:dbname=erpdb host=localhost user=postgres
```

These variables are automatically loaded at runtime.
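As a quick sketch, the following writes that .env next to your pipeline and runs it, assuming ETLX picks the file up from the working directory (an assumption here; the Docker examples below mount it explicitly at /app/.env):

```sh
cat > .env <<'EOF'
DL_DSN_URL=mysql:db=ducklake_catalog host=localhost
OLTP_DSN_URL=postgres:dbname=erpdb host=localhost user=postgres
EOF
etlx --config pipeline.md
```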


🐳 Running ETLX with Docker

You can run ETLX without installing anything locally.

Build the Image

```sh
docker build -t etlx:latest .
```

Or pull (when available):

```sh
docker pull docker.io/realdatadriven/etlx:latest
```

Run a Pipeline

```sh
docker run --rm \
  -v $(pwd)/pipeline.md:/app/pipeline.md:ro \
  etlx:latest --config /app/pipeline.md
```

Using .env and Database Directory

```sh
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/pipeline.md:/app/pipeline.md:ro \
  -v $(pwd)/database:/app/database \
  etlx:latest --config /app/pipeline.md
```
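Arguments after the image name are passed to ETLX, so any of the flags from Common CLI Flags can be appended the same way; for example (an illustrative run, not a required setup):

```sh
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/pipeline.md:/app/pipeline.md:ro \
  etlx:latest --config /app/pipeline.md --steps extract,load
```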
  

Interactive Mode

```sh
docker run -it --rm etlx:latest repl
```

💡 Optional: Docker Alias

Make Docker feel like a native binary:

```sh
alias etlx='docker run --rm -v "$(pwd)":/app etlx:latest'
```

Now:

```sh
etlx --help
etlx --config pipeline.md
```
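Because the alias mounts the current working directory at /app, a .env file or database/ folder sitting next to pipeline.md is visible inside the container as well; for example (the directory is hypothetical):

```sh
cd ~/projects/my-pipelines   # hypothetical folder containing pipeline.md and .env
etlx --config pipeline.md    # .env and database/ are available under /app inside the container
```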
  

🧠 What’s Next?

  • 📘 Core Concepts – Pipelines, steps, metadata
  • 🔍 Execution & Observability – What ETLX records automatically
  • 🧾 Self-Documenting Pipelines
  • 🧬 Metadata → Lineage → Governance
  • 🧩 Advanced Use Cases & Examples

👉 Continue with Core Concepts to understand how ETLX works under the hood.
