On this page

ETL | ELT - Validation Rules

How ETLX enables declarative data quality validation as a first-class concern of ETL / ELT execution blocks.

Validation Rules

ETLX allows data quality validation to be declared as metadata, making validation a first-class concern of ETL / ELT execution.

Validations are executed as part of the pipeline lifecycle, not as an external check or post-process.

They allow you to:

Fail fast on invalid data
Guard against silent data corruption
Encode business and technical expectations declaratively
Keep validation logic close to the data movement it protects

Declaring Validations

Validations are defined using the metadata key:

  <step>_validation

Where <step> can be any supported execution step, such as:

extract
transform
load

Example

In the previous example from etl/elt, we define two validation rules for the load step:

  ...
```yaml
...
load_validation:
  - type: throw_if_empty
    sql: validate_data_not_empty
    msg: "The given date is not avaliable in the origin!"
    active: true
  - type: throw_if_not_empty
    sql: validate_date_exists_in_destination
    msg: "This date is already imported, check to avoid duplicates!"
    active: true
...
```
```sql validate_data_not_empty
SELECT *
FROM PG.INPUT_1
WHERE event_date = '{YYYY-MM-DD}'
LIMIT 10
```
```sql validate_date_exists_in_destination
SELECT *
FROM DL.INPUT_1 
WHERE event_date = '{YYYY-MM-DD}'
LIMIT 10
```

Each validation rule is evaluated after the step SQL executes and before the next lifecycle hook.

here we used {YYYY-MM-DD} as a placeholder for date, which ETLX will replace at runtime, if cli used with --date argument, that date would be the one placede in {YYYY-MM-DD}.

Validation Execution Model

For each object defined in <step>_validation:

The associated sql is executed
The query result is evaluated according to the type
If the condition is met, ETLX:
- Throws a controlled execution error
- Emits the configured message (msg)
- Marks the step as failed in execution metadata

Validation Connection Resolution

Validation SQL is executed using:

<step>_conn, if defined
Otherwise, the pipeline’s default connection

This ensures validations run in the same execution context as the data they validate.

Supported Validation Types

Type	Behavior
`throw_if_empty`	Fails if the query returns zero rows
`throw_if_not_empty`	Fails if the query returns one or more rows

Validation types are intentionally simple and composable. Complex rules should be expressed in SQL, not in configuration logic. This keeps validation declarative and database-native.

SQL Validation Design

Validation SQL should express intent, not mechanics.

Typical validation queries include:

Row existence checks
Duplicate detection

This is mostly usefull to avolid data duplication or missing data during incremental loads, or if needed to run the extraction more than once because of some input failure, without compromising inputs that were already extracted.

Validation Observability

Each validation execution is recorded as part of ETLX’s observability layer:

Validation name
Step and pipeline association
SQL executed
Execution time
Result (pass / fail)
Error message (if thrown)

This enables:

Auditing
SLA enforcement
Debugging
Reporting

Design Rationale

ETLX treats validation as:

Executable documentation
A contract between pipeline stages
A guardrail, not a sidecar

By encoding validation rules in metadata:

Pipelines become safer by default
Failures are explicit and explainable
Validation logic evolves with the pipeline

Key Takeaways

Validation is declarative, explicit, and observable
SQL remains the single source of truth
Pipelines fail loudly, not silently
Unsupported sources do not block adoption
ETLX prioritizes correctness over convenience

Edit this page

Last updated 06 Jan 2026, 20:05 -01 . history

ETL | ELT

How ETLX models, executes, and …

Query Documentation

How ETLX models SQL queries as …

ETL | ELT - Validation Rules

Validation Rules link

Declaring Validations link

Example link

Validation Execution Model link

Validation Connection Resolution link

Supported Validation Types link

SQL Validation Design link

Validation Observability link

Design Rationale link

Key Takeaways link