Actions
Define file operations and external transfers in ETLX workflows
There are scenarios in ETL workflows where actions such as downloading, uploading, compressing or copying files cannot be performed using SQL alone. The ACTIONS section allows you to define steps for copying or transferring files using the file system or external protocols.
ACTIONS Structure
Each action defined under the `ACTIONS` section has the following fields:
- `name`: Unique name for the action.
- `description`: Human-readable explanation.
- `type`: The kind of action to perform. Options: `copy_file`, `compress`, `decompress`, `ftp_download`, `ftp_upload`, `sftp_download`, `sftp_upload`, `http_download`, `http_upload`, `s3_download`, `s3_upload`, `db_2_db`.
- `params`: A map of input parameters required by the action type.
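For illustration, a single action entry might look like the sketch below (the name, description, and `params` keys here are placeholders; each action type defines its own parameters, as the examples that follow show):

```yaml metadata
name: MyAction                      # placeholder name
description: "What this step does"  # placeholder description
type: copy_file                     # one of the supported action types
params:
  source: "path/to/input"           # params vary by action type
  target: "path/to/output"
active: true
```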
# ACTIONS
```yaml metadata
name: FileOperations
description: "Transfer and organize generated reports"
path: examples
active: true
```
## COPY LOCAL FILE
```yaml metadata
name: CopyReportToArchive
description: "Move final report to archive folder"
type: copy_file
params:
  source: "/reports/final_report.xlsx"
  target: "/reports/archive/final_report_YYYYMMDD.xlsx"
active: true
```
## Compress to ZIP
```yaml metadata
name: CompressReports
description: "Compress report files into a .zip archive"
type: compress
params:
  compression: zip
  files:
    - "reports/report_1.csv"
    - "reports/report_2.csv"
  output: "archives/reports_YYYYMM.zip"
active: true
```
## UNZIP
```yaml metadata
name: DecompressReports
description: "Extract report files from a .zip archive"
type: decompress
params:
  compression: zip
  input: "archives/reports_YYYYMM.zip"
  output: "tmp"
active: true
```
## Compress to GZ
```yaml metadata
name: CompressToGZ
description: "Compress a summary file to .gz"
type: compress
params:
  compression: gz
  files:
    - "reports/summary.csv"
  output: "archives/summary_YYYYMM.csv.gz"
active: true
```
## HTTP DOWNLOAD
```yaml metadata
name: DownloadFromAPI
description: "Download dataset from HTTP endpoint"
type: http_download
params:
  url: "https://api.example.com/data"
  target: "data/today.json"
  method: GET
  headers:
    Authorization: "Bearer @API_TOKEN"
    Accept: "application/json"
  params:
    date: "YYYYMMDD"
    limit: "1000"
active: true
```
## HTTP UPLOAD
```yaml metadata
name: PushReportToWebhook
description: "Upload final report to an HTTP endpoint"
type: http_upload
params:
  url: "https://webhook.example.com/upload"
  method: POST
  source: "reports/final.csv"
  headers:
    Authorization: "Bearer @WEBHOOK_TOKEN"
    Content-Type: "multipart/form-data"
  params:
    type: "summary"
    date: "YYYYMMDD"
active: true
```
## FTP DOWNLOAD
```yaml metadata
name: FetchRemoteReport
description: "Download data file from external FTP"
type: ftp_download
params:
  host: "ftp.example.com"
  port: "21"
  user: "myuser"
  password: "@FTP_PASSWORD"
  source: "/data/daily_report.csv"
  target: "downloads/daily_report.csv"
active: true
```
## FTP DOWNLOAD GLOB
```yaml metadata
name: FetchRemoteReport2024
description: "Download data file from external FTP"
type: ftp_download
params:
  host: "ftp.example.com"
  port: "21"
  user: "myuser"
  password: "@FTP_PASSWORD"
  source: "/data/daily_report_2024*.csv"
  target: "downloads/"
active: true
```
## SFTP DOWNLOAD
```yaml metadata
name: FetchRemoteReport
description: "Download data file from external SFTP"
type: sftp_download
params:
  host: "sftp.example.com"
  user: "myuser"
  password: "@SFTP_PASSWORD"
  host_key: ~/.ssh/known_hosts # or a specific known_hosts file
  port: 22
  source: "/data/daily_report.csv"
  target: "downloads/daily_report.csv"
active: true
```
## S3 UPLOAD
```yaml metadata
name: ArchiveToS3
description: "Send latest results to S3 bucket"
type: s3_upload
params:
  AWS_ACCESS_KEY_ID: '@AWS_ACCESS_KEY_ID'
  AWS_SECRET_ACCESS_KEY: '@AWS_SECRET_ACCESS_KEY'
  AWS_REGION: '@AWS_REGION'
  AWS_ENDPOINT: '127.0.0.1:3000'
  S3_FORCE_PATH_STYLE: true
  S3_DISABLE_SSL: false
  S3_SKIP_SSL_VERIFY: true
  bucket: "my-etlx-bucket"
  key: "exports/summary_YYYYMMDD.xlsx"
  source: "reports/summary.xlsx"
active: true
```
## S3 DOWNLOAD
```yaml metadata
name: DownloadFromS3
description: "Download file from S3 bucket"
type: s3_download
params:
  AWS_ACCESS_KEY_ID: '@AWS_ACCESS_KEY_ID'
  AWS_SECRET_ACCESS_KEY: '@AWS_SECRET_ACCESS_KEY'
  AWS_REGION: '@AWS_REGION'
  AWS_ENDPOINT: '127.0.0.1:3000'
  S3_FORCE_PATH_STYLE: true
  S3_DISABLE_SSL: false
  S3_SKIP_SSL_VERIFY: true
  bucket: "my-etlx-bucket"
  key: "exports/summary_YYYYMMDD.xlsx"
  target: "reports/summary.xlsx"
active: true
```
📥 ACTIONS – db_2_db (Cross-Database Write)
At the moment, DuckDB does not support direct integration with databases such as MSSQL, DB2, or Oracle in the same way it does with SQLite, Postgres, or MySQL.
To bridge this gap, the db_2_db action type allows you to query data from one database (source) and write the results into another (target), using ETLX’s internal execution engine (powered by sqlx or ODBC).
✅ Use Case
Use db_2_db when:
- Your database is not accessible with DuckDB.
- You want to move data from one place to another using pure SQL, chunked if necessary.
🧩 Example
...
## WRITE_RESULTS_MSSQL
```yaml metadata
name: WRITE_RESULTS_MSSQL
description: "MSSQL example – moving logs into a SQL Server database."
type: db_2_db
params:
  source:
    conn: sqlite3:database/HTTP_EXTRACT.db
    before: null
    chunk_size: 1000
    timeout: 30
    sql: origin_query
    after: null
  target:
    conn: mssql:sqlserver://sa:@MSSQL_PASSWORD@localhost?database=master&connection+timeout=30
    timeout: 30
    before:
      - create_schema
    sql: mssql_sql
    after: null
active: true
```
```sql
-- origin_query
SELECT "description", "duration", STRFTIME('%Y-%m-%d %H:%M:%S', "start_at") AS "start_at", "ref"
FROM "etlx_logs"
ORDER BY "start_at" DESC
```
```sql
-- create_schema
IF NOT EXISTS (SELECT * FROM sys.tables WHERE name = 'etlx_logs' AND type = 'U')
    CREATE TABLE [dbo].[etlx_logs] (
        [description] NVARCHAR(MAX) NULL,
        [duration] BIGINT NULL,
        [start_at] DATETIME NULL,
        [ref] DATE NULL
    );
```
```sql
-- mssql_sql
INSERT INTO [dbo].[etlx_logs] ([:columns]) VALUES
```
🛠️ Notes
- You can define `before` and `after` SQL on both the `source` and `target` sides.
- `[:columns]` will be automatically expanded with the column list.
- Data is inserted in chunks using the provided `chunk_size`.
- Compatible with any driver supported by `sqlx` or any database that has an ODBC driver.
📝 Note: All paths and dynamic references (such as `YYYYMMDD`) are replaced at runtime with the referenced date.
You can also use environment variables via `@ENV_NAME`.
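For example, a hypothetical `params` fragment using both kinds of placeholder (assuming the run's reference date is 2024-05-31 and an environment variable named `FTP_PASSWORD` is set):

```yaml
params:
  # YYYYMMDD is replaced with the reference date, e.g. /data/daily_report_20240531.csv
  source: "/data/daily_report_YYYYMMDD.csv"
  # @FTP_PASSWORD is replaced with the value of the FTP_PASSWORD environment variable
  password: "@FTP_PASSWORD"
```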
⚠️ Note on S3 Configuration
When using `s3_upload` or `s3_download`, ETLX will look for the required AWS credentials and configuration in the parameters you provide in your ACTIONS block, such as:
```yaml
AWS_ACCESS_KEY_ID: '@AWS_ACCESS_KEY_ID'
AWS_SECRET_ACCESS_KEY: '@AWS_SECRET_ACCESS_KEY'
AWS_REGION: '@AWS_REGION'
AWS_ENDPOINT: '127.0.0.1:3000'
S3_FORCE_PATH_STYLE: true
S3_DISABLE_SSL: false
S3_SKIP_SSL_VERIFY: true
```
🧠 If these parameters are not explicitly defined, ETLX will fall back to the system’s environment variables with the same names. This allows for better compatibility with tools like the AWS CLI, Docker secrets, and `.env` files.
This behavior ensures flexible support for local development, staging environments, and production deployments where credentials are injected at runtime.
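As a sketch of that fallback, the `s3_upload` example above could omit the AWS parameters entirely, provided `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION` are exported in the environment (the action name below is a placeholder):

```yaml metadata
name: ArchiveToS3EnvFallback
description: "Upload to S3 using credentials taken from environment variables"
type: s3_upload
params:
  bucket: "my-etlx-bucket"
  key: "exports/summary_YYYYMMDD.xlsx"
  source: "reports/summary.xlsx"
active: true
```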
⚠️ Security Warning: User-Defined Actions
❗ Dangerous if misused
Allowing users to define or influence `ACTIONS` (e.g. file copy, upload, or download steps) introduces potential security risks such as:
- Arbitrary file access or overwrite
- Sensitive file exposure (e.g. `/etc/passwd`)
- Remote execution or data exfiltration
🔐 Best Practices
- Restrict file paths using whitelists (`AllowedPaths`) or path validation.
- Never accept unvalidated user input for action parameters like `source`, `target`, or `url`.
- Use read-only or sandboxed environments when possible.
- Log and audit every `ACTIONS` block executed in production.

📌 If you’re using ETLX as a library (Go or Python), you must sanitize and scope what the runtime has access to.