dbtrail Documentation

How dbtrail's backup architecture works — base snapshots, continuous binlog streaming, and point-in-time recovery

dbtrail is your MySQL backup system. It combines full logical snapshots with continuous binary log streaming so every row change is captured, queryable, and restorable to any point in time within your retention window. This guide explains how the architecture works, how to configure it, and how it compares to a DIY mysqldump + cron setup.

This replaces your existing backup job

If you're running mysqldump + cron today, dbtrail replaces it. You get parallel logical dumps, continuous binlog streaming, per-row point-in-time recovery, and a queryable change history — in one system.

How dbtrail backs up your database

dbtrail combines two components into a complete backup system:

Full base snapshots — taken on a schedule with mydumper (parallel logical dumps) via bintrail dump. Each snapshot embeds the exact binlog position and GTID set at the time of the dump, and can be converted to Parquet and uploaded to S3 with bintrail baseline --upload.
Continuous binlog streaming — bintrail stream registers as a MySQL replica and captures every INSERT, UPDATE, and DELETE in real time. Events are indexed for instant query and archived as Parquet for long-term retention.

[Base snapshot T0] ─── continuous binlog stream ───> [Base snapshot T1] ─── stream ───> [now]
        │                          │                          │                   │
        │   Every change           │                          │   Every change    │
        │   captured as it         │                          │   captured as it  │
        │   happens                │                          │   happens         │
        │                          │                          │                   │
        └── Restore to any point ──┴── Restore to any point ──┴── Restore to any point

Together, these give you point-in-time recovery to any moment after the earliest available base snapshot — not just to the last nightly cron.

Why both pieces matter

A base snapshot by itself is stale the moment it's taken. A binlog stream by itself has no starting point — reconstructing state from a stream of deltas requires knowing what came before. dbtrail maintains both so you never have to think about the gap.

Snapshot frequency

More frequent snapshots mean faster point-in-time recovery, because there is less binlog to replay on top of the baseline. A weekly snapshot is fine for most workloads; daily is a typical choice for production databases. Continuous streaming covers everything between snapshots.

Scheduled backups

Self-hosted, scheduling is plain cron or a systemd timer around the snapshot step. A weekly baseline at 2 AM Sunday:

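# /etc/cron.d/bintrail-baseline: weekly baseline dump at 2am Sunday.
# cron does not support backslash line continuation, so the job must stay on one line.
# SOURCE_DSN must be defined in this file; cron does not load your shell profile.
0 2 * * 0 root { bintrail dump --source-dsn "$SOURCE_DSN" --output-dir /tmp/mydumper-weekly && bintrail baseline --input /tmp/mydumper-weekly --output /data/baselines --upload s3://my-bucket/baselines/; } >> /var/log/bintrail-baseline.log 2>&1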

For a hardened version with local-disk pruning, a systemd-timer alternative, and the PITR caveats, see the full recipe under Backups on BYOS below — it applies to any deployment where you own the host.

dbtrail Cloud

dbtrail Cloud manages this step for you. Create a schedule from Dashboard → Backups → New Schedule (cron expression, retention days, S3 prefix) and watch run history with duration, size, and error details. Before a risky migration, trigger an on-demand snapshot with POST /api/v1/backup. Default change-history retention depends on your plan (Free 7 / Pro 30 / Premium 90 / Enterprise 365 days); snapshot retention is set per schedule. See the Backups API reference for endpoints and the retention model.

How dbtrail implements the snapshot step

bintrail dump is a thin wrapper around mydumper: it validates inputs, takes a lockfile so only one dump runs at a time, and invokes mydumper with the right flags (--threads, --compress-protocol, --complete-insert). Mydumper splits the dump across multiple threads (default: 4), which is significantly faster than single-threaded mysqldump on large databases.

bintrail dump \
  --source-dsn "user:pass@tcp(source-db:3306)/" \
  --output-dir /tmp/mydumper-output

Snapshot-step options:

Zero-install mydumper — if no compiled mydumper binary is found on $PATH, bintrail dump automatically runs the official mydumper/mydumper Docker image. Pin a version with --mydumper-image mydumper/mydumper:v1.0.3-1, or force a specific binary with --mydumper-path.
Schema/table filtering — --schemas mydb,otherdb or --tables mydb.orders,mydb.items snapshot a subset instead of the entire server.
Compression — bintrail baseline writes Parquet with zstd compression by default (--compression zstd|snappy|gzip|none) to shrink S3 storage cost.

Note that --output-dir is removed and recreated on each run — don't point it at a directory containing files you want to keep.

bintrail baseline then converts the mydumper output to Parquet (one file per table) and can upload in the same step:

bintrail baseline \
  --input  /tmp/mydumper-output \
  --output /data/baselines \
  --upload s3://my-bucket/baselines/

The snapshot timestamp and binlog position come from mydumper's metadata file — that's what makes the dump usable as a PITR baseline. If a run fails partway through (network error during upload, disk full), re-run with --retry to skip tables and S3 objects that already completed.

dbtrail Cloud

On dbtrail Cloud, the agent on each instance runs this snapshot step for you on the schedule you configure — no cron required. Snapshots land under a tenant- and server-prefixed path in the managed archive bucket (s3://<bucket>/backups/<tenant>/<server-name>/2026-03-13/141500/), in the same bucket as the binlog event archives, with date-based directories for browsing and retention management. See the Backups API reference.

Backups on BYOS

BYOS, the In Your VPC deployment model of dbtrail Cloud, runs the agent in your network and writes Parquet binlog archives directly to an S3 bucket you own. You control the KMS key, bucket policy, and lifecycle. The control plane only sees the metadata index (table, event type, PK, timestamp) over the agent WebSocket, so dbtrail cannot read your row data. You can revoke access at any time by changing the bucket policy; streaming and archiving continue locally on the agent. See the BYOS architecture and security overview for the full trust boundary, and the Local Agents guide for setup.

Because the control plane has no inbound path into your network, dbtrail Cloud does not run snapshots against your MySQL on BYOS. The hosted endpoints return 422 byos_not_supported:

$ curl -X POST https://api.dbtrail.com/api/v1/backup \
    -H "Authorization: Bearer bt_live_..." \
    -H "Content-Type: application/json" \
    -d '{"server_id": "..."}'

{"detail": "BYOS mode: backup operations are delegated to the customer's agent",
 "error_code": "byos_not_supported"}

Instead, you take snapshots locally by running mydumper directly on a host that can reach MySQL (typically the same host as the agent). The binlog stream the agent is already capturing covers everything between snapshots, so the end-to-end PITR story is the same as on Cloud — you just own the snapshot step.

The recipe below is not BYOS-specific — it's the same workflow a plain self-hosted (open-source) deployment uses; only the /etc/bintrail/agent.env credentials file and BINTRAIL_SERVER_ID variable are Cloud-agent-specific (substitute your own credential source and a server identifier of your choosing).

The model

[Local mydumper snapshot]   +   [agent's continuous binlog stream]   =   full PITR coverage
        │                                  │
        ↓                                  ↓
   your S3 bucket                     your S3 bucket (Parquet archives, agent-managed)

Both pieces live side-by-side in your S3 bucket.

One-off snapshot

Run this on any host with network access to your MySQL. It uses the replication user you already created for the agent (step 2 of Local Agents):

TS=$(date -u +%Y%m%dT%H%M%S)
DUMP_DIR=/var/backups/bintrail/$TS
mkdir -p "$DUMP_DIR"

mydumper \
  --host 127.0.0.1 --port 3306 \
  --user bintrail --password "$BINTRAIL_PASSWORD" \
  --outputdir "$DUMP_DIR" \
  --threads 4 \
  --regex '^(?!(mysql|sys|performance_schema|information_schema)\.)' \
  --sync-thread-lock-mode NO_LOCK

# Sync the whole dump directory — the metadata file matters for PITR (see below).
aws s3 sync "$DUMP_DIR" "s3://my-company-dbtrail/backups/<server-id>/$TS/"

Notes:

--sync-thread-lock-mode NO_LOCK skips LOCK INSTANCE FOR BACKUP / FLUSH TABLES WITH READ LOCK. dbtrail's binlog stream already provides consistency, so the lock isn't needed, and skipping it means the dump user does not need the BACKUP_ADMIN or RELOAD privilege. (This matches what the hosted Cloud agent does.)
The --regex excludes MySQL system schemas. To dump only specific databases, replace it with e.g. --regex '^(app|analytics)\..*'.
aws s3 sync uploads the whole directory, including the metadata file. Do not sync just the .sql files — see the PITR note below.
On a plain self-hosted deployment you can use bintrail dump (see above) instead of invoking mydumper directly. It auto-resolves mydumper via Docker and takes a lockfile so concurrent dumps cannot overlap. On mydumper 0.18+ (including the default Docker image) it passes --sync-thread-lock-mode NO_LOCK automatically; on older mydumper builds it falls back to default consistency locking.

Scheduling with cron

Wrap the command above in /usr/local/bin/bintrail-backup.sh and schedule it:

sudo tee /usr/local/bin/bintrail-backup.sh > /dev/null <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

# Load credentials from the same env file the agent uses.
# shellcheck disable=SC1091
source /etc/bintrail/agent.env

TS=$(date -u +%Y%m%dT%H%M%S)
DUMP_DIR=/var/backups/bintrail/$TS
S3_PREFIX=s3://my-company-dbtrail/backups/${BINTRAIL_SERVER_ID}/$TS/

mkdir -p "$DUMP_DIR"

mydumper \
  --host "${MYSQL_HOST:-127.0.0.1}" --port "${MYSQL_PORT:-3306}" \
  --user "${MYSQL_USER:-bintrail}" --password "${MYSQL_PASSWORD}" \
  --outputdir "$DUMP_DIR" \
  --threads 4 \
  --regex '^(?!(mysql|sys|performance_schema|information_schema)\.)' \
  --sync-thread-lock-mode NO_LOCK

aws s3 sync "$DUMP_DIR" "$S3_PREFIX"

# Local retention: keep 3 most recent dumps on disk.
ls -1dt /var/backups/bintrail/*/ | tail -n +4 | xargs -r rm -rf

logger -t bintrail-backup "snapshot uploaded to $S3_PREFIX"
EOF

sudo chmod 750 /usr/local/bin/bintrail-backup.sh
sudo chown root:bintrail /usr/local/bin/bintrail-backup.sh

Then add a crontab entry (daily at 02:00 UTC):

# crontab -e
0 2 * * * /usr/local/bin/bintrail-backup.sh >> /var/log/bintrail-backup.log 2>&1

Or, if you prefer systemd timers:

# /etc/systemd/system/bintrail-backup.service
[Unit]
Description=Bintrail nightly snapshot

[Service]
Type=oneshot
ExecStart=/usr/local/bin/bintrail-backup.sh

# /etc/systemd/system/bintrail-backup.timer
[Unit]
Description=Run bintrail-backup nightly

[Timer]
OnCalendar=*-*-* 02:00:00 UTC
Persistent=true

[Install]
WantedBy=timers.target

sudo systemctl daemon-reload
sudo systemctl enable --now bintrail-backup.timer
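
Whichever scheduler you use, it is worth checking that a recent dump actually exists, since a job that silently stops running produces no errors. The following is a minimal sketch rather than a dbtrail feature: dump_is_fresh is a hypothetical helper that assumes the /var/backups/bintrail/<timestamp>/ layout from the script above and a find that supports -mmin (GNU and BSD find both do). It only inspects the local directory, so it does not prove that the S3 upload succeeded.

```shell
# Hypothetical helper, not a bintrail command.
# Succeeds if at least one dump directory directly under $1 was modified
# within the last $2 minutes.
dump_is_fresh() {
  dump_root=$1
  max_age_min=$2
  newest=$(find "$dump_root" -mindepth 1 -maxdepth 1 -type d -mmin "-$max_age_min" | head -n 1)
  [ -n "$newest" ]
}

# Example: warn if nothing was dumped in the last 26 hours (daily job plus slack).
# dump_is_fresh /var/backups/bintrail 1560 || logger -t bintrail-backup "no recent snapshot"
```

Running a check like this from your monitoring system, rather than from the backup script itself, means it still fires when the script never starts.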

PITR: keep the `metadata` file

After every run, mydumper writes a metadata file into the output dir that looks like:

Started dump at: 2026-05-19 02:00:01

SHOW MASTER STATUS:
        Log: mysql-bin.000148
        Pos: 197834521
        GTID:5d8e...:1-9421317

Finished dump at: 2026-05-19 02:01:44

This is the input to point-in-time recovery: it tells dbtrail where to start replaying binlog events from (bintrail baseline reads the dump timestamp and binlog position from it). The aws s3 sync above already uploads it because it copies the whole directory — but if you customize the upload step (e.g. piping only *.sql files through compression), make sure metadata still ends up next to the dump in S3. Without it, PITR can only restore to the dump-finish moment, not to a precise later target time.
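
If you do customize the upload step, a small guard can catch a missing metadata file before anything is uploaded. The sketch below is illustrative only: require_metadata and metadata_field are hypothetical helpers, and the parsing assumes the exact layout shown above (other mydumper versions may write the file differently, so check a real file from your version first).

```shell
# Hypothetical helpers, not part of mydumper or bintrail.

# Fails unless the dump directory $1 contains a non-empty metadata file.
require_metadata() {
  [ -s "$1/metadata" ] || {
    echo "missing $1/metadata; refusing to upload" >&2
    return 1
  }
}

# Prints the value of the first "Log", "Pos", or "GTID" line in metadata file $1.
# $2 is the field name. Assumes the "Key: value" layout shown above.
metadata_field() {
  awk -v key="$2" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)            # drop leading indentation
    }
    index(line, key ":") == 1 {
      sub(/^[^:]*:[ \t]*/, "", line)      # drop "Key:" and following spaces
      print line
      exit                                # first match only
    }
  ' "$1"
}

# Example:
# require_metadata "$DUMP_DIR" && aws s3 sync "$DUMP_DIR" "$S3_PREFIX"
# metadata_field "$DUMP_DIR/metadata" Pos
```

Stopping at the first match is deliberate: it picks the value under SHOW MASTER STATUS even if a later section of the file repeats the same field names.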

See the PITR guide for how the snapshot + binlog stream combine on restore.

Encryption and retention

Encryption in transit and at rest: aws s3 sync uploads over HTTPS by default. For encryption at rest, set a customer-managed KMS key as the default encryption on the destination bucket (SSE-KMS) so every object uploaded by aws s3 sync is encrypted automatically. mydumper also supports --encrypt-key-file for on-disk encryption if you need defense-in-depth before upload.
Retention — set an S3 lifecycle policy on the backups prefix (e.g. transition to Glacier after 30 days, expire after 365). The script above already prunes local copies to bound disk usage.
Bucket layout — keep snapshots and the Parquet binlog archives in different prefixes of the same bucket so a lifecycle rule on /backups/ doesn't accidentally expire your binlog archives.
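
As a rough illustration of the retention item above, the lifecycle rule (Glacier after 30 days, expiry after 365) could be written as follows. This is a sketch under assumptions: write_backup_lifecycle is a hypothetical helper, the rule ID is arbitrary, and the backups/ prefix and my-company-dbtrail bucket are the example names used earlier on this page. Adjust the prefix so it does not cover your Parquet binlog archives.

```shell
# Hypothetical helper: writes an S3 lifecycle configuration to the path in $1.
# The rule only applies to objects under the backups/ prefix.
write_backup_lifecycle() {
  cat > "$1" <<'JSON'
{
  "Rules": [
    {
      "ID": "snapshot-retention",
      "Filter": { "Prefix": "backups/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
JSON
}

# Apply it (bucket name is the example one from this page):
# write_backup_lifecycle /tmp/lifecycle.json
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket my-company-dbtrail \
#   --lifecycle-configuration file:///tmp/lifecycle.json
```

Note that put-bucket-lifecycle-configuration replaces the bucket's entire lifecycle configuration, so any existing rules need to be included in the same file.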

What's coming

A future SaaS-managed BYOS backup flow will let you register the bintrail-backup.sh step with dbtrail Cloud, so that the dashboard shows backup history and retention status and integrates with the existing PITR UI. Until then, this is a customer-managed workflow.

Restore

The restore side of dbtrail is covered in two places depending on what you need:

Row-level recovery — undo a specific DELETE, revert an UPDATE, or remove an unintended INSERT. bintrail recover generates dry-run SQL you review before applying. See the recovery guide.
Point-in-time restore of whole tables or the whole database — reconstruct the full state at any past moment by combining a base snapshot with binlog events up to the target time (bintrail reconstruct --output-format mydumper). Output is a standard mydumper dump you can import anywhere. See the PITR guide.
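
To make the "base snapshot plus binlog events up to the target time" idea concrete, the sketch below shows only the first half: choosing which snapshot to start from. It is not how bintrail reconstruct is implemented. pick_baseline is a hypothetical helper, and it assumes snapshots are named with fixed-width UTC timestamps in the %Y%m%dT%H%M%S format (as in the BYOS recipe on this page), which sort in time order when compared as strings.

```shell
# Hypothetical helper, not a bintrail command.
# $1 is the target time; the remaining arguments are snapshot timestamps.
# Prints the latest snapshot taken at or before the target, or fails if
# every snapshot is newer than the target (no usable baseline).
pick_baseline() (
  target=$1
  shift
  printf '%s\n' "$@" | sort | awk -v target="$target" '
    $0 <= target { best = $0 }          # sorted input: last match is the latest
    END {
      if (best == "") exit 1
      print best
    }
  '
)

# Example:
# pick_baseline 20260520T120000 20260512T020000 20260519T020000 20260526T020000
```

Everything between the chosen snapshot and the target time then has to come from the binlog stream, which is why a target earlier than the oldest snapshot cannot be restored.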

dbtrail intentionally never executes writes against your MySQL — you always review and apply the output yourself. Self-hosted, both paths run from the CLI, and row-level recovery is also available in the web console and the bundled MCP server; on dbtrail Cloud, both are first-class dashboard flows.

Restoring with Claude

Ask Claude: "Undo the DELETE on user 12345" — it calls the dbtrail recover tool and shows you the generated SQL for review. This works with the open-source MCP server (bintrail-mcp) and the dbtrail Cloud MCP gateway alike. Full-table point-in-time restore is not an MCP operation in either edition — run bintrail reconstruct from the CLI, or use the dashboard/API on dbtrail Cloud. Claude is optional — the CLI, console, dashboard, and API provide the same operations directly.

Compared to DIY `mysqldump` + cron

If you're migrating from a cron-driven dump setup, here's what dbtrail replaces and adds:

Capability	`mysqldump` + cron	dbtrail
Scheduled full dumps	Manual script	`bintrail dump` + cron/systemd; managed schedules with status and run history on dbtrail Cloud
Parallel dump engine	Single-threaded	Mydumper, multi-threaded — auto-resolved via Docker if not installed
S3 upload	Manual (`aws s3 cp`)	Built in (`bintrail baseline --upload`, `bintrail upload`)
Retention enforcement	Manual cleanup script	Snapshots via S3 lifecycle policy; `bintrail rotate` prunes old change-history partitions (archiving them as Parquet); pruned automatically by plan on dbtrail Cloud
Point-in-time recovery	Manual binlog replay	One command (`bintrail reconstruct`) — dbtrail combines the baseline with binlog events up to your target time
Change history queries	—	Every row change indexed and queryable (SQL, console/dashboard, Claude)
Row-level recovery	—	Generates inverse SQL for a single row
Forensics (who changed what)	—	`who_changed`, `user_activity`, `connection_history` — a dbtrail Cloud feature
Monitoring	DIY	`bintrail status` + web console; dashboard and status API on dbtrail Cloud (alerting is DIY on top)
Data stays in your infrastructure	You already own everything	Self-hosted: always. dbtrail Cloud: the BYOS deployment keeps row data in your own S3 bucket

mysqldump still has its place for one-off exports, but it cannot protect the changes made between dumps, and that gap is what dbtrail fills.

Best practices

Keep the interval between snapshots no longer than your binlog retention period. If MySQL purges binlogs after 7 days, snapshot at least weekly so a baseline always exists inside the retention window.
Test your restores. A backup you've never restored is a backup you can't trust. Use the PITR guide to validate end-to-end against a staging MySQL periodically.
Monitor backup status. Check bintrail status and the web console (on dbtrail Cloud, the dashboard or status API) and wire them into your monitoring. A silently failing schedule is worse than no schedule — it creates a false sense of security.
Keep at least two retention cycles. If you snapshot weekly with 30-day retention, you'll always have ~4 good baselines. If one is corrupted, you have fallbacks.
Lock down your S3 bucket. Dumps contain full copies of your row data, and the backup bucket may be reachable by multiple team members. Restrict the bucket to least-privilege IAM and enable SSE, preferably with a customer-managed KMS key.
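
The first and fourth practices above come down to simple arithmetic, which can be scripted as a sanity check. This is a minimal sketch with hypothetical helper names; it assumes MySQL 8.0 or later, where binlog retention is exposed as the binlog_expire_logs_seconds system variable, and a mysql client that can log in without prompting.

```shell
# Hypothetical helpers, not bintrail commands.

# Succeeds if snapshots taken every $2 seconds fit inside a binlog retention
# of $1 seconds. A retention of 0 is treated as "never purged by age", which
# is how MySQL 8.0 interprets binlog_expire_logs_seconds = 0.
snapshot_interval_ok() {
  [ "$1" -eq 0 ] || [ "$2" -le "$1" ]
}

# Number of complete snapshots kept: retention days ($1) divided by the
# snapshot interval in days ($2), rounded down.
baselines_retained() {
  echo $(( $1 / $2 ))
}

# Example (weekly snapshots, 30-day snapshot retention):
# retention=$(mysql -N -B -e 'SELECT @@global.binlog_expire_logs_seconds')
# snapshot_interval_ok "$retention" $((7 * 24 * 3600)) \
#   || echo "snapshot interval exceeds binlog retention" >&2
# baselines_retained 30 7
```

Managed MySQL services may configure binlog retention through their own settings rather than this variable, so treat the query as a starting point.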

Next steps

Point-in-time recovery — reconstruct your database to any past moment
Recovery guide — generate SQL to reverse specific row-level changes
Stream configuration — configure binlog streaming, filtering, and checkpoints
Server configuration — connect dbtrail Cloud to your MySQL database
Backups API reference — dbtrail Cloud endpoint details for programmatic scheduling
