Backup Strategy
Why full database dumps matter for binlog-based change tracking and how dbtrail implements them
dbtrail tracks every row-level change by reading MySQL binary logs (binlogs). But binlogs alone aren't enough to protect your data. This guide explains why periodic full dumps are essential, where they help in practice, and how dbtrail's backup system works.
Why binlogs alone aren't enough
Binary logs record incremental changes — every INSERT, UPDATE, and DELETE as it happens. They're a stream of deltas, not a picture of the database. This creates three problems:
- No starting point. If all you have is a stream of changes, you can't reconstruct the database without knowing what it looked like before those changes started. A binlog that says "row 42 was updated from X to Y" is useless if you don't know the full state at the time streaming began.
- MySQL purges old binlogs. The `expire_logs_days` (or `binlog_expire_logs_seconds`) setting controls how long MySQL keeps binlog files. Once they're purged, that history is gone forever — there's no way to recover it.
- Gaps are permanent. If streaming is interrupted (network issues, server restarts, agent downtime) and the binlogs covering that gap have been purged by the time streaming resumes, those changes are lost. There's no way to replay what happened during the gap.
A full dump solves all three problems by creating a checkpoint — a complete snapshot of the database at a known point in time.
The relationship between dumps and binlogs
Think of it like accounting:
- A full dump is a balance sheet — the complete state of every account at a specific date
- Binlogs are the journal entries — every individual transaction that happened since the last balance sheet
Neither is useful alone. The balance sheet without journal entries is stale the moment it's taken. Journal entries without a balance sheet have no starting point. Together, they give you a complete, auditable history.
How they work together
```
[Full Dump T0] ─── binlog stream ───> [Full Dump T1] ─── binlog stream ───> [now]
      │                                     │
      │  Every change between T0 and T1     │  Every change between T1 and now
      │  is captured in the binlog stream   │  is captured in the binlog stream
      │                                     │
      └── Can restore to any point          └── Can restore to any point
          between T0 and T1                     between T1 and now
```

Periodic dumps establish known-good states. Binlogs fill in everything between checkpoints. Together they enable point-in-time recovery to any moment after the earliest available dump.
Dump frequency
How often you take full dumps depends on your recovery requirements and binlog retention. A common starting point: dump weekly if your binlog retention is 7+ days. This ensures you always have at least one dump within the binlog retention window.
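The rule of thumb above reduces to a direct comparison of the two intervals. A minimal sketch, with illustrative values rather than dbtrail defaults:

```python
# Sanity check: a dump interval longer than binlog retention leaves
# windows with no valid restore path. Values here are illustrative.
binlog_retention_days = 7   # e.g. binlog_expire_logs_seconds / 86400
dump_interval_days = 7      # weekly dumps

# A point-in-time restore needs a dump taken within the binlog window,
# so the dump interval must not exceed the retention period.
safe = dump_interval_days <= binlog_retention_days
print("safe" if safe else "at risk")   # safe
```

If the check fails, either dump more often or raise binlog retention; both close the gap.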
Where full dumps are helpful
1. Point-in-time recovery (PITR)
The most common use case. Imagine you accidentally dropped a table on Tuesday at 2:15 PM. With a full dump from Monday night and the binlog stream covering Monday night through Tuesday afternoon, you can:
- Restore Monday night's dump
- Replay binlogs up to Tuesday at 2:14 PM — one minute before the accident
Without the dump, you'd have no base state to replay onto.
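The two recovery steps can be sketched as command construction. `myloader` (mydumper's restore tool) and `mysqlbinlog --stop-datetime` are real tools, but every path, file name, and timestamp below is illustrative, and the commands are only built, not executed:

```python
# Point-in-time recovery sketch: restore the dump, then replay binlogs
# up to one minute before the accident.
from datetime import datetime, timedelta

accident = datetime(2026, 3, 10, 14, 15)          # Tuesday, 2:15 PM
stop = accident - timedelta(minutes=1)            # replay cutoff: 2:14 PM

restore_cmd = [
    "myloader",                                   # mydumper's restore tool
    "--directory", "/backups/monday-dump",        # illustrative path
    "--overwrite-tables",
]
replay_cmd = [
    "mysqlbinlog",
    f"--stop-datetime={stop:%Y-%m-%d %H:%M:%S}",  # stop just before the accident
    "/var/lib/mysql/binlog.000123",               # illustrative binlog file
]
print(replay_cmd[1])   # --stop-datetime=2026-03-10 14:14:00
```

The output of `mysqlbinlog` would then be piped into the restored server to bring it forward to the cutoff.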
2. Catastrophic recovery
Sometimes surgical recovery (undoing a few specific rows via binlog analysis) isn't enough. If a large-scale corruption or accidental mass deletion affects many tables, you need the entire database restored to a known-good state. That requires a full dump.
3. Binlog expiration safety net
MySQL will eventually purge old binlog files. If your binlog retention is 7 days and something goes wrong on day 8, the binlogs covering the incident may already be gone. A full dump taken within the retention window guarantees you can still recover — you're never more than one dump interval away from a valid restore point.
4. New replica bootstrapping
Setting up a new MySQL replica requires two things: a consistent snapshot of the data, and the exact binlog position that snapshot corresponds to. A full dump provides both. The replica loads the snapshot, then starts replication from the recorded binlog position to catch up to the primary.
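mydumper records the snapshot's binlog coordinates in a `metadata` file alongside the dump. A sketch of extracting them — the exact file layout varies by mydumper version, so treat this sample as an assumption:

```python
# Parse binlog coordinates from a mydumper-style metadata file.
# The sample text mimics the format; real files may differ by version.
sample_metadata = """\
Started dump at: 2026-03-13 14:15:00
SHOW MASTER STATUS:
\tLog: mysql-bin.000042
\tPos: 1337
"""

coords = {}
for line in sample_metadata.splitlines():
    line = line.strip()
    if line.startswith("Log:"):
        coords["file"] = line.split(":", 1)[1].strip()
    elif line.startswith("Pos:"):
        coords["position"] = int(line.split(":", 1)[1].strip())

# A new replica loads the dump, then starts replication from here:
print(coords)   # {'file': 'mysql-bin.000042', 'position': 1337}
```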
5. Drift detection and verification
Over time, databases can drift from their expected state — manual edits, replication bugs, or schema changes that bypass normal workflows. By comparing the current database state against the last dump plus all applied binlogs since that dump, you can detect untracked changes that didn't come through the normal binlog stream.
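The comparison step can be sketched with per-table checksums. The tables and rows below are toy stand-ins for real result sets, and the hashing scheme is one simple choice among many:

```python
# Drift-detection sketch: checksum the expected state (last dump plus
# replayed binlogs) against the live database, table by table.
import hashlib

def table_checksum(rows):
    """Hash a table's rows in a deterministic order."""
    h = hashlib.sha256()
    for row in sorted(map(repr, rows)):
        h.update(row.encode())
    return h.hexdigest()

expected = {"users": [(1, "ada"), (2, "bob")], "orders": [(10, 99.5)]}
live     = {"users": [(1, "ada"), (2, "bobby")], "orders": [(10, 99.5)]}

drifted = [t for t in expected
           if table_checksum(expected[t]) != table_checksum(live[t])]
print(drifted)   # ['users'] — a change that bypassed the binlog stream
```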
6. Compliance and auditing
Many regulatory frameworks require periodic snapshots alongside change logs. Full dumps at regular intervals, combined with the continuous binlog stream, create verifiable checkpoints that auditors can use to confirm data integrity at specific points in time.
7. Migration and cloning
Full dumps are the standard way to seed new environments. Need a staging environment that mirrors production? Take a dump and restore it. Migrating to a new host? Dump, transfer, restore, then point replication at the new binlog position.
How dbtrail implements backups
dbtrail uses a two-step workflow: dump the database to local disk, then upload the dump files to S3. This separation is intentional:
- Re-upload without re-dumping — if the S3 upload fails (network blip, permission issue), you can retry the upload without repeating the expensive dump
- Schedule independently — dump and upload can run on different schedules if needed
- Dump without S3 — useful for local-only backups or testing
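The first benefit — retrying the upload without repeating the dump — comes down to wrapping only the upload step in retry logic. A minimal sketch; `upload_to_s3` is a hypothetical stand-in that fails twice before succeeding:

```python
# Retry a flaky upload step on its own; the dump on disk is untouched.
import time

def retry(fn, attempts=3, delay=0.01):
    """Retry a step with a fixed delay between attempts."""
    for i in range(attempts):
        try:
            return fn()
        except OSError:
            if i == attempts - 1:
                raise
            time.sleep(delay)

calls = {"n": 0}
def upload_to_s3():            # hypothetical: fails twice, then succeeds
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("network blip")
    return "uploaded"

print(retry(upload_to_s3))     # uploaded
```

Because the expensive dump already exists on disk, only the cheap step is repeated.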
The dump step
dbtrail uses mydumper for parallel logical backups. The Go agent on each EC2 instance (which also runs the bintrail CLI for streaming and recovery) invokes mydumper directly for full control over dump options and error handling. Mydumper splits the dump across multiple threads (default: 4), which is significantly faster than single-threaded mysqldump on large databases.
Options include:
- Schema/table filtering — dump specific schemas or tables instead of the entire server
- Encryption — encrypt dump files at rest using OpenSSL
- Thread count — tune parallelism based on your server's capacity
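Those options map onto mydumper flags. `--threads` and `--regex` are real mydumper options, but the schema name and output path below are illustrative, and encryption happens in a separate OpenSSL step rather than via a mydumper flag:

```python
# Sketch of assembling a mydumper invocation from the options above.
def mydumper_cmd(outdir, threads=4, schema=None):
    cmd = ["mydumper", "--outputdir", outdir, "--threads", str(threads)]
    if schema:                       # dump one schema instead of the server
        cmd += ["--regex", f"^({schema}\\.)"]
    return cmd

cmd = mydumper_cmd("/var/backups/dump", threads=8, schema="app_db")
print(" ".join(cmd))
```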
The upload step
After dumping, files are uploaded to S3 with tenant-prefixed paths:
```
s3://bintrail-backups/<tenant>/<server-name>/2026-03-13/141500/
```

This structure keeps backups organized per tenant and server, with date-based directories for easy browsing and retention management.
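A sketch of how such a key could be derived from the backup timestamp; the bucket, tenant, and server names are illustrative:

```python
# Build a tenant-prefixed S3 path with date/time directories.
from datetime import datetime

def backup_prefix(tenant, server, ts):
    return f"s3://bintrail-backups/{tenant}/{server}/{ts:%Y-%m-%d}/{ts:%H%M%S}/"

ts = datetime(2026, 3, 13, 14, 15, 0)
print(backup_prefix("acme", "prod-mysql-1", ts))
# s3://bintrail-backups/acme/prod-mysql-1/2026-03-13/141500/
```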
Scheduling and retention
From the dbtrail dashboard or API, you can create backup schedules with:
| Setting | Description | Default |
|---|---|---|
| Cron expression | When to run (e.g., `0 2 * * *` for 2 AM daily) | — |
| Backup tool | mydumper or xtrabackup | mydumper |
| Retention | How many days to keep backups (1–365) | 30 days |
| S3 prefix | Custom path prefix within the tenant's S3 namespace | Auto-generated |
Schedules can be enabled or disabled without deleting them. The dashboard shows last run status and backup history with duration, size, and error details.
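The retention setting amounts to dropping date-based directories older than the window. A minimal sketch with illustrative dates:

```python
# Retention sketch: keep only backups within the retention window.
from datetime import date, timedelta

retention_days = 30
today = date(2026, 3, 13)
cutoff = today - timedelta(days=retention_days)   # 2026-02-11

backups = ["2026-01-15", "2026-02-20", "2026-03-06", "2026-03-13"]
keep = [b for b in backups if date.fromisoformat(b) >= cutoff]
print(keep)   # ['2026-02-20', '2026-03-06', '2026-03-13']
```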
On-demand backups
In addition to scheduled backups, you can trigger a backup at any time from the dashboard or via the API:
```shell
curl -X POST https://api.dbtrail.com/api/v1/backup \
  -H "Authorization: Bearer bt_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"server_id": "your-server-uuid"}'
```

The backup runs asynchronously — the API returns immediately with a `backup_id` you can use to track progress in the backup history.
Best practices
- Dump at least as often as your binlog retention period. If MySQL purges binlogs after 7 days, dump at least weekly. This ensures you always have a dump within the binlog window.
- Test your restores. A backup you've never restored is a backup you can't trust. Periodically restore a dump to a test environment to verify it works.
- Monitor backup status. Check the backup history in the dashboard regularly. A silently failing backup schedule is worse than no schedule — it creates a false sense of security.
- Keep at least two retention cycles. If you dump weekly with 30-day retention, you'll always have ~4 good snapshots. If one is corrupted, you have fallbacks.
- Encrypt sensitive dumps. If your database contains PII or other sensitive data, enable encryption on dump files. They're stored at rest in S3 and may be accessed by multiple team members.
Next steps
- Recovery guide — how to use binlog-based change tracking to generate recovery SQL
- Stream configuration — configure binlog streaming, filtering, and checkpoints
- Server configuration — connect dbtrail to your MySQL database