dbtrail

Troubleshooting

Common issues and how to resolve them

Server registration

"Server limit reached"

Your plan's server limit has been reached. Upgrade your plan from Dashboard → Settings → Billing to register more servers.

Plan        Max servers
Free        1
Pro         5
Premium     20
Enterprise  Unlimited

Server stuck in "pending" status

The server is waiting for EC2 assignment. This usually resolves within a few minutes. If it persists:

  1. Check that the server's connection details are correct
  2. Verify the MySQL host is reachable from the dbtrail VPC
  3. Contact support if the status doesn't change after 10 minutes

SSH tunnel "Connection Failed"

For SSH tunnel servers, the connection fails and a Connection Failed modal appears if the jumphost firewall does not allow inbound traffic from dbtrail's outbound IP. To fix it:

  1. Note the outbound IP shown in the modal (currently 44.237.184.31)
  2. Add an inbound rule to your jumphost's security group: allow TCP port 22 from that IP
  3. Click Retry in the modal to resume onboarding

Server shows "error" status

An unrecoverable error occurred during provisioning. Common causes:

  • MySQL host is unreachable
  • Invalid MySQL credentials
  • MySQL user lacks required privileges (REPLICATION SLAVE, REPLICATION CLIENT)
  • Binary logging is not enabled (log_bin = OFF)
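
The first cause can be ruled in or out with a plain TCP probe before you look at credentials or privileges. The following is a minimal sketch, assuming Python 3; the function name tcp_reachable is illustrative and not part of dbtrail. A successful probe from your own machine does not prove the host is reachable from the dbtrail VPC, so run it from a host with a comparable network path where possible.

```python
import socket


def tcp_reachable(host, port, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    `connect` is injectable so the check can be exercised without a network.
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, tcp_reachable("mysql.example.internal", 3306) probes a hypothetical host on MySQL's default port. A True result only shows that the port accepts connections; it says nothing about credentials, privileges, or log_bin.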

Duplicate byos-N records

You pre-registered a server (e.g. bintrail-demo-tiny) via the dashboard or POST /api/v1/servers, then started the agent with a numeric --server-id <N>. Now your dashboard shows two cards for the same MySQL host:

  • The pre-registered card (bintrail-demo-tiny) sits at 0 events with onboarding_step=waiting_for_agent.
  • A duplicate card named byos-<N> (e.g. byos-202) accumulates the real event count.

This happens because the agent's first metadata batch hit ingest._auto_register_server before any binding to the pre-registered record was established. The fix has two parts.

1. Prevent recurrence (going forward)

Add --server-uuid <UUID> to the agent's systemd unit. The UUID is the pre-registered server's id from the dashboard URL (/app/servers/<UUID>) or GET /api/v1/servers:

ExecStart=/usr/local/bin/bintrail agent \
    --api-key bt_live_… \
    --endpoint wss://api.dbtrail.com/v1/agent \
    --server-id 202 \
    --server-uuid 183819c0-5742-400a-b1ae-cfb4f63dbdd4 \
    --source-dsn …

Reload and restart: sudo systemctl daemon-reload && sudo systemctl restart bintrail-agent. From this connection onward, the WebSocket (WS) handshake binds to your pre-registered record.

--server-id is numeric only

Do not pass the UUID to --server-id. That flag is an unsigned integer, so the agent fails on startup with strconv.ParseUint: invalid syntax. Pass the UUID through the separate --server-uuid flag instead.
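
If you template the systemd unit, a small pre-flight check catches a swapped value before the agent restarts. This is a sketch, not part of the bintrail CLI: check_agent_ids is a hypothetical helper, and it only checks the shape of the two values (it does not check integer overflow or whether the UUID exists in your tenant).

```python
import uuid


def check_agent_ids(server_id, server_uuid):
    """Return a list of problems with the two identifier flags (empty if none)."""
    problems = []
    # --server-id is parsed as an unsigned integer, so only ASCII digits pass.
    if not (server_id.isascii() and server_id.isdigit()):
        problems.append("--server-id must be a non-negative integer, got %r" % server_id)
    # --server-uuid should be the pre-registered server's UUID.
    try:
        uuid.UUID(server_uuid)
    except ValueError:
        problems.append("--server-uuid is not a valid UUID: %r" % server_uuid)
    return problems
```

With the values from the unit above, check_agent_ids("202", "183819c0-5742-400a-b1ae-cfb4f63dbdd4") returns an empty list; passing the UUID as the first argument reports one problem.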

2. Clean up existing duplicates

Binding the WebSocket handshake with --server-uuid does not migrate existing event history. To redirect new events from byos-<N> to your pre-registered record, decommission the duplicate and reassign its bintrail_id:

BEGIN;

-- Decommission the duplicate so it stops matching ingest lookups
UPDATE <tenant_schema>.registered_servers
SET status = 'decommissioned'
WHERE name = 'byos-202';

-- Reassign the bintrail_id to the pre-registered record
-- (the partial unique index allows this once byos-202 is decommissioned)
UPDATE <tenant_schema>.registered_servers
SET bintrail_id = '202'
WHERE name = 'bintrail-demo-tiny';

COMMIT;

The uq_registered_servers_bintrail_id index is a partial index (WHERE status != 'decommissioned'), so decommissioning the duplicate first and then reassigning its bintrail_id is safe within one transaction.
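
To see why the order of the two statements matters, the behaviour of a partial unique index can be reproduced locally. The sketch below uses an in-memory SQLite database as a stand-in for the real tenant schema. The three-column table layout and the 'active' status value are assumptions made for illustration; only the index name, its WHERE clause, and the two UPDATE statements come from this guide.

```python
import sqlite3


def demo_decommission_then_reassign():
    """Show that the reassign only succeeds after the duplicate is decommissioned."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE registered_servers (
            name        TEXT PRIMARY KEY,
            status      TEXT NOT NULL,
            bintrail_id TEXT
        );
        -- Uniqueness is enforced only for rows that are not decommissioned.
        CREATE UNIQUE INDEX uq_registered_servers_bintrail_id
            ON registered_servers (bintrail_id)
            WHERE status != 'decommissioned';
        INSERT INTO registered_servers VALUES
            ('bintrail-demo-tiny', 'active', NULL),
            ('byos-202',           'active', '202');
        """
    )

    # Wrong order: reassigning while byos-202 is still live violates the index.
    try:
        conn.execute(
            "UPDATE registered_servers SET bintrail_id = '202' "
            "WHERE name = 'bintrail-demo-tiny'"
        )
        reassign_first_allowed = True
    except sqlite3.IntegrityError:
        reassign_first_allowed = False
    conn.rollback()

    # Documented order, in one transaction: decommission, then reassign.
    with conn:
        conn.execute(
            "UPDATE registered_servers SET status = 'decommissioned' "
            "WHERE name = 'byos-202'"
        )
        conn.execute(
            "UPDATE registered_servers SET bintrail_id = '202' "
            "WHERE name = 'bintrail-demo-tiny'"
        )

    rows = conn.execute(
        "SELECT name, status, bintrail_id FROM registered_servers ORDER BY name"
    ).fetchall()
    conn.close()
    return reassign_first_allowed, rows
```

In this stand-in, the reassign-first attempt is rejected, while the decommission-then-reassign transaction leaves both rows holding bintrail_id 202 with only the pre-registered one still counted by the index.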

Within seconds the next POST /v1/events batch matches the pre-registered record (via _resolve_server's bintrail_id fallback) and events start accumulating in the correct bt_<server_uuid> database on the metadata EC2.

The duplicate's historical events stay in their original bt_<duplicate_uuid> database — they're orphaned but not destroyed. Run DROP DATABASE bt_<duplicate_uuid> on the metadata EC2 to reclaim disk space when you're ready.

Background: see BYOS Architecture → Server identity.

Stream issues

Stream not capturing changes

  1. Verify the stream is active: Servers → your server → Status
  2. Check that binary logging is enabled and uses row format:
    SHOW VARIABLES LIKE 'log_bin';        -- Must be ON
    SHOW VARIABLES LIKE 'binlog_format';  -- Must be ROW
  3. Confirm the bintrail MySQL user has replication privileges
  4. Check if schema/table filters are excluding the tables you expect

Missing or incomplete changes for specific tables

If some tables show no changes or unreliable data while others work fine, check that those tables use InnoDB and have a primary key:

-- Check a specific table's engine and keys
SHOW CREATE TABLE your_schema.your_table;

  • Non-InnoDB tables (e.g., MyISAM) are non-transactional; failed or interrupted statements can produce partial row events that dbtrail cannot reliably interpret. Convert them: ALTER TABLE your_schema.your_table ENGINE=InnoDB;
  • Tables without a primary key require full table scans for row identification during binlog processing and may be ambiguous when duplicate rows exist. Add a primary key: ALTER TABLE your_schema.your_table ADD PRIMARY KEY (your_column); (replace your_column with the column or columns that uniquely identify each row).

Caution: ALTER TABLE on large tables can lock the table and take significant time. Use an online schema change tool like pt-online-schema-change or gh-ost for production databases.

See the Quick Start prerequisites for queries that find all affected tables at once.
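
If you have already captured SHOW CREATE TABLE output for a handful of tables, the two checks above can be scripted. This is a rough sketch under stated assumptions: table_problems is a hypothetical helper, it matches on the text of the DDL rather than querying MySQL, and a column comment containing the words PRIMARY KEY would fool it.

```python
import re


def table_problems(create_table_sql):
    """List reasons a table may capture unreliably, based on its CREATE TABLE text."""
    problems = []
    # SHOW CREATE TABLE prints the storage engine as ENGINE=<name>.
    engine = re.search(r"\bENGINE=(\w+)", create_table_sql, re.IGNORECASE)
    if engine is None:
        problems.append("engine not found in DDL")
    elif engine.group(1).lower() != "innodb":
        problems.append("engine is %s, not InnoDB" % engine.group(1))
    # A primary key appears either inline or as a PRIMARY KEY (...) clause.
    if not re.search(r"\bPRIMARY\s+KEY\b", create_table_sql, re.IGNORECASE):
        problems.append("no primary key")
    return problems
```

An empty list means the table passed both checks; anything else names what to fix with the ALTER TABLE statements above.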

Stream shows "disconnected"

The agent lost the MySQL connection. It will auto-retry. Common causes:

  • MySQL server restart
  • Network interruption
  • MySQL connection timeout

The stream will resume from its last checkpoint when the connection is restored.

Authentication

"Invalid API key" error

  • Verify the key hasn't been revoked (check Dashboard → Settings → API Keys)
  • Check for extra whitespace or missing characters in the key
  • Ensure the key format matches: bt_live_<32 hex chars>
  • Confirm the key hasn't expired
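
The whitespace and format checks are easy to automate before a key ever reaches the API. A minimal sketch, assuming the bt_live_<32 hex chars> format stated above; api_key_problem is a hypothetical helper, and accepting upper-case hex digits is an assumption. It cannot tell you whether a well-formed key has been revoked or has expired.

```python
import re

# Prefix and length come from the documented format; case-insensitive hex is assumed.
API_KEY_PATTERN = re.compile(r"bt_live_[0-9a-fA-F]{32}")


def api_key_problem(key):
    """Return a description of the formatting problem, or None if the key looks well-formed."""
    if key != key.strip():
        return "leading or trailing whitespace"
    if not key.startswith("bt_live_"):
        return "missing bt_live_ prefix"
    if API_KEY_PATTERN.fullmatch(key) is None:
        return "expected 32 hex characters after the prefix"
    return None
```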

"Not a member of this tenant" error

The authenticated user is not a member of the tenant specified in the X-Tenant-ID header. This can happen if:

  • You're using the wrong tenant ID
  • Your membership was removed by an admin
  • You're using a JWT (not an API key) and omitted the X-Tenant-ID header

Claude connection

Tools not appearing

  1. Verify your API key is valid and not revoked
  2. Check the connection URL: https://api.dbtrail.com/mcp
  3. Ensure the Authorization header is set correctly
  4. Restart your AI app after configuration changes

"Permission denied" on tools

Your role doesn't have the required permission. Connecting Claude requires mcp:connect (analyst and above — viewers do not have this). Specific tools require additional permissions:

  • list_servers: mcp:connect (analyst and above)
  • query: query:execute (analyst and above)
  • recover: recover:execute (operator and above)
  • status: status:read (analyst and above)
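
The same mapping can be written down as a lookup, which is handy when you are deciding which role to grant a teammate. This is an illustrative sketch only: the ordering viewer < analyst < operator is inferred from the wording "and above", roles higher than operator are left out because this page does not name them, and can_use_tool is a hypothetical helper rather than a dbtrail API.

```python
# Inferred ordering; roles above operator are omitted because they are not named here.
ROLE_RANK = {"viewer": 0, "analyst": 1, "operator": 2}

# tool -> (required permission, lowest role that holds it), as listed above.
TOOL_REQUIREMENTS = {
    "list_servers": ("mcp:connect", "analyst"),
    "query": ("query:execute", "analyst"),
    "recover": ("recover:execute", "operator"),
    "status": ("status:read", "analyst"),
}


def can_use_tool(role, tool):
    """Return True if `role` is at or above the lowest role allowed to call `tool`."""
    _permission, minimum_role = TOOL_REQUIREMENTS[tool]
    return ROLE_RANK[role] >= ROLE_RANK[minimum_role]
```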

Rate limiting

"Rate limit exceeded" (429)

You've exceeded the rate limit for your plan. Wait and retry, or upgrade your plan for higher limits.

Plan        Per-user (rpm)  Per-tenant (rpm)
Free        60              120
Pro         200             600
Premium     600             2,000
Enterprise  2,000           10,000
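
"Wait and retry" is usually implemented as exponential backoff. The sketch below is one way to do it on the client side, under stated assumptions: send is a caller-supplied function returning (status_code, headers), call_with_backoff is a hypothetical helper, and the Retry-After header is honoured only if present (this page does not say whether dbtrail sends it).

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    Returns the last (status_code, headers) pair.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            # Assumed header: wait exactly as long as the server asks.
            delay = float(retry_after)
        else:
            # Exponential backoff capped at max_delay, with jitter so that
            # many clients do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            delay = random.uniform(ceiling / 2, ceiling)
        sleep(delay)
```

Retrying harder does not raise the per-user or per-tenant ceiling in the table above; if requests are throttled continuously, reduce the request rate or upgrade the plan.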

ProxySQL time-travel (Beta)

Errors you may see when running queries through dbtrail-shim (the MySQL wire-protocol server that handles _flashback, _diff, and _snapshot virtual schemas). See the ProxySQL time-travel guide for setup.

ERROR 2002 (HY000): Can't connect to dbtrail backend

The shim could not reach the dbtrail agent — DNS failure, connection refused, or the agent died mid-response. Check:

  1. The agent is running and reachable from the host the shim is on
  2. agent_url in shim.yaml is correct and includes the scheme (https://)
  3. Outbound TCP from the shim host to the agent's port (default 8600) is open
  4. The agent's TLS certificate is valid (for testing only, you can instead switch agent_url to http://)

The connection stays open after this error — the next query on the same connection can succeed once the agent recovers.

ERROR 1317 (70100): dbtrail backend returned HTTP <code>

The agent returned a 5xx. This is an agent-side problem, not a query problem — your SQL was well-formed and the request reached the agent. Check the agent logs for the underlying error and retry the query.

ERROR 1064 (HY000): dbtrail backend rejected request (HTTP <code>)

The agent returned a 4xx. The query reached the agent but was rejected — typically because the request shape was wrong (missing server_id, unknown table, malformed timestamp). The error body included in the message names the specific issue.
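
The three backend errors above point at three different layers, so it can help to sort client-side error logs by them. A small sketch, assuming the message text matches the headings in this section exactly; triage is a hypothetical helper and the advice strings are summaries of the guidance above, not shim output.

```python
import re

# (pattern, layer, advice); message fragments are copied from the headings above.
_RULES = [
    (re.compile(r"ERROR 2002 \(HY000\): Can't connect to dbtrail backend"),
     "connectivity", "check agent_url, network path, and TLS"),
    (re.compile(r"ERROR 1317 \(70100\): dbtrail backend returned HTTP (\d{3})"),
     "agent", "check the agent logs, then retry the query"),
    (re.compile(r"ERROR 1064 \(HY000\): dbtrail backend rejected request \(HTTP (\d{3})\)"),
     "request", "read the error body for the specific issue"),
]


def triage(message):
    """Return (layer, http_code, advice) for a backend error, or None if unrecognised."""
    for pattern, layer, advice in _RULES:
        match = pattern.search(message)
        if match:
            # Only the HTTP-status rules capture a code.
            http_code = int(match.group(1)) if pattern.groups else None
            return layer, http_code, advice
    return None
```

Messages that start with dbtrail-shim: are raised by the shim itself and are deliberately left unmatched here.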

ERROR 1064 (HY000): dbtrail-shim: time-travel queries require a PK predicate

You ran a time-travel query without WHERE <pk> = …, WHERE <pk> IN (…), or WHERE <pk> BETWEEN … AND …. Time-travel queries are point/PK-batch oriented — without a PK predicate the agent would have to reconstruct every row at the target time. Add a PK filter:

SELECT * FROM _flashback.orders AS OF '2026-04-27 09:00' WHERE id = 42;

ERROR 1064 (HY000): dbtrail-shim: cannot join virtual schema … with a real table

The query joins a virtual schema (_flashback / _diff / _snapshot) with a real table in the same statement. The shim has no path to fetch real-schema rows during beta. Workarounds:

  • Run two queries client-side and join in your application
  • Materialize the virtual side into a subquery the application reads first

Federated planning (real + virtual JOINs in a single statement) is on the post-beta roadmap.

ERROR 1064 (HY000): dbtrail-shim: _diff window exceeds limit

Your _diff.<t> BETWEEN '<a>' AND '<b>' span is wider than limits.max_diff_window in shim.yaml (default 24h). Either narrow the range or raise the limit in the shim config. A per-session override, @dbtrail_allow_wide_diff = ON, is also planned, but parser support for it lands in a follow-up release.
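
Narrowing the range often means splitting one wide window into several that each fit under the limit. The following is a sketch, assuming the default 24h limit and the 'YYYY-MM-DD HH:MM' timestamp format used in the examples on this page; both helper names are hypothetical, and treating a span of exactly 24h as allowed is an assumption.

```python
from datetime import datetime, timedelta

DEFAULT_MAX_DIFF_WINDOW = timedelta(hours=24)  # default stated for limits.max_diff_window
TS_FORMAT = "%Y-%m-%d %H:%M"


def diff_window_ok(start, end, limit=DEFAULT_MAX_DIFF_WINDOW):
    """Return True if the span from start to end does not exceed `limit`."""
    span = datetime.strptime(end, TS_FORMAT) - datetime.strptime(start, TS_FORMAT)
    if span < timedelta(0):
        raise ValueError("end precedes start")
    return span <= limit


def split_window(start, end, limit=DEFAULT_MAX_DIFF_WINDOW):
    """Split [start, end] into consecutive (start, end) pairs no wider than `limit`."""
    lo = datetime.strptime(start, TS_FORMAT)
    hi = datetime.strptime(end, TS_FORMAT)
    chunks = []
    while lo < hi:
        nxt = min(lo + limit, hi)
        chunks.append((lo.strftime(TS_FORMAT), nxt.strftime(TS_FORMAT)))
        lo = nxt
    return chunks
```

Because BETWEEN is inclusive at both ends, adjacent chunks share their boundary instant, so a change recorded exactly on a boundary may appear in two result sets; deduplicate in the application if that matters.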

ERROR 1815 (HY000): dbtrail-shim: <details>

Internal error — the agent returned success: false, the shim hit a recovered panic, or something we didn't categorize. Capture the full message and the SQL that triggered it and email support.

unknown tenant for MySQL user "<name>"

The MySQL user authenticated by ProxySQL is not listed in shim.yaml's tenants[]. Every user authorized to issue time-travel queries must appear in both ProxySQL's mysql_users and the shim's tenants[]. Add the user to shim.yaml and restart the shim.

Queries time out without an error

The shim does not impose its own per-query timeout — ProxySQL is the source of truth via mysql-default_query_timeout and mysql-connect_timeout_server. If your query is killed without a recognisable shim error, ProxySQL is the layer that gave up. Raise those values if your _diff window or recover range legitimately needs longer.

Getting help

If you can't resolve an issue using this guide, contact support at support@dbtrail.com with:

  • Your tenant ID
  • The server ID (if applicable)
  • A description of the issue
  • Any error messages you received
