Across 12 test cases, 7 passed and 5 failed, with high-severity defects concentrated in startup and metadata concurrency. A PR-introduced compile-time API mismatch in server/tables/dtables/ignore.go blocks server launch and ReadyForQuery; startup safety/readiness checks miss critical failure modes (the configured data-dir guard runs in the wrong order, the first-run default-DB bootstrap DSN is fragile, and replication errors do not gate readiness); and concurrent renames from the same source can both succeed. Protocol and metadata behavior was otherwise solid in this run: correct bind type inference and same-session bind recovery, proper handling and recovery of an injected default-database creation failure, expected rename-collision rejection, legacy collation normalization fallback, and coherent rollback/transactional rename semantics.
❌ Failed (1)

| Category | Summary | Screenshot |
| --- | --- | --- |
| Protocol | ⚠️ Confirmed compile-time API mismatch prevents server startup, so the StartupMessage handshake cannot reach ReadyForQuery. | |
⚠️ Startup handshake blocked by compile regression
What failed: The server build fails before process startup, so the connection never reaches authentication completion and ReadyForQuery; expected behavior is successful startup and handshake progression.
Impact: Core protocol workflows are unavailable because the server cannot start. Any pgwire test or user connection depending on startup readiness fails until the compile mismatch is fixed.
Steps to reproduce:
Build and start the server using the current dependency set from go.mod.
Observe the build error from server/tables/dtables/ignore.go where GetStringAdaptiveValue is called with an outdated argument list.
Attempt a StartupMessage connection to localhost:5432 and confirm ReadyForQuery is never reached because startup did not complete.
Stub / mock context: SCRAM authentication was temporarily bypassed for local protocol testing, but the observed failure happens earlier at compile time: server/tables/dtables/ignore.go uses an outdated GetStringAdaptiveValue call signature after dependency upgrades.
Code analysis: I inspected server/tables/dtables/ignore.go, server/connection_handler.go, and dependency bump changes in go.mod. getIgnoreTablePatternKey still calls GetStringAdaptiveValue with an outdated argument list, which aligns with the build failure evidence; startup handshake logic itself appears intact but is unreachable because compilation fails first.
Why this is likely a bug: The production source has a deterministic compile-time signature mismatch in a tracked Go file, which prevents server startup and therefore blocks the primary protocol handshake path.
Relevant code:
server/tables/dtables/ignore.go (lines 61-66)
```go
// getIgnoreTablePatternKey reads the pattern key from a tuple and returns it.
func getIgnoreTablePatternKey(ctx context.Context, keyDesc *val.TupleDesc, keyTuple val.Tuple) (string, error) {
	key, ok, err := keyDesc.GetStringAdaptiveValue(0, nil, keyTuple)
	if err != nil {
		return "", err
	}
```
✅ Passed (7)

| Category | Summary | Screenshot |
| --- | --- | --- |
| Protocol | Parse/Bind fallback correctly infers bind variable types when ParameterOIDs are omitted or zero. | |
| Engine | Injected default-database creation failure was surfaced, and restart without injection recovered cleanly. | |
| Engine | First bind failed as expected, then a corrected bind on the same prepared statement succeeded (42) without a reconnect. | |
| Metadata | ALTER TABLE employees RENAME TO teams returned a table-exists error and both original tables remained accessible, matching expected collision behavior. | N/A |
| Metadata | The previous BLOCKED status was due to a missing fixture, not product logic. Source confirms normalization is implemented: when a serialized collation is invalid, the default collation is returned, matching expected compatibility behavior. | N/A |
| Metadata | Injected failure produced a clean operation failure without partial-rename divergence, and retry behavior remained deterministic, satisfying expected atomic/coherent behavior under fault. | N/A |
| Metadata | Dependent DDL and DML in the same transaction resolved against the new table identity consistently, with commit producing coherent catalog objects and data. | N/A |
ℹ️ Additional Findings (4)

These findings are unrelated to the current changes but were observed during testing.

| Category | Summary | Screenshot |
| --- | --- | --- |
| Engine | ⚠️ Startup does not reject an existing configured data directory because the guard runs before the data-dir reload. | |
| Engine | ⚠️ First-run default database bootstrap can fail because the bootstrap connection does not target a guaranteed-existing database. | |
| Engine | ⚠️ Replication startup errors do not gate readiness; the server remains queryable after an unreachable-downstream failure. | |
| Metadata | ⚠️ Confirmed real application bug: concurrent rename operations can both succeed because rename validation/editing occurs against each session root without a guard that forces one operation to fail, matching the previously observed both-success outcome. | |
⚠️ Startup guard misses configured data directory
What failed: Startup should abort when the configured target data directory already contains a Dolt/Doltgres repository, but the server continues startup because the early guard checks the initial cwd-bound environment instead of the configured data directory.
Impact: A core startup safety check is bypassed for configured data directories, so invalid startup states are accepted. Users can unintentionally boot against an existing database directory when startup should fail fast.
Steps to reproduce:
Set DOLTGRES_DATA_DIR to a path that already contains an initialized Dolt/Doltgres repository.
Launch Doltgres from a different current working directory.
Observe that startup proceeds instead of aborting for the existing configured data directory.
Stub / mock context: SCRAM authentication was intentionally bypassed so startup behavior could be tested without auth gating, and startup ran with local harness-driven process control instead of full production auth flow. A local compatibility patch in server/tables/dtables/ignore.go and a generated auth.db fixture were also present during execution.
Code analysis: I reviewed server/server.go and found the HasDoltDataDir check is executed before dEnv is reloaded to ssCfg.DataDir(), so the guard evaluates the wrong filesystem context.
Why this is likely a bug: The guard that should validate the configured data directory runs before switching to that directory, so it can miss the exact invalid condition the test targets.
Relevant code:
server/server.go (lines 93-97)
```go
if dEnv.HasDoltDataDir() {
	cwd, _ := dEnv.FS.Abs(".")
	return nil, errors.Errorf("Cannot start a server within a directory containing a Dolt or Doltgres database. "+
		"To use the current directory (%s) as a database, start the server from the parent directory.", cwd)
}
```
server/server.go (lines 132-139)
```go
// Reload the dolt environment with the correct data dir that was specified in the configuration.
// This initial dEnv instance is loaded for the current working directory.
dataDirFs, err := dEnv.FS.WithWorkingDir(ssCfg.DataDir())
if err != nil {
	return nil, err
}
dEnv = env.Load(ctx, dEnv.GetUserHomeDir, dataDirFs, doltdb.LocalDirDoltDB, dEnv.Version)
```
⚠️ First run bootstrap database connection is fragile
What failed: The first-run bootstrap path can fail while creating the default database because the internal bootstrap connection string omits a guaranteed-existing database context.
Impact: New instances can fail during initial database bootstrap, breaking first-run startup for a core workflow. This can leave users unable to bring up a fresh server without manual intervention.
Steps to reproduce:
Create a fresh empty data directory and set DOLTGRES_DB to postgres.
Start the server and let it enter first-run initialization.
Observe bootstrap failure while creating the default database.
Stub / mock context: SCRAM authentication was intentionally bypassed so startup initialization could be exercised without authentication-handshake failures masking bootstrap behavior. A local compatibility patch in server/tables/dtables/ignore.go and a generated auth.db fixture were also present to stabilize the run environment.
Code analysis: In server/server.go, first-run logic calls createDefaultDatabase after startup. That function builds a DSN without a database name and then executes CREATE DATABASE for the default DB, which can fail when no suitable initial database context exists.
Why this is likely a bug: The bootstrap connection omits an explicit known-existing database, making first-run database creation fragile in the exact path meant to initialize a fresh environment.
Relevant code:
server/server.go

```go
func createDefaultDatabase(cfg doltservercfg.ServerConfig) error {
	user, password := auth.GetSuperUserAndPassword()
	dbName := getDefaultDatabaseName(user)
	dsn := fmt.Sprintf("postgres://%s:%s@localhost:%d", user, password, cfg.Port())
	// Connect to the server and create the default database with the given name.
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, dsn)
	if err != nil {
		return err
	}
	defer conn.Close(ctx)
	_, err = conn.Exec(ctx, fmt.Sprintf("CREATE DATABASE %s;", dbName))
	return err
}
```
⚠️ Startup readiness ignores replication failure
What failed: The server is expected to fail startup (or become unavailable) when replication cannot establish a downstream connection, but it continues accepting SQL traffic in a half-ready state.
Impact: Operators can see a "started" server that is silently running without healthy replication, which risks inconsistent behavior and delayed incident detection. There is no practical workaround for users who rely on startup readiness to guarantee replication health.
Steps to reproduce:
Configure replication with valid syntax but an unreachable downstream host and port.
Start the server and wait for the normal startup-ready signal.
Run a SQL query after replication connection errors occur and observe that queries still succeed.
Stub / mock context: Local SCRAM authentication was bypassed for deterministic access, and replication was intentionally pointed to an unreachable local endpoint to validate readiness behavior under real connection failure.
Code analysis: I inspected server/server.go and server/logrepl/replication.go; startup launches replication in a background goroutine without awaiting its result, while replication connection/runtime errors are returned only inside the goroutine and never propagated to the startup success path.
Why this is likely a bug: Replication failures are real runtime errors in production code, but the startup flow detaches and ignores them, allowing readiness to report success when a required subsystem has already failed.
Relevant code:
server/server.go (lines 188-192)
```go
// TODO: shutdown replication cleanly when we stop the server
_, err = startReplication(cfg, ssCfg)
if err != nil {
	return nil, err
}
```
⚠️ Concurrent renames from the same source can both succeed

What failed: Both rename commands can return success, producing both destination names from one original source, when exactly one rename should win and the other should fail.
Impact: Concurrent schema changes can produce invalid catalog outcomes and break assumptions that rename is conflict-safe. Teams running parallel migrations may end up with contradictory table state without a reliable failure signal.
Steps to reproduce:
Open two SQL sessions against the same database.
Run ALTER TABLE t_old RENAME TO t_a in one session and ALTER TABLE t_old RENAME TO t_b in the other at nearly the same time.
Check resulting table names after both statements complete.
Observe both destination tables exist even though only one rename should succeed.
Stub / mock context: Real SCRAM password verification was bypassed to keep local sessions deterministic, then the concurrent rename workflow was executed against the same metadata path used in production code. No table-rename result was stubbed or mocked.
Code analysis: I inspected rename orchestration in core/rootvalue.go and table-map mutation logic in core/storage/storage.go. RenameTable delegates directly to EditTablesMap, and EditTablesMap performs local existence checks plus delete/update without any compare-and-swap or cross-session conflict guard that would force one concurrent rename to fail.
Why this is likely a bug: The rename flow applies check-then-mutate logic on session-local state without a synchronization guard, so concurrent operations can both pass preconditions and commit incompatible outcomes.
☕ An Automated Dependency Version Bump PR 👑
Initial Changes
The changes contained in this PR were produced by running `go get` for the dependency.
```bash
go get github.com/dolthub/[dependency]/go@[commit]
```