feat: Parser Enhancements Batch 6 - PostgreSQL UPSERT, Type Casting & Parameters #196

ajitpratap0 · 2025-12-13T07:04:23Z

Summary

This PR implements three parser enhancements for improved PostgreSQL compatibility:

Issue #195: Positional Parameters ($1, $2, etc.)

Tokenizer handles $ followed by digits as placeholders (e.g., $1, $10, $999)
Returns TokenTypePlaceholder with value like "$1"
Enables support for prepared statements and parameterized queries

Issue #188: PostgreSQL Type Casting (::)

Extended parseJSONExpression() to handle :: type cast operator
Added parseDataType() helper for parsing SQL data types
Added isDataTypeKeyword() to recognize SQL type keywords
Supports parameterized types: VARCHAR(100), NUMERIC(10,2)
Supports array type suffix: INTEGER[]
Reuses existing CastExpression AST node

Issue #193: INSERT ON CONFLICT (UPSERT)

Added parseOnConflictClause() function in dml.go
Supports conflict targets:
- Column list: ON CONFLICT (id, email)
- Constraint name: ON CONFLICT ON CONSTRAINT users_pkey
Supports actions:
- DO NOTHING
- DO UPDATE SET col = expr, ... with EXCLUDED pseudo-table support
- Optional WHERE clause in DO UPDATE
Works correctly with RETURNING clause

Test plan

All new tests pass (positional_params_test.go, type_casting_test.go, on_conflict_test.go)
Full test suite passes with race detection (go test -race ./pkg/...)
Pre-commit hooks pass (fmt, vet, short tests)

Examples

-- Positional parameters
SELECT * FROM users WHERE id = $1 AND status = $2

-- Type casting
SELECT name::VARCHAR(100), created_at::DATE FROM users
SELECT (data->>'score')::INTEGER FROM scores

-- INSERT ON CONFLICT
INSERT INTO users (id, name, email) 
VALUES (1, 'test', '[email protected]')
ON CONFLICT (id) 
DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email
WHERE users.active = true
RETURNING id, name

🤖 Generated with Claude Code

… Parameters Implements three parser enhancements for PostgreSQL compatibility: #195: Positional Parameters ($1, $2, etc.) - Tokenizer handles $ followed by digits as placeholders - Returns TokenTypePlaceholder with value like "$1" #188: PostgreSQL Type Casting (::) - parseJSONExpression() handles :: operator - Added parseDataType() and isDataTypeKeyword() helpers - Supports parameterized types (VARCHAR(100), NUMERIC(10,2)) - Supports array type suffix (INTEGER[]) - Reuses existing CastExpression AST node #193: INSERT ON CONFLICT (UPSERT) - Added parseOnConflictClause() function - Supports conflict target: (column_list) or ON CONSTRAINT name - Supports DO NOTHING action - Supports DO UPDATE SET with EXCLUDED pseudo-table - Supports optional WHERE clause in DO UPDATE - Works with RETURNING clause All tests pass with race detection (go test -race ./pkg/...). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

claude · 2025-12-13T07:04:35Z

Claude finished @ajitpratap0's task —— View job

Code Review Complete ✅

I've thoroughly reviewed this PR implementing PostgreSQL parser enhancements. Here's my comprehensive feedback:

Overall Assessment

This is a well-implemented PR that adds significant PostgreSQL compatibility features. The code quality is good and follows project patterns consistently.

✅ Strengths

1. Tokenizer Implementation (Positional Parameters)

Clean implementation in tokenizer.go:1433-1455 handling $1, $2, etc.
Proper edge case handling for leading zeros ($01) and large numbers ($999)
Follows existing tokenizer patterns and error handling
Good test coverage with comprehensive edge cases

2. Type Casting Implementation

Smart reuse of existing CastExpression AST node - excellent design choice
Well-structured parseDataType() function handling parameterized types
Proper precedence handling in parseJSONExpression()
Supports both CAST(expr AS type) and expr::type syntax consistently

3. ON CONFLICT Implementation

Comprehensive coverage of PostgreSQL UPSERT syntax
Clean separation of concerns in parseOnConflictClause()
Proper handling of both column targets and constraint names
Good support for EXCLUDED pseudo-table references

4. Test Coverage

Extensive test suites for all three features
Good edge case coverage (leading zeros, large numbers, chained casts)
Proper use of tokenizer pooling in tests (GetTokenizer()/PutTokenizer())

⚠️ Areas for Improvement

1. Memory Management Concerns

// pkg/sql/parser/expressions.go:491-550
func (p *Parser) parseDataType() (string, error) {
    // This builds strings through concatenation
    typeParams += p.currentToken.Literal  // Line 512, 525
    dataType += typeParams                // Line 531
    dataType += "[]"                      // Line 546

Issue: String concatenation can be inefficient for complex types
Recommendation: Consider using strings.Builder for building type strings, especially for types with multiple parameters

2. Error Context

// pkg/sql/parser/expressions.go:518-523
return "", goerrors.InvalidSyntaxError(
    "expected numeric type parameter",
    p.currentLocation(),
    "Use TYPE(precision[, scale]) syntax",
)

Issue: Generic error message doesn't indicate which type parameter failed
Recommendation: Include the actual token value in error messages for better debugging

3. Token Type Inconsistency

// pkg/sql/parser/expressions.go:517
if p.currentToken.Type != "INT" && !p.isType(models.TokenTypeIdentifier) && p.currentToken.Type != "NUMBER"

Issue: Mixing string literals ("INT", "NUMBER") with constants (models.TokenTypeIdentifier)
Recommendation: Use consistent token type constants throughout

4. Potential Edge Case

// pkg/sql/tokenizer/tokenizer.go:1453-1455
// Just a standalone $ symbol (could be dollar-quoted string start or error)
// For now, treat as placeholder
return models.Token{Type: models.TokenTypePlaceholder, Value: "$"}, nil

Issue: Standalone $ returns placeholder token, but PostgreSQL dollar-quoted strings ($tag$...$tag$) aren't supported
Recommendation: Add TODO comment about future dollar-quoted string support

🔒 Security Assessment

No security concerns identified
Input validation is properly handled through existing parser error mechanisms
No potential for injection attacks in the new parsing logic
DoS protection maintained through existing tokenizer limits

⚡ Performance Considerations

Good: Reuses existing AST nodes and parsing patterns
Good: Minimal impact on parsing performance for non-PostgreSQL syntax
Consider: String building in parseDataType() could be optimized for complex types
Good: Proper use of object pooling in tests

🧪 Test Quality

Excellent: Comprehensive test coverage for all features
Good: Follows project testing patterns with pooled tokenizers
Good: Tests both positive and edge cases
Minor: Consider adding benchmarks for type casting performance

📋 Minor Issues

1. Missing Token Constants
Some areas use string literals instead of defined constants - should be addressed for consistency.

2. Documentation
While the implementation is clear, consider adding package-level documentation for the new PostgreSQL features.

🏆 Recommendation

APPROVE - This PR successfully implements the requested PostgreSQL features with good code quality. The minor improvements suggested above can be addressed in follow-up PRs without blocking this implementation.

The implementation correctly follows the project's architectural patterns, maintains backward compatibility, and provides comprehensive test coverage.

Addresses review comments from Claude bot: 1. Memory Management: Replace string concatenation with strings.Builder in parseDataType() for efficient type string construction 2. Error Context: Include actual token value in error messages (e.g., "expected numeric type parameter, got 'foo'") 3. Token Type Consistency: Add isNumericLiteral() helper that uses token type constants instead of string literals ("INT", "NUMBER") 4. Dollar-Quoted Strings: Add comprehensive TODO comment explaining PostgreSQL dollar-quoted string syntax ($tag$...$tag$) and linking to issue #189 for future implementation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

claude · 2025-12-14T14:46:04Z

Claude finished @ajitpratap0's task —— View job

Pull Request Review - PostgreSQL Parser Enhancements

Reviewing PR #196 that implements PostgreSQL compatibility features:

Positional parameters ($1, $2, etc.)
Type casting with :: operator
INSERT ON CONFLICT (UPSERT) functionality

Todo List

Read and analyze all changed files
Evaluate code quality and best practices
Check for potential bugs or issues
Assess performance considerations
Review security concerns
Analyze test coverage
Provide comprehensive feedback

Code Review Summary ✅

This PR implements solid PostgreSQL compatibility features with good adherence to the existing codebase patterns. The implementation is well-structured and follows the project's architectural conventions.

✅ Strengths

1. Positional Parameters Implementation

Clean tokenizer logic in tokenizer.go:1433-1458 properly handles $1, $2, etc.
Good edge case handling for digit sequences and standalone $
Includes helpful TODO comment about future dollar-quoted string support
Proper UTF-8 handling with utf8.DecodeRune()

2. Type Casting Implementation

Excellent improvement: Now uses strings.Builder in parseDataType() (lines 497-550) for efficient string concatenation
Smart reuse of existing CastExpression AST node maintains consistency
Good support for parameterized types: VARCHAR(100), NUMERIC(10,2)
Proper array type suffix support: INTEGER[]
Fixed token type consistency issues - now uses proper constants

3. ON CONFLICT Implementation

Comprehensive PostgreSQL UPSERT syntax support in dml.go:600-679
Clean separation of target types (columns vs constraints)
Good support for EXCLUDED pseudo-table references
Proper integration with existing RETURNING clause handling

4. Test Coverage

Excellent: Three comprehensive test files with good edge case coverage
Tests follow project patterns with proper tokenizer pooling
Good range of scenarios from simple to complex cases
Tests validate both positive and error cases

⚠️ Areas for Improvement

1. Token Type Consistency (Minor)

// pkg/sql/parser/expressions.go:556-558
case "INT", "NUMBER", "FLOAT":
    return true

Still mixing string literals with constants in some places. Consider using models.TokenType* constants throughout for full consistency.

2. Error Message Enhancement

// pkg/sql/parser/expressions.go:520-524
return "", goerrors.InvalidSyntaxError(
    fmt.Sprintf("expected numeric type parameter, got '%s'", p.currentToken.Literal),
    p.currentLocation(),
    "Use TYPE(precision[, scale]) syntax",
)

Good improvement: Now includes the actual token value in error messages for better debugging context.

3. Memory Efficiency
The string building has been improved with strings.Builder, which addresses the previous performance concern about string concatenation.

🔒 Security Assessment

✅ No security issues identified
Input validation handled through existing parser error mechanisms
No injection vulnerabilities in new parsing logic
Proper bounds checking for tokenizer operations

⚡ Performance Considerations

✅ Good: Minimal performance impact on existing parsing paths
✅ Improved: strings.Builder usage for type construction eliminates inefficient concatenation
✅ Good: Reuses existing AST nodes and object pools
✅ Good: Tokenizer changes are O(1) for parameter detection

🧪 Test Quality Assessment

✅ Excellent: Comprehensive coverage across all three features
✅ Good: Proper edge case testing (leading zeros: $01, large numbers: $999)
✅ Good: Following project testing patterns with tokenizer pooling
✅ Good: Error case validation alongside success cases
Consider: Adding benchmarks for type casting performance in future iterations

📋 Code Quality

✅ Good: Follows existing codebase patterns and conventions
✅ Good: Proper error handling with contextual information
✅ Good: Clear function documentation and comments
✅ Improved: Better memory management with strings.Builder
✅ Good: Clean separation of concerns between tokenizer and parser

🏆 Recommendation: APPROVE ✅

This PR successfully implements the requested PostgreSQL features with good code quality and comprehensive testing. The implementation:

✅ Adds valuable PostgreSQL compatibility features
✅ Maintains backward compatibility
✅ Follows project architectural patterns
✅ Includes comprehensive test coverage
✅ Addresses previous review feedback (string building efficiency)
✅ No security or performance concerns

The minor token type consistency issues can be addressed in future cleanup PRs without blocking this implementation.

ajitpratap0 merged commit 91e6508 into main Dec 14, 2025
20 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: Parser Enhancements Batch 6 - PostgreSQL UPSERT, Type Casting & Parameters #196

feat: Parser Enhancements Batch 6 - PostgreSQL UPSERT, Type Casting & Parameters #196

Uh oh!

ajitpratap0 commented Dec 13, 2025

Uh oh!

claude bot commented Dec 13, 2025 •

edited

Loading

Uh oh!

claude bot commented Dec 14, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: Parser Enhancements Batch 6 - PostgreSQL UPSERT, Type Casting & Parameters #196

feat: Parser Enhancements Batch 6 - PostgreSQL UPSERT, Type Casting & Parameters #196

Uh oh!

Conversation

ajitpratap0 commented Dec 13, 2025

Summary

Issue #195: Positional Parameters ($1, $2, etc.)

Issue #188: PostgreSQL Type Casting (::)

Issue #193: INSERT ON CONFLICT (UPSERT)

Test plan

Examples

Uh oh!

claude bot commented Dec 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Complete ✅

Overall Assessment

✅ Strengths

⚠️ Areas for Improvement

🔒 Security Assessment

⚡ Performance Considerations

🧪 Test Quality

📋 Minor Issues

🏆 Recommendation

Uh oh!

claude bot commented Dec 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Review - PostgreSQL Parser Enhancements

Todo List

Code Review Summary ✅

✅ Strengths

⚠️ Areas for Improvement

🔒 Security Assessment

⚡ Performance Considerations

🧪 Test Quality Assessment

📋 Code Quality

🏆 Recommendation: APPROVE ✅

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

claude bot commented Dec 13, 2025 •

edited

Loading

claude bot commented Dec 14, 2025 •

edited

Loading